The document discusses research being conducted on topic modeling, computational social science, and analyzing sentiment and emotions. It provides an overview of recent works on topic modeling and modeling topic hierarchies. It also discusses research on analyzing aspects and sentiments of online reviews and analyzing social aspects of emotions and self-disclosure behaviors in Twitter conversations.
Topic and text analysis for sentiment, emotion, and computational social science
1. Topic and Text Analysis for Sentiment, Emotion,
and Computational Social Science
November 2012
Alice Oh
alice.oh@kaist.edu
Users & Information Lab
http://uilab.kaist.ac.kr
Thursday, December 6, 2012
2. Overview
• Topic modeling research
  • CIKM 2011: Distance-dependent Chinese restaurant franchise (ddCRF)
  • ICML 2012: Dirichlet process with mixed random measures (DP-MRM)
  • CIKM 2012: Recursive Chinese restaurant process for modeling topic hierarchies (rCRP)
  • NIPS Big Learning Workshop 2012: Distributed Online Learning for Latent Dirichlet Allocation (DoLDA)
• Computational social science research
  • WSDM 2011: Aspect sentiment unification model for online review analysis
  • ICWSM 2012: Social aspects of emotions in Twitter conversations
  • ACL 2012: Self-disclosure and relationship strength in Twitter conversations
3. Do you feel what I feel?
Social Aspects of Emotions in Twitter Conversations
Suin Kim, JinYeong Bak, Alice Oh
ICWSM 2012
6. Asking Research Questions
Human emotion is typically studied as a within-person, one-direction,
non-repetitive phenomenon; focus has traditionally been on how one
individual feels in reaction to various stimuli at a certain point of
time. But people recognize and inevitably react emotionally and
otherwise to expressions of emotion of other people. We propose
that organizational dyads and groups inhabit emotion cycles:
Emotions of an individual influence the emotions, thoughts and
behaviors of others; others’ reactions can then influence their
future interactions with the individual expressing the original
emotion, as well as that individual’s future emotions and
behaviors. People can mimic the emotions of others, thereby
extending the social presence of a specific emotion, but can also
respond to others’ emotions, extending the range of emotions
present.
7. Social Aspects of Emotions: Motivating Question
How are our emotions affected by others we talk to?
8. Social Aspects of Emotions: Research Questions
• How do we communicate our emotions?
  • Use a topic model on Twitter conversations to discover the “topics” that represent the eight emotions
  • Analyze the proportions of the total tweets for the emotions
• How do we influence other people’s emotions?
  • Analyze the emotion transitions of the tweets
  • Look for topics that change the emotions of the conversation partners
  • Find interesting patterns of emotion pairs
9. Social Aspects of Emotions: Data
• Twitter conversation data: approx. 220k dyads who “reply” to each other, 1,670k conversational chains
10. Seed Words (We Feel Fine by Harris & Kamvar)
• anticipation: hope, wait, await, inspir, excit, bore, readi, expect, nervou, calm, motiv, prepar, certain, anxiou, optimist, forese
• joy: awesom, amaz, wonder, excit, glad, fine, beauti, high, lucki, super, perfect, complet, special, bless, safe, proud
• anger: shit, bitch, ass, mean, damn, mad, jealou, piss, annoi, angri, upset, moron, rage, screw, stuck, irrit
• surprise: amaz, wow, wonder, weird, lucki, differ, awkward, confus, holi, strang, shock, odd, embarrass, overwhelm, astound, astonish
• fear: scare, stress, horror, nervou, terror, alarm, behind, panic, fear, afraid, desper, threaten, tens, terrifi, fright, anxiou
• sadness: sorri, bad, aw, sad, wrong, hurt, blue, dead, lost, crush, weak, depress, wors, low, terribl, lone
• disgust: sick, wrong, evil, fat, ugli, horribl, gross, terribl, selfish, miser, pathet, disgust, worthless, aw, asham, fuck
• acceptance: okai, ok, same, alright, safe, lazi, relax, peac, content, normal, secur, complet, numb, fulfil, comfort, defeat
11. Dirichlet Forest Prior
• Dirichlet Forest Prior (Andrzejewski et al.)
• Mixture of Dirichlet tree distributions
• Dirichlet tree: generalization of the Dirichlet distribution
• Knowledge is expressed using Must-link and Cannot-link primitives
• Must-link (love, sweetheart)
• Cannot-link (exciting, bored)
• DF-LDA: LDA with a Dirichlet Forest prior
13. Domain Knowledge in Dirichlet Forest Prior
• Seed words: the eight emotion classes and their stemmed seed words from slide 10 (Seed Words)
• Must-link within a class
• Cannot-link between classes
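The rule "Must-link within a class, Cannot-link between classes" can be sketched as a small pair generator. This is only an illustration under stated assumptions, not the authors' implementation: the seed lists are abbreviated, the function name `build_constraints` is hypothetical, and stems that occur in more than one class (such as "amaz" in joy and surprise) are simply skipped, since the slides do not say how such overlaps were handled.

```python
from itertools import combinations

# Abbreviated seed lists for illustration; the slide lists 16 stems per class.
SEED_WORDS = {
    "joy": ["awesom", "amaz", "glad"],
    "surprise": ["amaz", "wow", "weird"],
    "fear": ["scare", "panic"],
}


def build_constraints(seed_words):
    """Return (must_links, cannot_links) as sets of alphabetically sorted pairs."""
    # Record which classes each stem belongs to.
    membership = {}
    for emotion, words in seed_words.items():
        for word in set(words):
            membership.setdefault(word, set()).add(emotion)

    # Assumption: a stem shared by several classes is ambiguous and is skipped.
    unique = {
        emotion: sorted(w for w in set(words) if len(membership[w]) == 1)
        for emotion, words in seed_words.items()
    }

    # Must-link: every pair of stems inside the same class.
    must = set()
    for words in unique.values():
        must.update(combinations(words, 2))

    # Cannot-link: every pair of stems drawn from two different classes.
    cannot = set()
    for (_, words_a), (_, words_b) in combinations(unique.items(), 2):
        for a in words_a:
            for b in words_b:
                cannot.add(tuple(sorted((a, b))))
    return must, cannot


must_links, cannot_links = build_constraints(SEED_WORDS)
```

The resulting pairs are only the input constraints; encoding them as a Dirichlet Forest prior is the job of DF-LDA itself and is not shown here.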
14. Dirichlet Forest vs. Dirichlet
• Fear
  • DF-LDA: don’t think but know why even wanna care worry understand
  • LDA: good exam lol luck just school haha i’m xx worry tomorrow
• Surprise
  • DF-LDA: that very really cool wow wonder just some differ amazing
  • LDA: just rt holy got thank did shit new love lol awesome buy oh
• Sadness
  • DF-LDA: bad my real feel life aw sad kill lost dead hurt wrong sick
  • LDA: lol just know sorry isn’t oh tweet did haha don’t thought think
15. Emotion Topics How do we express emotions?
Joy
• Topic 114: omg, love, haha, thank, really
• Topic 107: love, thank, follow, wow
• Topic 159: good, day, hope, morning, thank
• Topic 158: love, thank, miss, hug
Anticipation
• Topic 125: hope, better, feel, thank, soon
• Topic 26: good, thank, hope, miss
• Topic 146: come, wait, week, day, june
• Topic 146: good, day, time, work
Anger
• Topic 131: lmao, fuck, ass, bitch, shit
• Topic 4: ass, yo, lmao, nigga
• Topic 19: lmao, shit, damn, fuck, oh
• Topic 13: shit, nigga, smh, yea
Fear
• Topic 48: omg, oh, lmao, shit, scare
• Topic 78: happen, heart, attack, hospital
• Topic 27: don’t, come, night, sleep, outside
• Topic 140: time, got, work, day
Surprise
• Topic 172: yeag, know, think, true, funny
• Topic 89: know, don’t, think, look
• Topic 15: think, don’t, know, make, really
• Topic 94: haha, dont, think, really
Sadness
• Topic 6: oh, sorry, haha, know, didnt
• Topic 59: hurt, got, good, bad, pain
• Topic 106: tweet, reply, didn’t, read, sorry
• Topic 155: oh, really, make, feel
Disgust
• Topic 116: oh, fuck, don’t, ye, ew
• Topic 116: look, haha, oh, know
• Topic 22: don’t, oh, think, yeah, lmao
• Topic 174: don’t, think, say, people
Acceptance
• Topic 43: ok, oh, thank, cool, okay
• Topic 102: know, try, let, ok
• Topic 199: xx, thank, good, okay, follow
• Topic 8: night, love, good, sleep
Neutral
• Topic 180: com, www, http, check, youtube
• Topic 156: twitter, facebook, people, account
• Topic 184: account, google, app, work, email
• Topic 67: food, chicken, cook, rt
[Numeric labels on the slide, in extraction order: 29 70 21 14 5; 17 7 18; 19]
16. Emotion Topics How do we express emotions?
Joy (Greeting)
• Topic 114: omg, love, haha, thank, really
• Topic 107: love, thank, follow, wow
Anticipation (Caring)
• Topic 125: hope, better, feel, thank, soon
• Topic 26: good, thank, hope, miss
Sadness (Sympathy)
• Topic 6: oh, sorry, know, didnt
• Topic 59: hurt, got, good, bad, pain
Neutral (IT/Tech)
• Topic 180: com, www, http, check, youtube
• Topic 156: twitter, facebook, people, account
18. Defining “Influence”
Example conversation between User A and User B, in time order:
• User A: Having a tough day today. RIP Harrison. I’ll miss you a ton :/
• User B: Just pray about it. God will help you.
• User A: Not really religious, but thanks man. :)
• User B: If you need talk you know I’m here.
Emotion labels shown along the time axis: (Sadness), (Acceptance), (Anticipation)
19. Defining “Influence”
• Same example conversation between User A and User B as on the previous slide, with one tweet marked as the emotion influencing tweet
20. Emotion Influences: What can you say to make your partner feel better?
• Transitions shown: Joy → Sadness, Sadness → Joy, Acceptance → Anger
• Topic labels shown: Greeting, Sympathizing, Swearing, Complaining
• Topic 117: tweet, people, don’t, read, post
• Topic 59: hurt, got, bad, pain, feel
• Topic 18: wear, look, think, love, black
• Topic 24: love, thank, great, new, look
• Topic 31: i’m, got, lmax, shit, da
• Topic 13: lmao, shit, nigga, smh, yea
21. Emotion Influence
[Figure: two bar charts of emotion influence, one bar per emotion expressed by the partner (Anticipation, Joy, Surprise, Fear, Anger, Sadness, Disgust, Acceptance, Neutral)]
• Emotion Influence: Sadness to Joy. Expressing Joy has a 35.8% chance of changing the partner’s emotion from Sadness to Joy.
• Emotion Influence: Joy to Anger. Expressing Anger has a 26.5% chance of changing the partner’s emotion from Joy to Anger.
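One way to read the influence numbers is as conditional frequencies over three-tweet windows: a user's tweet, the partner's reply, and the user's next tweet. The sketch below follows that reading. The data layout (lists of `(user, emotion)` pairs), the function name `influence_probability`, and the choice of denominator are assumptions made for illustration, not the exact estimator from the talk.

```python
from collections import Counter


def influence_probability(chains, source, target):
    """Estimate, for each emotion the partner expresses, how often a user
    moves from `source` to `target` across the partner's reply."""
    totals, hits = Counter(), Counter()
    for chain in chains:
        # Window: (user's tweet, partner's reply, the same user's next tweet).
        for (u1, e1), (u2, e2), (u3, e3) in zip(chain, chain[1:], chain[2:]):
            if u1 == u3 and u1 != u2 and e1 == source:
                totals[e2] += 1
                if e3 == target:
                    hits[e2] += 1
    return {emotion: hits[emotion] / totals[emotion] for emotion in totals}


# Toy chains with hypothetical emotion labels already assigned to each tweet.
chains = [
    [("A", "Sadness"), ("B", "Joy"), ("A", "Joy")],
    [("A", "Sadness"), ("B", "Joy"), ("A", "Sadness")],
    [("A", "Sadness"), ("B", "Anger"), ("A", "Sadness"),
     ("B", "Joy"), ("A", "Joy")],
]
probs = influence_probability(chains, "Sadness", "Joy")
```

In this toy data the partner expressing Joy is followed by a Sadness to Joy change in two of three windows, and the partner expressing Anger in none.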
22. Outliers
A: Sorry to hear about your bags. If you would like us to get someone to contact you DM us your reference and contact number.
B: it's on it's way to manch. If the woman on the check in desk in Miami hadn't been trying to be all smart! Been no problem.
A: Sorry about that. Pleased to hear they located it quickly for you though.
B: mistakes happen.
23. Analyzing Self-Disclosure Behaviors in
Twitter Conversations Using Text Mining
Techniques (Presented at ACL 2012)
JinYeong Bak, Suin Kim, Alice Oh
{jy.bak, suin.kim}@kaist.ac.kr, alice.oh@kaist.edu
Department of Computer Science, KAIST
24. Introduction
2012-07-11
In social psychology
• Degree of self-disclosure in a relationship depends on the strength of the relationship
• Strategic self-disclosure can strengthen the relationship
[Illustration: “You’re my best friend!” / “I like you too!”]
25. Hypothesis
Twitter conversations also show a similar pattern
} Dyads with high relationship strength show more self-disclosure behavior
} Dyads with low relationship strength show less self-disclosure behavior
Illustration: "You’re my best friend!" / "I like you too!" versus "Hello~" / "Hi"
26. Methodology
} Twitter Data
  } 131K users
  } 2M conversations
} Relationship Strength
  } Chain frequency (CF)
  } Chain length (CL)
} Self-Disclosure
  } Personal information
  } Open communication
  } Profanity
} Analysis with Topic Models
  } Latent Dirichlet allocation (LDA, [Blei, JMLR 2003])
  } Aspect and sentiment unification model (ASUM, [Jo, WSDM 2011])
27. Twitter Conversation
} A Twitter conversation chain
  } 3 or more tweets
  } at least one reply by each user
} Our Twitter conversation data
  } Oct 2011 to Dec 2011
  } 131K users
  } 2M chains
  } 11M tweets
Example of a conversation chain: https://twitter.com/#!/britneyspears
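The two chain criteria above (3 or more tweets, at least one reply by each user) can be expressed as a small filter. This is a minimal sketch, not the authors' code: the `Tweet` record and `is_conversation_chain` are hypothetical names, and the sketch assumes a chain arrives as a time-ordered list of tweets exchanged by exactly two users.

```python
from typing import List, NamedTuple


class Tweet(NamedTuple):
    user: str        # author's screen name
    is_reply: bool   # True if this tweet replies to another tweet


def is_conversation_chain(tweets: List[Tweet]) -> bool:
    """Return True if the tweets satisfy both chain criteria from the slide."""
    # Criterion 1: the chain has 3 or more tweets.
    if len(tweets) < 3:
        return False
    users = {t.user for t in tweets}
    # Assumption: a chain is dyadic, so exactly two users take part.
    if len(users) != 2:
        return False
    # Criterion 2: each user authored at least one reply.
    repliers = {t.user for t in tweets if t.is_reply}
    return repliers == users
```

For example, the sequence A, B (reply), A (reply) qualifies, while A, B (reply), B (reply) does not, because A never replies.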
28. Relationship Strength
} Social psychology literature states that relationship strength can be measured by communication frequency and length [Granovetter, 1973; Levin and Cross, 2004]
} CF: chain frequency
  } The number of conversational chains between the dyad averaged per month
} CL: chain length
  } The length of conversational chains between the dyad averaged per month
} Relationship strength
  } A high CF or CL for a dyad means the relationship is strong
  } A low CF or CL for a dyad means the relationship is weak
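The CF and CL definitions above can be sketched as follows. The slide does not spell out the exact averaging, so this is one possible reading rather than the authors' formula: CF is taken as the number of chains divided by the number of observed months, and CL as the mean chain length (in tweets) per month, averaged over the months in which the dyad had at least one chain. The function name and input layout are hypothetical.

```python
from collections import defaultdict
from statistics import mean
from typing import List, Tuple


def relationship_strength(chains: List[Tuple[str, int]], n_months: int) -> Tuple[float, float]:
    """Compute (CF, CL) for one dyad.

    chains   -- one (month, chain_length_in_tweets) pair per conversation chain
    n_months -- length of the observation window in months (assumed > 0)
    """
    if not chains:
        return 0.0, 0.0
    lengths_by_month = defaultdict(list)
    for month, length in chains:
        lengths_by_month[month].append(length)
    # CF: chains per month over the whole observation window.
    cf = len(chains) / n_months
    # CL: per-month mean chain length, averaged over months with any chain.
    cl = mean([mean(lengths) for lengths in lengths_by_month.values()])
    return cf, cl
```

With a three-month window such as Oct 2011 to Dec 2011, a dyad with chains of length 3 and 5 in October and one of length 4 in November gets CF = 1.0 and CL = 4 under this reading.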
29. Self-Disclosure
} Open communication - Openness
  } Negative openness
  } Nonverbal openness
  } Emotional openness
  } Receptive openness – difficult to find in tweets
  } General-style openness – not clearly defined in the literature
} Personal Information
  } Personally Identifiable Information (PII)
  } Personally Embarrassing Information (PEI)
} Profanity
  } nigga, ass, wtf, lmao
30. Self-Disclosure - Openness
Negative openness
} Method
  } We use ASUM with emoticons as seed words [“Aspect and sentiment unification model for online review analysis”, Jo, WSDM’11]
  } ASUM is an LDA-based joint model of topic and sentiment
  } ASUM takes unannotated data and classifies each sentence (tweet) as positive/negative/neutral
31. Self-Disclosure - Openness
Nonverbal openness
} Method
  } We look for emoticons, ‘lol’, ‘xxx’
  } Emoticons are like facial expressions -- :) :( :P
  } ‘lol’ (laughing out loud) and ‘xxx’ (kisses) are very frequently used in a similar manner to nonverbal openness
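The lookup described above (emoticons, ‘lol’, ‘xxx’) can be approximated with two regular expressions. This is a rough sketch under stated assumptions: the emoticon pattern covers only a handful of common Western emoticons, not whatever list the authors actually used, and `has_nonverbal_openness` is a hypothetical name.

```python
import re

# Eyes, optional nose, mouth, e.g. :) :( :P ;-) :/ :D
# The lookbehind avoids matching inside URLs such as "http://".
EMOTICON = re.compile(r"(?<!\w)[:;=][-']?[()DPpSL/\\]")

# 'lol' and 'xxx' as standalone words, case-insensitive.
MARKER = re.compile(r"\b(?:lol|xxx)\b", re.IGNORECASE)


def has_nonverbal_openness(tweet: str) -> bool:
    """Return True if the tweet contains an emoticon, 'lol', or 'xxx'."""
    return bool(EMOTICON.search(tweet) or MARKER.search(tweet))
```

Matching on word boundaries keeps words like "lollipop" from counting as ‘lol’.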
32. Self-Disclosure - Openness
Emotional openness
} Method
  } Look for tweets that contain common expressions of feeling words [We feel fine (Harris, J, 2009)]
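A feeling-expression lookup of this kind could be sketched as below. The slide does not list the expressions used, so the pattern here is an assumption, loosely modeled on the "I feel" / "I am feeling" phrasing associated with We Feel Fine. `has_emotional_openness` is a hypothetical name.

```python
import re

# Assumed expression list: "I feel", "I am feeling", "I'm feeling".
FEELING = re.compile(
    r"\bi(?:\s+feel|\s+am\s+feeling|['’]m\s+feeling)\b",
    re.IGNORECASE,
)


def has_emotional_openness(tweet: str) -> bool:
    """Return True if the tweet contains one of the assumed feeling expressions."""
    return FEELING.search(tweet) is not None
```

A fuller version would presumably follow the matched phrase with a lexicon of feeling words. That step is left out here because the lexicon is not given in the slides.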
33. Self-Disclosure – Personal Information
Personally Identifiable Information (PII)
Ex) name, location, email address, job, social security number
Personally Embarrassing Information (PEI)
Ex) clinical history, sexual life, job loss, family problem
35. Self-Disclosure – Personal Information
Example of PII, PEI and Profanity topics
} Shown by high probability words in each topic
PII 1    PII 2      PEI 1     PEI 2     PEI 3     Profanity
san      tonight    pants     teeth     family    nigga
live     time       wear      doctor    brother   lmao
state    tomorrow   boobs     dr        sister    shit
texas    good       naked     dentist   uncle     ass
south    ill        wearing   tooth     cousin    bitch
39. Results: Interpretation
} Emotional openness
  } When they are not very close, they frequently express encouragement or polite reactions to babies or pets
42. Distributed Online Learning for Latent Dirichlet Allocation
JinYeong Bak, Dongwoo Kim, and Alice Oh
NIPS 2012 Workshop on Big Learning
43. Motivation
• Problem 1: Inference for LDA takes a long time
• Problem 2: A continuously expanding corpus necessitates continuous updates of model parameters
  • But updating model parameters is not possible with plain LDA
  • Must re-train with the entire updated corpus
• Solution to 1: Distributed inference shortens inference time (Newman JMLR 2009, Wang WWW 2012)
• Solution to 2: Online (mini-batch) learning enables updates to model parameters (Hoffman NIPS 2010)
• Our Approach: Combine distributed inference and online learning
44. Distributed Online LDA
• Based on variational inference
• Mini-batch updates via stochastic learning (variational EM)
• Distribute variational EM using MapReduce
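The combination of mini-batch updates and MapReduce can be illustrated with the parameter update from online variational LDA (Hoffman NIPS 2010): each mapper returns topic-word sufficient statistics for its shard of the mini-batch, the reducer sums them, and the summed statistics are blended into the global topic parameters with a decaying step size. This is a simplified sketch, not the authors' implementation. The per-document E-step that produces the statistics is omitted, and all function and parameter names are hypothetical.

```python
from typing import List

Matrix = List[List[float]]  # K topics x V vocabulary words


def step_size(t: int, tau0: float = 1.0, kappa: float = 0.7) -> float:
    """Decaying learning rate rho_t = (tau0 + t) ** (-kappa)."""
    return (tau0 + t) ** (-kappa)


def reduce_stats(shard_stats: List[Matrix]) -> Matrix:
    """Reduce step: element-wise sum of the statistics emitted by each mapper."""
    n_topics, n_words = len(shard_stats[0]), len(shard_stats[0][0])
    total = [[0.0] * n_words for _ in range(n_topics)]
    for stats in shard_stats:
        for k in range(n_topics):
            for w in range(n_words):
                total[k][w] += stats[k][w]
    return total


def online_update(lam: Matrix, shard_stats: List[Matrix], t: int,
                  corpus_size: int, batch_size: int,
                  eta: float = 0.01, tau0: float = 1.0, kappa: float = 0.7) -> Matrix:
    """Blend the mini-batch estimate into the global topic parameters lam."""
    rho = step_size(t, tau0, kappa)
    total = reduce_stats(shard_stats)
    # Scale the mini-batch statistics as if the whole corpus looked like it.
    scale = corpus_size / batch_size
    return [
        [(1.0 - rho) * lam[k][w] + rho * (eta + scale * total[k][w])
         for w in range(len(lam[k]))]
        for k in range(len(lam))
    ]
```

Because the reduce step is a plain sum, the mappers can process their shards independently. That independence is what makes the E-step easy to distribute.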
45. Experimental Setup
• Data: 5.1M Twitter conversations and 4.8M English Wikipedia articles
• 60-node Hadoop system
  • Each node with 8 x 2.30GHz cores
46. Wikipedia Results
Topic 0: relativity, physics, einstein, quantum, gravity
Topic 22: channel, television, tv, cable, news
Topic 42: milk, chocolate, sugar, food, cream
Topic 65: god, bible, moses, chapter, genesis
Topic 94: party, election, president, member, elected
Topic 170: season, team, league, game, football
Topic 232: album, song, band, music, released
Minibatch   oLDA        DoLDA      Speedup
16,384      238666.25   47994.03   4.97
32,768      188508.71   33470.03   5.63
65,536      206290.27   26788.53   7.70
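The Speedup column is consistent with the ratio of the oLDA figure to the DoLDA figure in each row. The slide does not state the units of the two figures. A short check, with the values copied from the table and hypothetical variable names:

```python
# Minibatch size -> (oLDA, DoLDA) figures from the table above.
timings = {
    16384: (238666.25, 47994.03),
    32768: (188508.71, 33470.03),
    65536: (206290.27, 26788.53),
}


def speedup(olda: float, dolda: float) -> float:
    """Ratio of the single-machine figure to the distributed figure."""
    return olda / dolda


speedups = {size: round(speedup(*pair), 2) for size, pair in timings.items()}
for size, value in speedups.items():
    print(f"minibatch {size}: speedup {value:.2f}")
```

The speedup grows with the mini-batch size in this table.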
47. Twitter Temporal Patterns of Topics
<Topic words: party vote people politics obama>
Conversation b1 on November 2, 2010
A: I wish I could vote today, but I have to work for 14 hours
B: is it legal for them not to give you time off to vote?
A: probably
Conversation b2 on March 31, 2012
A: Mitt Romney: "Obama should release the notes and transcripts of all his meetings with world leaders"
B: Why is he being held to higher standard than any other president.
A: did you see my Santorum 'slip' tweet? Is the media afraid to comment on it?
B: oh yes I did. I saw it mentioned yesterday also. disgusting and he should be raked over hot coals for it.
Figure: document proportion of the topic by day (x-axis: Day, 2010-10 to 2012-04; y-axis: Document proportion)
<Topic words: school mate class teacher grade>
Conversation c1 on September 5, 2011
A: Oh god, miss Waite ran over to me up the school just now! :L on the plus subjects are now picked! :D
B: what did you pick??
A: english, RE, art and psychology! :) was unsure between history and psych but found out bubbles was teaching it so nooo! :L
Conversation c2 on October 12, 2011
A: :) My day's been okay! It feels long! But school' was okayish. I hope you have an awesome day! :D
B: that's good then! Ahh hope it's not cause anything bad happened? Thanks! Have a great sleep :)
A: no! Class was just boring lol and thanks! :) i will! Even though i have to wake up early tomorrow for a midterm! :S
Figure: document proportion of the topic by day (x-axis: Day, 2011-07 to 2012-01; y-axis: Document proportion)
48. CAVEAT
Big data and social media data do not always give the right answers! They contain much noise and much bias.
Sentiment analysis is also full of problems at the big-data level because every small assumption can cause wide swings in the final interpretation of the data.
These data are valuable because they have opened up possibilities for analyzing naturally occurring data in huge amounts.
We need better methods and tools that are tailored for social media.
We need to ask the right questions that can be answered well despite the biases of social media data.
49. For details, visit our webpage:
http://uilab.kaist.ac.kr
Or email me:
alice.oh@kaist.edu