Presented by Vidar Brekke,
                                Social Intent LLC




SOCIAL TEXT
ANALYTICS FOR
ENTERPRISE AND
CONSUMER
APPLICATIONS
The International Association of Software
Architects. October 23, 2012




                                  @ividar #nlproc
What is Text Analytics?




                          Processes that uncover business value in
                          unstructured text via the application of
                          statistical, linguistic, machine learning,
                          and data analysis and visualization
                          techniques


Text analytics helps answer
business questions faster and
more cheaply than before, uncovering
new, hidden insights!




Text analytics is a Big Data problem




 Volume, Velocity, Variety

   •   10.2 million tweets sent during the first presidential debate
   •   Social media, help inquiries, email, texts, surveys
   •   Hundreds of languages
   •   Formal, informal or ridiculously informal
   •   Cryptic (vertical industry or criminal activity)




I’m So Intextuated With You




                              Unstructured text represents the
                              biggest opportunity and problem
                              in Big Data

                              Text, as opposed to most other
                              enterprise data, is very dirty
                              data




Correlating consumer confidence with mentions of “jobs” on
Twitter




Yay! Steve Jobs launches a new iPhone!




You can trade on Twitter




Low Signal/Noise Ratio + Naïve Metrics Lead to Wrong Conclusions




                    •   Lack of relevance: Many conversations you think
                        are about you, aren’t.

                    •   Poor accuracy: Many automated sentiment
                        solutions are as good as a coin flip.

                    •   Generic: All analysis is applied the same way
                        across domains.

                    •   Language evolves: Slang and sarcasm are rampant in
                        social media. Dictionary-based approaches are
                        largely ineffective.




Relevancy: It’s not all about you.



    Let me finish my drink before you drive me to the
    Betty Ford clinic!

    Call me a bigot, but white guys can’t sprint!
    #london2012

    My husband is such a baby. He won’t even taste raw
    food.

    Is Delta’s food prepared by Purina? So much for first
    class.

Search and Destroy (the data you’re looking for)



    Text analytics gained traction in the 1980s, but the use cases
    were different from today's.

    “Word spotting” – not different from a Google search.

             Show me all documents containing:
                   Ford NOT Harrison


               But it doesn’t scale

Booleans are like woodcarving with a chainsaw



   Query: Ford NOT Harrison ….

                      …would miss this tweet



   Carguy231: Me and a dozen others
   have lined up outside the Harrison, NY
   Ford dealership to test drive the new
   Fusion!



Booleans are like woodcarving with a chainsaw



   Query: Ford AND Fusion….

                      …would get this tweet



   Roadrunner123: Stuck with my dad in
   his ford listening to horrible jazz fusion
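
The two Boolean failures can be reproduced with a few lines of Python. This is only a minimal sketch of word spotting: the `tokens` and `matches` helpers are hypothetical names made up for illustration, and real search engines tokenize and rank far more carefully.

```python
import re

def tokens(text):
    """Lowercase a text and split it into a set of alphanumeric word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def matches(text, must_have=(), must_not_have=()):
    """Word spotting: every required term present, no excluded term present."""
    words = tokens(text)
    has_all = all(term.lower() in words for term in must_have)
    has_banned = any(term.lower() in words for term in must_not_have)
    return has_all and not has_banned

dealership_tweet = ("Me and a dozen others have lined up outside the Harrison, NY "
                    "Ford dealership to test drive the new Fusion!")
jazz_tweet = "Stuck with my dad in his ford listening to horrible jazz fusion"

# Query "Ford NOT Harrison": a relevant tweet is dropped, because here
# "Harrison" is a town and not the actor.
missed = not matches(dealership_tweet, must_have=["Ford"], must_not_have=["Harrison"])

# Query "Ford AND Fusion": an irrelevant tweet about jazz fusion is kept.
false_hit = matches(jazz_tweet, must_have=["Ford", "Fusion"])

print(missed, false_hit)
```

Both flags come out true: the query has no notion of what "Harrison" or "fusion" means in context, only whether the strings occur.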




Sentiment Analysis



          Early sentiment analysis tools also use word spotting.


                              “Awesome” = good

                                 “Sucks” = bad


                     What about sarcasm, slang, new words?

   Additionally, the analysis typically measures the overall polarity of the
   text, rather than the sentiment toward a specific target.


        “I love the new Camaro, it’s better than the Mustang”



You can’t use word spotting for sentiment detection



   “It took all morning to sign the lease papers for my new Mustang!”

     “I stood on line all morning to get the last Mustang on the lot!”



     “The brakes on the Mustang are surprisingly unpredictable.”

     “The TV ads for the Mustang are surprisingly unpredictable!”



                  “The Mustang has never been good”

               “The Mustang has never been this good”
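
A small sketch shows why these pairs defeat word spotting. The lexicon below is a hypothetical toy dictionary invented for this example, not one taken from any real tool; the point is only that a word-level score cannot tell the two sentences in each pair apart.

```python
import re

# Hypothetical mini-lexicon (an assumption for illustration only).
LEXICON = {"awesome": 1, "love": 1, "good": 1, "better": 1,
           "sucks": -1, "hate": -1, "unpredictable": -1}

def word_spot_score(text):
    """Sum the lexicon scores of the words in a text (>0 positive, <0 negative)."""
    words = re.findall(r"[a-z']+", text.lower())
    return sum(LEXICON.get(word, 0) for word in words)

pairs = [
    ("The Mustang has never been good",
     "The Mustang has never been this good"),
    ("The brakes on the Mustang are surprisingly unpredictable.",
     "The TV ads for the Mustang are surprisingly unpredictable!"),
]

for first, second in pairs:
    # Each pair gets identical scores although the meanings are opposite.
    print(word_spot_score(first), word_spot_score(second))
```

Within each pair the two scores are identical, because negation ("never ... good" versus "never ... this good") and the object being described (brakes versus TV ads) are invisible to a dictionary lookup.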

Nu-School text analytics is based on Machine Learning



   Training data helps the system recognize patterns. We
   estimate the statistical probability that a sentence is
   positive, negative, etc.

   What are training data?
   These are samples of text annotated by humans in an effort to
   show the machine what the right answer is

             “I love my iPhone, but hate AT&T”

                 | iPhone | Positive | AT&T | Negative


       Much easier and quicker to support new languages than
                 with dictionary-based approaches
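
As a rough illustration of learning from annotated samples, here is a minimal sketch of a multinomial naive Bayes classifier with add-one smoothing. The slides do not name a specific algorithm, so this choice is an assumption, as are the four training sentences. It also simplifies the annotation shown above: labels are per sentence, not per target such as iPhone or AT&T.

```python
import math
import re
from collections import Counter, defaultdict

def tokenize(text):
    """Lowercase and split into word tokens (keeps '&' so "AT&T" stays whole)."""
    return re.findall(r"[a-z&']+", text.lower())

def train(examples):
    """Count documents per label and words per label from (text, label) pairs."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in examples:
        label_counts[label] += 1
        for word in tokenize(text):
            word_counts[label][word] += 1
            vocab.add(word)
    return label_counts, word_counts, vocab

def predict_proba(model, text):
    """Return a probability per label using add-one (Laplace) smoothing."""
    label_counts, word_counts, vocab = model
    total_docs = sum(label_counts.values())
    log_scores = {}
    for label, n_docs in label_counts.items():
        total_words = sum(word_counts[label].values())
        score = math.log(n_docs / total_docs)  # prior
        for word in tokenize(text):
            if word in vocab:  # words never seen in training are ignored
                count = word_counts[label][word]
                score += math.log((count + 1) / (total_words + len(vocab)))
        log_scores[label] = score
    # Normalize the log scores into probabilities that sum to 1.
    top = max(log_scores.values())
    exp_scores = {label: math.exp(s - top) for label, s in log_scores.items()}
    norm = sum(exp_scores.values())
    return {label: value / norm for label, value in exp_scores.items()}

# Hypothetical human-annotated training data.
examples = [
    ("I love my iPhone", "positive"),
    ("awesome phone, love it", "positive"),
    ("I hate AT&T", "negative"),
    ("the coverage sucks", "negative"),
]
model = train(examples)
probs = predict_proba(model, "I love it")
print(probs)
```

Adding a new language or a new slang term means annotating more samples and retraining, with no dictionary to maintain by hand.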

Test: What’s the sentiment here?




                                   “Reuters reports that
                                   Assad continues the
                                   massacre of his own
                                   people amid sanctions
                                   from the international
                                   community.”




How to evaluate a text analytics platform



   The accuracy of a sentiment analysis system is, in
   principle, how well it agrees with human judgments.



    “I can’t believe the bar has a hidden gambling room in
                            the back!”



   An automated system can never be better than
   humans. Or can it?



Using Human Parallel Coding to Establish Gold Standards




        Confusion Matrix: Human as Gold Standard


             POSITIVE   NEGATIVE   NEUTRAL   TOTAL
  POSITIVE      365        24        159       548
  NEGATIVE       57        81         65       203
  NEUTRAL       274        60        415       749
  TOTAL         696       165        639      1500

  Raw Accuracy: 61.5%



    If a machine agrees with a human around 60% of the time, about as often
    as two human coders agree with each other, the machine is performing as
    well as a human being.
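
Raw accuracy is the share of items on the diagonal of the confusion matrix, that is, the items on which both codings agree. The sketch below recomputes it from the counts as printed in the table. Those counts give 861 of 1,500, about 57.4%, which does not match the 61.5% shown on the slide; that figure may come from a different calculation or data set.

```python
# Counts copied from the confusion matrix above, in the order
# POSITIVE, NEGATIVE, NEUTRAL for both rows and columns.
CONFUSION = [
    [365, 24, 159],
    [57, 81, 65],
    [274, 60, 415],
]

def raw_accuracy(matrix):
    """Fraction of items on the diagonal, i.e. where both codings agree."""
    agreed = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return agreed / total

print(f"{raw_accuracy(CONFUSION):.1%}")
```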


Using A Credit Matrix to Create Improved Measurement




    Credit Matrix:

               POSITIVE   NEGATIVE   NEUTRAL
    POSITIVE     100%        0%        50%
    NEGATIVE       0%      100%        50%
    NEUTRAL       50%       50%       100%

    Confusion Matrix: Human 1 as Gold Standard

               POSITIVE   NEGATIVE   NEUTRAL
    POSITIVE     365        24        159
    NEGATIVE      57        81         65
    NEUTRAL      274        60        415

    Partial Credit Figure of Merit: 82.3%
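
One plausible reading of the credit matrix is a weighted accuracy: each cell of the confusion matrix earns the credit in the matching cell of the credit matrix, and the total is divided by the number of items. The sketch below assumes that reading, which the slide does not spell out. With the counts as printed it yields 1,140 of 1,500, or 76.0%, and not the 82.3% on the slide, so the slide's figure may rest on different data or a different formula.

```python
# Both matrices use the order POSITIVE, NEGATIVE, NEUTRAL for rows and columns.
CONFUSION = [
    [365, 24, 159],
    [57, 81, 65],
    [274, 60, 415],
]
CREDIT = [
    [1.0, 0.0, 0.5],
    [0.0, 1.0, 0.5],
    [0.5, 0.5, 1.0],
]

def partial_credit(matrix, credit):
    """Credit-weighted agreement: full credit on the diagonal, partial off it."""
    size = len(matrix)
    earned = sum(matrix[i][j] * credit[i][j]
                 for i in range(size) for j in range(size))
    total = sum(sum(row) for row in matrix)
    return earned / total

print(f"{partial_credit(CONFUSION, CREDIT):.1%}")
```

The design intent is that confusing positive or negative with neutral is a milder error than confusing positive with negative, so it costs half as much.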


Precision & Recall (sentiment as an example)



   Precision is the fraction of retrieved instances
   that are relevant.
   E.g. of the instances labeled as positive, how many
   were actually positive?

   Recall is the fraction of relevant instances that are
   retrieved.
   E.g. of all the positive instances, how many did the
   system detect?
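
Both measures can be read straight off a confusion matrix. The sketch below reuses the counts from the confusion matrix shown earlier in the deck and assumes that rows are the gold-standard labels and columns are the labels being evaluated; the slides do not say which axis is which, so swap the two if the opposite holds.

```python
# Assumed orientation: rows = gold labels, columns = labels under evaluation.
# Order of rows and columns: POSITIVE, NEGATIVE, NEUTRAL.
LABELS = ["POSITIVE", "NEGATIVE", "NEUTRAL"]
CONFUSION = [
    [365, 24, 159],
    [57, 81, 65],
    [274, 60, 415],
]

def precision_recall(matrix, label_index):
    """Return (precision, recall) for one class of a square confusion matrix."""
    true_positives = matrix[label_index][label_index]
    labeled_as_class = sum(row[label_index] for row in matrix)  # column total
    actually_class = sum(matrix[label_index])                   # row total
    return true_positives / labeled_as_class, true_positives / actually_class

for index, label in enumerate(LABELS):
    precision, recall = precision_recall(CONFUSION, index)
    print(f"{label}: precision={precision:.3f} recall={recall:.3f}")
```

Under this assumption the positive class has precision 365/696 and recall 365/548: of 696 items labeled positive, 365 were positive in the gold standard, and of 548 gold positives, 365 were found.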




Top business applications of text/content analytics*

                                                                             *Alta Plana, 2011

   •   Brand / product / reputation management
        • Market research and social media monitoring, i.e. what are people saying
          about my brand or products

   •   Voice of the Customer / Customer Experience Management
        • Do I need to step in and offer customer service?
        • How many people recommend my brand vs. advocate against it?

   •   Search, Information Access, or Question Answering
        • Which bloggers are negative toward Obamacare?
        • Which of the hotels on Yelp.com get great reviews for the room service?
        • What are some articles similar to this one?

   •   Competitive intelligence
        • What competing products are people considering, and why?
        • Is competitors’ media spend generating purchase intent?


Growing areas where text analytics is being applied




                        Product development

          Intelligence and counter-terrorism, law enforcement

                    Pharmaceutical drug discovery

                   Financial services and insurance

                   Media, publishing & advertising

                          Political research

                                 CRM


Still awake?




   There is money in text analytics.

   Here’s a stock tip worth the price of admission
   alone

   (YMMV….)


Strange Bedfellows




  Whenever Anne Hathaway's
  name appeared with any
  regularity in news
  stories, Berkshire Hathaway A
  shares rose in value.




Thx & txt u l8tr

                            Vidar Brekke
                   vidar@socialintent.com
                                  @ividar





More Related Content

Similar to SOCIAL TEXT ANALYTICS FOR ENTERPRISE AND CONSUMER APPLICATIONS

#1NWebinar: Digital on the Runway
#1NWebinar: Digital on the Runway#1NWebinar: Digital on the Runway
#1NWebinar: Digital on the RunwayOne North
 
Machine Learning: Understanding the Invisible Force Changing Our World
Machine Learning: Understanding the Invisible Force Changing Our WorldMachine Learning: Understanding the Invisible Force Changing Our World
Machine Learning: Understanding the Invisible Force Changing Our WorldKen Tabor
 
Greenfield Effect: Patterns for Effective Disaster Delivery
Greenfield Effect: Patterns for Effective Disaster DeliveryGreenfield Effect: Patterns for Effective Disaster Delivery
Greenfield Effect: Patterns for Effective Disaster DeliveryJulian Warszawski
 
Moving beyond Vulnerability Testing
Moving beyond Vulnerability TestingMoving beyond Vulnerability Testing
Moving beyond Vulnerability TestingCapgemini
 
Social Search: A Little Help From My Friends
Social Search: A Little Help From My FriendsSocial Search: A Little Help From My Friends
Social Search: A Little Help From My FriendsBrynn Evans
 
Are You Listening? Real time data and social media
Are You Listening? Real time data and social mediaAre You Listening? Real time data and social media
Are You Listening? Real time data and social mediaAndrew Walker
 
Are you listening? Real Time Measurement and Monitoring
Are you listening? Real Time Measurement and MonitoringAre you listening? Real Time Measurement and Monitoring
Are you listening? Real Time Measurement and MonitoringKlaxon
 
How to Build Your Future in the Internet of Things Economy. Jennifer Riggins
How to Build Your Future in the Internet of Things Economy. Jennifer RigginsHow to Build Your Future in the Internet of Things Economy. Jennifer Riggins
How to Build Your Future in the Internet of Things Economy. Jennifer RigginsFuture Insights
 
AI, Machine Learning, and their Application for Growth - #GHConf18
AI, Machine Learning, and their Application for Growth - #GHConf18AI, Machine Learning, and their Application for Growth - #GHConf18
AI, Machine Learning, and their Application for Growth - #GHConf18GrowthHackers
 
AI and ChatGPT in Online Education
AI and ChatGPT in Online Education AI and ChatGPT in Online Education
AI and ChatGPT in Online Education D2L Barry
 
[DSC Europe 23] Shahab Anbarjafari - Generative AI: Impact of Responsible AI
[DSC Europe 23] Shahab Anbarjafari - Generative AI: Impact of Responsible AI[DSC Europe 23] Shahab Anbarjafari - Generative AI: Impact of Responsible AI
[DSC Europe 23] Shahab Anbarjafari - Generative AI: Impact of Responsible AIDataScienceConferenc1
 
Ethical Artificial Intelligence
Ethical Artificial IntelligenceEthical Artificial Intelligence
Ethical Artificial IntelligenceRudradeb Mitra
 
Artificial Intelligence 101: What is It and Why is it Suddenly a Big Deal Again?
Artificial Intelligence 101: What is It and Why is it Suddenly a Big Deal Again?Artificial Intelligence 101: What is It and Why is it Suddenly a Big Deal Again?
Artificial Intelligence 101: What is It and Why is it Suddenly a Big Deal Again?Andrew Ferrier
 
The Need for Deep Learning Transparency
The Need for Deep Learning TransparencyThe Need for Deep Learning Transparency
The Need for Deep Learning Transparencyinside-BigData.com
 
Bigger than Any One: Solving Large Scale Data Problems with People and Machines
Bigger than Any One: Solving Large Scale Data Problems with People and MachinesBigger than Any One: Solving Large Scale Data Problems with People and Machines
Bigger than Any One: Solving Large Scale Data Problems with People and MachinesTyler Bell
 
Another Day In Paradise
Another Day In ParadiseAnother Day In Paradise
Another Day In Paradisekum72
 
Ar design reality2018
Ar design reality2018Ar design reality2018
Ar design reality2018Anselm Hook
 
Using Data for Decisions TechinAsia Singapore 2015
Using Data for Decisions TechinAsia Singapore 2015Using Data for Decisions TechinAsia Singapore 2015
Using Data for Decisions TechinAsia Singapore 2015Eli Schwartz
 
Machines are the new Digital Natives
Machines are the new Digital NativesMachines are the new Digital Natives
Machines are the new Digital NativesMiel Vander Sande
 
Biting into the Jawbreaker: Pushing the Boundaries of Threat Hunting Automation
Biting into the Jawbreaker: Pushing the Boundaries of Threat Hunting AutomationBiting into the Jawbreaker: Pushing the Boundaries of Threat Hunting Automation
Biting into the Jawbreaker: Pushing the Boundaries of Threat Hunting AutomationAlex Pinto
 

Similar to SOCIAL TEXT ANALYTICS FOR ENTERPRISE AND CONSUMER APPLICATIONS (20)

#1NWebinar: Digital on the Runway
#1NWebinar: Digital on the Runway#1NWebinar: Digital on the Runway
#1NWebinar: Digital on the Runway
 
Machine Learning: Understanding the Invisible Force Changing Our World
Machine Learning: Understanding the Invisible Force Changing Our WorldMachine Learning: Understanding the Invisible Force Changing Our World
Machine Learning: Understanding the Invisible Force Changing Our World
 
Greenfield Effect: Patterns for Effective Disaster Delivery
Greenfield Effect: Patterns for Effective Disaster DeliveryGreenfield Effect: Patterns for Effective Disaster Delivery
Greenfield Effect: Patterns for Effective Disaster Delivery
 
Moving beyond Vulnerability Testing
Moving beyond Vulnerability TestingMoving beyond Vulnerability Testing
Moving beyond Vulnerability Testing
 
Social Search: A Little Help From My Friends
Social Search: A Little Help From My FriendsSocial Search: A Little Help From My Friends
Social Search: A Little Help From My Friends
 
Are You Listening? Real time data and social media
Are You Listening? Real time data and social mediaAre You Listening? Real time data and social media
Are You Listening? Real time data and social media
 
Are you listening? Real Time Measurement and Monitoring
Are you listening? Real Time Measurement and MonitoringAre you listening? Real Time Measurement and Monitoring
Are you listening? Real Time Measurement and Monitoring
 
How to Build Your Future in the Internet of Things Economy. Jennifer Riggins
How to Build Your Future in the Internet of Things Economy. Jennifer RigginsHow to Build Your Future in the Internet of Things Economy. Jennifer Riggins
How to Build Your Future in the Internet of Things Economy. Jennifer Riggins
 
AI, Machine Learning, and their Application for Growth - #GHConf18
AI, Machine Learning, and their Application for Growth - #GHConf18AI, Machine Learning, and their Application for Growth - #GHConf18
AI, Machine Learning, and their Application for Growth - #GHConf18
 
AI and ChatGPT in Online Education
AI and ChatGPT in Online Education AI and ChatGPT in Online Education
AI and ChatGPT in Online Education
 
[DSC Europe 23] Shahab Anbarjafari - Generative AI: Impact of Responsible AI
[DSC Europe 23] Shahab Anbarjafari - Generative AI: Impact of Responsible AI[DSC Europe 23] Shahab Anbarjafari - Generative AI: Impact of Responsible AI
[DSC Europe 23] Shahab Anbarjafari - Generative AI: Impact of Responsible AI
 
Ethical Artificial Intelligence
Ethical Artificial IntelligenceEthical Artificial Intelligence
Ethical Artificial Intelligence
 
Artificial Intelligence 101: What is It and Why is it Suddenly a Big Deal Again?
Artificial Intelligence 101: What is It and Why is it Suddenly a Big Deal Again?Artificial Intelligence 101: What is It and Why is it Suddenly a Big Deal Again?
Artificial Intelligence 101: What is It and Why is it Suddenly a Big Deal Again?
 
The Need for Deep Learning Transparency
The Need for Deep Learning TransparencyThe Need for Deep Learning Transparency
The Need for Deep Learning Transparency
 
Bigger than Any One: Solving Large Scale Data Problems with People and Machines
Bigger than Any One: Solving Large Scale Data Problems with People and MachinesBigger than Any One: Solving Large Scale Data Problems with People and Machines
Bigger than Any One: Solving Large Scale Data Problems with People and Machines
 
Another Day In Paradise
Another Day In ParadiseAnother Day In Paradise
Another Day In Paradise
 
Ar design reality2018
Ar design reality2018Ar design reality2018
Ar design reality2018
 
Using Data for Decisions TechinAsia Singapore 2015
Using Data for Decisions TechinAsia Singapore 2015Using Data for Decisions TechinAsia Singapore 2015
Using Data for Decisions TechinAsia Singapore 2015
 
Machines are the new Digital Natives
Machines are the new Digital NativesMachines are the new Digital Natives
Machines are the new Digital Natives
 
Biting into the Jawbreaker: Pushing the Boundaries of Threat Hunting Automation
Biting into the Jawbreaker: Pushing the Boundaries of Threat Hunting AutomationBiting into the Jawbreaker: Pushing the Boundaries of Threat Hunting Automation
Biting into the Jawbreaker: Pushing the Boundaries of Threat Hunting Automation
 

More from Meddle

Social Selling for Inside Sales Teams
Social Selling for Inside Sales TeamsSocial Selling for Inside Sales Teams
Social Selling for Inside Sales TeamsMeddle
 
Employee-powered Content Marketing for Enterprises
Employee-powered Content Marketing for EnterprisesEmployee-powered Content Marketing for Enterprises
Employee-powered Content Marketing for EnterprisesMeddle
 
Understanding the potential of the Facebook Open Graph and Graph API
Understanding the potential of the Facebook Open Graph and Graph APIUnderstanding the potential of the Facebook Open Graph and Graph API
Understanding the potential of the Facebook Open Graph and Graph APIMeddle
 
Understanding the Open Graph
Understanding the Open GraphUnderstanding the Open Graph
Understanding the Open GraphMeddle
 
Getting Started With Social Media Technologies
Getting Started With Social Media TechnologiesGetting Started With Social Media Technologies
Getting Started With Social Media TechnologiesMeddle
 
Social Media for Business - Presentation for Outsourcing Institute
Social Media for Business - Presentation for Outsourcing InstituteSocial Media for Business - Presentation for Outsourcing Institute
Social Media for Business - Presentation for Outsourcing InstituteMeddle
 
Facebook Pages 101
Facebook Pages 101Facebook Pages 101
Facebook Pages 101Meddle
 
Crowdsourcing 101 - tapping into the wisdom of crowds
Crowdsourcing 101 - tapping into the wisdom of crowdsCrowdsourcing 101 - tapping into the wisdom of crowds
Crowdsourcing 101 - tapping into the wisdom of crowdsMeddle
 
Social Apps 101
Social Apps 101Social Apps 101
Social Apps 101Meddle
 
Brands Can Make Friends Too
Brands Can Make Friends TooBrands Can Make Friends Too
Brands Can Make Friends TooMeddle
 

More from Meddle (10)

Social Selling for Inside Sales Teams
Social Selling for Inside Sales TeamsSocial Selling for Inside Sales Teams
Social Selling for Inside Sales Teams
 
Employee-powered Content Marketing for Enterprises
Employee-powered Content Marketing for EnterprisesEmployee-powered Content Marketing for Enterprises
Employee-powered Content Marketing for Enterprises
 
Understanding the potential of the Facebook Open Graph and Graph API
Understanding the potential of the Facebook Open Graph and Graph APIUnderstanding the potential of the Facebook Open Graph and Graph API
Understanding the potential of the Facebook Open Graph and Graph API
 
Understanding the Open Graph
Understanding the Open GraphUnderstanding the Open Graph
Understanding the Open Graph
 
Getting Started With Social Media Technologies
Getting Started With Social Media TechnologiesGetting Started With Social Media Technologies
Getting Started With Social Media Technologies
 
Social Media for Business - Presentation for Outsourcing Institute
Social Media for Business - Presentation for Outsourcing InstituteSocial Media for Business - Presentation for Outsourcing Institute
Social Media for Business - Presentation for Outsourcing Institute
 
Facebook Pages 101
Facebook Pages 101Facebook Pages 101
Facebook Pages 101
 
Crowdsourcing 101 - tapping into the wisdom of crowds
Crowdsourcing 101 - tapping into the wisdom of crowdsCrowdsourcing 101 - tapping into the wisdom of crowds
Crowdsourcing 101 - tapping into the wisdom of crowds
 
Social Apps 101
Social Apps 101Social Apps 101
Social Apps 101
 
Brands Can Make Friends Too
Brands Can Make Friends TooBrands Can Make Friends Too
Brands Can Make Friends Too
 

Recently uploaded

Build your next Gen AI Breakthrough - April 2024
Build your next Gen AI Breakthrough - April 2024Build your next Gen AI Breakthrough - April 2024
Build your next Gen AI Breakthrough - April 2024Neo4j
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Patryk Bandurski
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsMiki Katsuragi
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Mattias Andersson
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Mark Simos
 
Artificial intelligence in the post-deep learning era
Artificial intelligence in the post-deep learning eraArtificial intelligence in the post-deep learning era
Artificial intelligence in the post-deep learning eraDeakin University
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Enterprise Knowledge
 
APIForce Zurich 5 April Automation LPDG
APIForce Zurich 5 April  Automation LPDGAPIForce Zurich 5 April  Automation LPDG
APIForce Zurich 5 April Automation LPDGMarianaLemus7
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...Fwdays
 
Bluetooth Controlled Car with Arduino.pdf
Bluetooth Controlled Car with Arduino.pdfBluetooth Controlled Car with Arduino.pdf
Bluetooth Controlled Car with Arduino.pdfngoud9212
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr LapshynFwdays
 
Pigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions
 
Key Features Of Token Development (1).pptx
Key  Features Of Token  Development (1).pptxKey  Features Of Token  Development (1).pptx
Key Features Of Token Development (1).pptxLBM Solutions
 
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Wonjun Hwang
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitecturePixlogix Infotech
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr BaganFwdays
 
New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024BookNet Canada
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 3652toLead Limited
 

Recently uploaded (20)

Build your next Gen AI Breakthrough - April 2024
Build your next Gen AI Breakthrough - April 2024Build your next Gen AI Breakthrough - April 2024
Build your next Gen AI Breakthrough - April 2024
 
Hot Sexy call girls in Panjabi Bagh 🔝 9953056974 🔝 Delhi escort Service
Hot Sexy call girls in Panjabi Bagh 🔝 9953056974 🔝 Delhi escort ServiceHot Sexy call girls in Panjabi Bagh 🔝 9953056974 🔝 Delhi escort Service
Hot Sexy call girls in Panjabi Bagh 🔝 9953056974 🔝 Delhi escort Service
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering Tips
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
Artificial intelligence in the post-deep learning era
Artificial intelligence in the post-deep learning eraArtificial intelligence in the post-deep learning era
Artificial intelligence in the post-deep learning era
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024
 
APIForce Zurich 5 April Automation LPDG
APIForce Zurich 5 April  Automation LPDGAPIForce Zurich 5 April  Automation LPDG

SOCIAL TEXT ANALYTICS FOR ENTERPRISE AND CONSUMER APPLICATIONS

  • 1. Presented by Vidar Brekke, Social Intent LLC SOCIAL TEXT ANALYTICS FOR ENTERPRISE AND CONSUMER APPLICATIONS The International Association of Software Architects. October 23, 2012 @ividar #nlproc
  • 2. What is Text Analytics? Processes that uncover business value in unstructured text via the application of statistical, linguistic, machine learning, and data analysis and visualization techniques.
  • 3. Text analytics helps answer business questions faster and cheaper than before, uncovering new, hidden insights!
  • 4. Text analytics is a Big Data problem: Volume, Velocity, Variety. 10.2 million tweets sent during the first presidential debate. Social media, help inquiries, email, texts, surveys. Formal, informal, or ridiculously informal. Hundreds of languages. Cryptic (vertical industry or criminal activity).
  • 5. I’m So Intextuated With You. Unstructured text represents the biggest opportunity and problem in Big Data. Text, as opposed to most other enterprise data, is very dirty data.
  • 6. Correlating consumer confidence with mentions of “jobs” on Twitter
  • 7. Yay! Steve Jobs launches a new iPhone!
  • 8. You can trade on Twitter
  • 9. Low Signal/Noise Ratio + Naïve Metrics Lead to Wrong Conclusions
      • Lack of relevance: Many conversations you think are about you, aren’t.
      • Poor accuracy: Many automated sentiment solutions are as good as a coin flip.
      • Generic: All analysis is applied the same way across domains.
      • Language evolves: Slang and sarcasm are rampant in social media. Dictionary-based approaches are largely ineffective.
  • 10. Relevancy: It’s not all about you. Let me finish my drink before you drive me to the Betty Ford clinic! Call me a bigot, but white guys can’t sprint! #london2012 My husband is such a baby. He won’t even taste raw food. Is Delta’s food prepared by Purina? So much for first class.
  • 11. Search and Destroy (the data you’re looking for). Text analytics got traction in the 80s, but the use cases were different from today’s. “Word spotting”: no different from a Google search. Show me all documents containing: Ford NOT Harrison. But it doesn’t scale.
  • 12. Booleans are like woodcarving with a chainsaw Query: Ford NOT Harrison …. …would miss this tweet Carguy231: Me and a dozen others have lined up outside the Harrison, NY Ford dealership to test drive the new Fusion!
  • 13. Booleans are like woodcarving with a chainsaw Query: Ford AND Fusion…. …would get this tweet Roadrunner123: Stuck with my dad in his ford listening to horrible jazz fusion
  • 14. Sentiment Analysis Early sentiment analysis tools also use word spotting. “Awesome” = good “Sucks” = bad What about sarcasm, slang, new words? Additionally, the analysis is typically on overall contextual polarity, rather than targeted. “I love the new Camaro, it’s better than the Mustang”
  • 15. You can’t use word spotting for sentiment detection “It took all morning to sign the lease papers for my new Mustang!” “I stood on line all morning to get the last Mustang on the lot!” “The brakes on the Mustang are surprisingly unpredictable.” “The TV ads for the Mustang are surprisingly unpredictable!” “The Mustang has never been good” “The Mustang has never been this good”
  • 16. Nu-School text analytics is based on Machine Learning. Using training data to help the system recognize patterns. We develop a statistical probability that a sentence is positive, negative, etc. What are training data? These are samples of text annotated by humans in an effort to show the machine what the right answer is: “I love my iPhone, but hate AT&T” | iPhone | Positive | AT&T | Negative. Much easier and quicker to develop new languages than with dictionary-based approaches.
  • 17. Test: What’s the sentiment here? “Reuters reports that Assad continues the massacre of his own people amid sanctions from the international community.”
  • 18. How to evaluate a text analytics platform The accuracy of a sentiment analysis system is, in principle, how well it agrees with human judgments. “I can’t believe the bar has a hidden gambling room in the back!” An automated system can never be better than humans. Or can it?
  • 19. Using Human Parallel Coding to Establish Gold Standards
    Confusion Matrix: Human as Gold Standard
                POSITIVE  NEGATIVE  NEUTRAL  TOTAL
      POSITIVE       365        24      159    548
      NEGATIVE        57        81       65    203
      NEUTRAL        274        60      415    749
      TOTAL          696       165      639   1500
    Raw Accuracy: 61.5%
    If a human agrees with a machine around 60 percent of the time, the machine would be performing as well as a human being.
  • 20. Using A Credit Matrix to Create Improved Measurement
    Credit Matrix
                POSITIVE  NEGATIVE  NEUTRAL
      POSITIVE      100%        0%      50%
      NEGATIVE        0%      100%      50%
      NEUTRAL        50%       50%     100%
    Partial Credit Figure of Merit: 82.3%
    Confusion Matrix: Human 1 as Gold Standard
                POSITIVE  NEGATIVE  NEUTRAL
      POSITIVE       365        24      159
      NEGATIVE        57        81       65
      NEUTRAL        274        60      415
  • 21. Precision & Recall (sentiment as an example). Precision is the fraction of retrieved instances that are relevant, e.g. how many of the instances labeled as positive were actually positive. Recall is the fraction of relevant instances that are retrieved, e.g. how many positive instances the system detected compared to all positive instances.
  • 22. Top business applications of text/content analytics* (*Alta Plana, 2011)
      • Brand / product / reputation management
      • Market research and social media monitoring, i.e. what are people saying about my brand or products?
      • Voice of the Customer / Customer Experience Management: Do I need to step in and offer customer service? How many people recommend my brand vs. advocate against it?
      • Search, Information Access, or Question Answering: Which bloggers are negative toward Obamacare? Which of the hotels on Yelp.com get great reviews for the room service? What are some articles similar to this one?
      • Competitive intelligence: What competing products are people considering, and why? Is competitors’ media spend generating purchase intent?
  • 23. Growing areas where text analytics is being applied: product development; intelligence, counter-terrorism, and law enforcement; pharmaceutical drug discovery; financial services and insurance; media, publishing & advertising; political research; CRM.
  • 24. Still awake? There is money in text analytics. Here’s a stock tip worth the price of admission alone (YMMV….)
  • 25. Strange Bedfellows Whenever Anne Hathaway's name appeared with any regularity in news stories, Berkshire Hathaway A shares rose in value.
  • 26. Thx & txt u l8tr. Vidar Brekke, vidar@socialintent.com, @ividar #nlproc
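
The Boolean word-spotting failures on slides 12 and 13 are easy to reproduce. The sketch below is only an illustration: `tokens` and `matches` are hypothetical helpers written for this example, not part of any tool named in the talk, and the tokenizer is deliberately naive (lowercase alphanumeric runs). The two tweets are the ones quoted on the slides.

```python
import re


def tokens(text):
    """Lowercase the text and return its set of alphanumeric word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def matches(text, must_have=(), must_not_have=()):
    """Naive Boolean word spotting (hypothetical helper).

    True when every required word is present and no excluded word is present.
    """
    words = tokens(text)
    has_all = all(w.lower() in words for w in must_have)
    has_excluded = any(w.lower() in words for w in must_not_have)
    return has_all and not has_excluded


dealership_tweet = (
    "Me and a dozen others have lined up outside the Harrison, NY "
    "Ford dealership to test drive the new Fusion!"
)
jazz_tweet = "Stuck with my dad in his ford listening to horrible jazz fusion"

# Query "Ford NOT Harrison": the relevant dealership tweet is dropped
# because the town name trips the exclusion (a false negative).
missed = not matches(dealership_tweet, must_have=["Ford"], must_not_have=["Harrison"])

# Query "Ford AND Fusion": the jazz tweet is returned even though it is
# not about the Ford Fusion (a false positive).
false_hit = matches(jazz_tweet, must_have=["Ford", "Fusion"])

print("dealership tweet missed:", missed)
print("jazz tweet wrongly matched:", false_hit)
```

Both flags come out true: the query has no notion of what "Harrison" or "fusion" refers to, which is the point the chainsaw metaphor makes.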

Editor's Notes

  1. The green cells here are where the two coders agree. We can use this to derive a “raw” accuracy score. We add up the total number of instances where the two coders agree (the green cells) and divide by the total number of instances (1500) to get a raw accuracy score of 61.5%. This raw accuracy score provides the first benchmark against which we can assess machine performance. Put concretely, if we can get a machine to classify documents for sentiment where a human would agree with its classifications around 60 percent of the time, our machine would be performing as well as a human being.
  2. Remember, we said before that not all mistakes are equal. It depends on the use to which you’re putting the data. In most situations, however, it’s worse to mislabel something positive as negative than it is to mislabel something positive as neutral. This is true for both human and machine coders. We can factor in these relative weights by using what is called a Credit Matrix. This says that you get 100% when your label agrees with the gold standard. Ultimately, the Partial Credit Figure of Merit (PCFM) will establish the baseline against which we measure the performance of our machine learning algorithm.
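
The three measures discussed on slides 19 to 21 can be sketched in a few lines of Python. This is a minimal illustration, not the presenter's tooling: the function names are hypothetical, and the row/column orientation (rows = gold-standard coder, columns = coder being scored) is an assumption, since the slides do not label the axes. Orientation only affects precision and recall; raw accuracy and the partial-credit score are symmetric here. Note that with the cell counts exactly as they appear in the transcript, the code yields 57.4% raw accuracy and a 76.0% partial-credit score rather than the 61.5% and 82.3% quoted on the slides, so the headline figures presumably come from counts not shown here.

```python
LABELS = ["POSITIVE", "NEGATIVE", "NEUTRAL"]

# Cell counts as transcribed from the slide.
# Assumption: rows = gold-standard coder, columns = coder being scored.
CONFUSION = [
    [365, 24, 159],
    [57, 81, 65],
    [274, 60, 415],
]

# Credit matrix from slide 20: full credit for agreement, half credit when
# exactly one of the two labels is NEUTRAL, none for positive/negative swaps.
CREDIT = [
    [1.0, 0.0, 0.5],
    [0.0, 1.0, 0.5],
    [0.5, 0.5, 1.0],
]


def raw_accuracy(confusion):
    """Share of instances on the diagonal, i.e. where both coders agree."""
    total = sum(sum(row) for row in confusion)
    agreed = sum(confusion[i][i] for i in range(len(confusion)))
    return agreed / total


def partial_credit(confusion, credit):
    """Credit-weighted agreement: each cell count times its credit, over the total."""
    total = sum(sum(row) for row in confusion)
    earned = sum(
        confusion[i][j] * credit[i][j]
        for i in range(len(confusion))
        for j in range(len(confusion))
    )
    return earned / total


def precision_recall(confusion, k):
    """Precision and recall for class index k under the assumed orientation."""
    true_pos = confusion[k][k]
    labeled_k = sum(row[k] for row in confusion)  # column total: scored as k
    actually_k = sum(confusion[k])                # row total: gold says k
    return true_pos / labeled_k, true_pos / actually_k


print(f"raw accuracy:   {raw_accuracy(CONFUSION):.1%}")
print(f"partial credit: {partial_credit(CONFUSION, CREDIT):.1%}")
for k, label in enumerate(LABELS):
    p, r = precision_recall(CONFUSION, k)
    print(f"{label:<9} precision={p:.3f} recall={r:.3f}")
```

Swapping in a different credit matrix (for example, penalizing positive/negative swaps more heavily) only requires changing `CREDIT`; the scoring functions stay the same.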