Enterprise Applications of Text Intelligence - Lecture slides

Text analytics can be used in business for various purposes. Business managers and students should have a clear idea of the use cases and a sound general understanding of the technical basics in order to be competent in business innovation and development. This set of slides (excerpts) is my approach to teaching the subject. Comments welcome.

Published in: Business

  1. 1. 7,814 Enterprise Applications of Text Intelligence Autumn Term ‘15 – Prof. Dr. Andrea Back, IWI-HSG
  2. 2. 1 Kick-off
  3. 3. Simple applications that you already know
  4. 4. 3 Google Trends
  5. 5. 4 Wordle
  6. 6. 5 Social Media Text Analytics
  7. 7. 6 Turnitin: OriginalityCheck
  8. 8. 7 Google Ngram Viewer: Religion vs. Science & Freedom vs. Justice. With Google's new tool Ngram Viewer, you can visualize the rise and fall of particular keywords across 5 million books and 500 years.
  9. 9. 8 Google Ngram Viewer: See how big cocaine was in Victorian times
  10. 10. Many text analytics users don’t understand that they’re doing text analytics, nor do they need to. (Alta Plana 2014)
  11. 11. 10 Definition and State of Practice Text analytics, applied to social, online, and enterprise data, aims to extract useful information and create usable insights for business, personal, government, and research ends. While not every analytical application directly involves text, every task – including analyses of “machine data” and transactional records – may be enriched by the inclusion of text-sourced information. There is no single or typical text analytics user, application, technology, or solution. Users and uses vary by industry, business function, information source, and goal. Source: Grimes, 2014
  12. 12. We focus on Enterprise Applications: New reporting enables fast detection of issues and trends. [Bar chart, Swisscom Call Center: number of messages (axis 0 to 12,000) per issue: order of stations (Senderreihenfolge), image interference (Bildstörung), Internet speed (Internetgeschwindigkeit), Internet failure (Internetausfall), router.] Source: Del Ponte et al., 2015
  13. 13. ABB Use Case
  14. 14. ABB Content Health Panel: Some facts about ABB • ~150,000 employees • Present in 100+ countries • Formed in 1988 by the merger of Swiss (BBC, 1891) and Swedish (ASEA, 1883) engineering companies • $42b in revenue (2013)
  15. 15. ABB Content Health Panel: Biggest challenge for improving content effectiveness • Directives & guidelines do not drive change • 1 million pages • 250,000 products • 3,500 editors • 27 languages • 20 web applications • 60 country web sites • How to drive scalable change?
  16. 16. Course structure
  17. 17. Our Schedule (Date & Time | Room | Unit)
      14.09.2015, 14:15 - 16:00 | 23-203 | 1 – Kick-off and Introduction
      21.09.2015, 14:15 - 16:00 | 01-111 | 2 – Webinar Topic Analyst
      28.09.2015, 14:15 - 16:00 | 23-203 | 3 – Overview of Use Cases and Technical Basics; Guest lecture S. Paulutt, CID
      12.10.2015, 14:15 - 16:00 (alternative 16:15 – 18:00) | 23-203 | 4 – Technical Basics and Terminology
      19.10.2015, 14:15 - 16:00 | 23-203 | Written exam
  18. 18. Our Schedule (Date & Time | Room | Unit)
      26.10.2015, 14:15 - 16:00 | 01-307 | 5 – Guest lecture T. Lehr: Social Media Monitoring
      09.11.2015, 14:15 - 18:00 | EXT | 6 – Google Zuerich (Google@Zue – Office Tour – Advanced Search – Industry Analytics)
      16.11.2015, 14:15 - 16:00 | EXT | 7 – Swisscom Innovation Zuerich
      23.11.2015, 14:15 - 18:00 | EXT | 8 – AXA Winterthur, Winterthur
      30.11.2015, 14:15 - 18:00 | EXT | 9 – SIKA, Urdorf
      07.12.2015, 14:15 - 18:00 | 23-203 | 10a – Guest lecture Dr. Scholer, Audi; 10b – From TA-Idea to Adoption; 10c – Guest lecture Chr. Smiela, Swiss Re L&H; 10d – Course wrap up and final steps
  19. 19. Themes of the course: End users and line-of-business people as target groups. Didactical methodology: Persons, Documents, Software. Themes: Motivation • Technology Basics • Use Cases for Text Analytics & Benefits for Business • Software Tools • Application of Knowledge in Case Studies
  20. 20. Introduction of course participants
  21. 21. www.aback.iwi.unisg.ch
  22. 22. Students
  23. 23. Guest Lecturers and Host Companies: Tobias Lehr, ex Goldbach Int. (26.10.) • Dr. Florian Hamel, Axa Winterthur (23.11.) • Christian Frey, Sika (30.11.) • P. Warnking • Stephanie Paulutt • Christoph Smiela, Swiss Re
  24. 24. Organizational issues
  25. 25. Your personal grade results from a total of 100 points: • 40 individual points: written examination • 60 group points: 30 points Groupwork 1 and 30 points Groupwork 2 (all group members are given the same grade) • 5 bonus points
  26. 26. 25 What earns you points: Exam (40 points) • Lectures and obligatory readings: see reading list • Evaluation of learning outcome: comprehension of content; communication and presentation of results • Learning goals: build the expertise required on the end-user side for competent use; understand the use cases for text analytics; learn the basics of the technology; be able to administer and support text analytics products as a line-of-business user
  27. 27. What earns you points: 2 x Groupwork (60 points) • Excursions and case studies • Evaluation of learning outcome: comprehension of content; application of what you have learned to real cases; creativity and self-initiative; communication and presentation of results • Learning goals: consolidate your theoretical knowledge by applying it to real cases; gain practical insight, get to know experts, and learn how they solve business problems using text analytics
  28. 28. Process and due date for assignment to 1 team and 2 excursions that are credit-relevant for each team: to be announced via StudyNet
  29. 29. Planning your time (3 credits – 90 hours): our estimate (Task | Estimated hrs | Actual hrs)
      1 – Kick-off and Introduction | 7 |
      2 – Webinar Topic Analyst | 5 |
      3 – Overview of Use Cases and Technical Basics, including reading assignment | 8 |
      4 – Technical Basics and Terminology; 5 – Guest Lecture Social Media Monitoring | 6 |
      Exam | 8 |
  30. 30. Planning your time (3 credits – 90 hours): our estimate (Task | Estimated hrs | Actual hrs)
      6/7/8/9 – Enterprise Application Visit (I. of 4) | 13 |
      6/7/8/9 – Enterprise Application Visit (II. of 4) | 13 |
      10 – Guest lecture Audi AG, From TA-Idea to Adoption, Guest lecture SwissRe, Course wrap up and final steps | 6 |
      Groupwork completion and finalizing excursion reports | 24 |
  31. 31. Readings – Obligatory – Recommended
  32. 32. What to read and where to find it • Obligatory: Gartner. (2014). Technology Overview for Text Analytics. Gartner. (2012). Who's Who In Text Analytics, 1-5. Mapegy. (2013). Solar Technology: Asia's Innovations are Pushing Into the Fast Lane. • Recommended: Gartner. (2015). Four Data Preparation Challenges for Text Analytics. Sack, H. (2014). Knowledge Engineering with Semantic Web Technologies. • The texts can be found on StudyNet.
  33. 33. Motivation & Relevance
  34. 34. 33 There exists a significant shortage of text analytics talent within both IT departments and business units.
  35. 35. Many business analytics leaders do not realize that certain kinds of business problems can be resolved by text analytics.
  36. 36. 35 Organizations’ lack of focus on nontraditional and unstructured data sources as valuable resources hinders enterprise-wide text analytics adoption and therefore remains a significant blind spot for most enterprises.
  37. 37. 36 Text Analytics is expected to enter the Slope of Enlightenment stage within the next few years because it will become more prevalent and because of the opportunities to be realized.
  38. 38. Gartner Analysts Found (2014): Fragmented Provider Market. Categories and examples: • Small start-ups • Established technology and solution companies • Large, global information technology brands
  39. 39. Gartner Analysts Found (2014): State of the Provider Market. A fragmented market and terminology: • Software Tools • Natural Language Processing • Text Analysis Workbenches • Social Analytics Dashboards • Integrated Data Analysis Environments • Solution Embedded Technologies
  40. 40. Your To-dos for the next session: 21.09.2015
  41. 41. To-do list: Read the CID website, with a focus on Topic Analyst, as preparation for the upcoming guest lecture
  42. 42. Bonus points: – Post questions about the CID software on StudyNet – No later than Thursday noon
  43. 43. 7,814 Enterprise Applications of Text Intelligence Autumn Term ‘15 – Prof. Dr. Andrea Back, IWI-HSG
  44. 44. 2 Webinar
  45. 45. 4 CID Topic Analyst • Click the image; on the webpage, bottom left, listen to the audio (1 min)
  46. 46. 5 CID Topic Analyst Companion for Mobile • Click the image; on the webpage, bottom left, listen to the audio (1 min)
  47. 47. 6 CID Topic Analyst and Companion for Mobile: Recommended brochures • http://cid.com/images/mediacenter/broschures/topic-analyst.pdf • http://cid.com/images/mediacenter/broschures/topic-analyst-companion-for-mobile.pdf
  48. 48. Demo Videos
  49. 49. 8 CID Topic Analyst Benefits Live!
  50. 50. 9 CID Topic Analyst QuickDemo Aerospace
  51. 51. 10 Webinar on CID Topic Analyst: Login data • Adobe Connect for video: https://meet72195230.adobeconnect.com/topicanalystunistgallen/ • Conference call for audio: number 0800 89 00 93, participation code 73 42 03 04 • The webinar will be recorded and published on StudyNet.
  52. 52. Your to-dos for the next session: 28-Sep-2015
  53. 53. 3 (of 5) Bonus points: – Post questions about the webinar on StudyNet – No later than Thursday noon
  54. 54. 7,814 Enterprise Applications of Text Intelligence Autumn Term ‘15 – Prof. Dr. Andrea Back, IWI-HSG
  55. 55. 3 Overview Use Cases & Technology
  56. 56. Guest Lecture S. Paulutt CID Consulting (see separate slideset on StudyNet)
  57. 57. Student Bonus Q&A (ABack)
  58. 58. 5 Student Bonus Q&A – 1 (out of 10) Mivelaz Vincent-Frédéric If you had to name one weakness regarding the Topic Analyst tool, what would it be?
  59. 59. 6 Competitive Intelligence (CI). Project Seminar: Enterprise 2.0 and Mobile Business, Universität St.Gallen. Comparison of the Topic-Analyst-based Analysis with the Current Approach, Using Digital Banking CI as an Example. Project team: 4 HSG students, Master of Business Innovation. Supervision HSG: Christian Ruf, Chair of Prof. Dr. Andrea Back, Institut für Wirtschaftsinformatik (IWI). Supervision CID: Stephanie Paulutt. St. Gallen, March 19th, 2014
  60. 60. 7 Current Practice Processes vs. Topic-Analyst-based Process • Time required • Quality of output • Quality of work/information flow
  61. 61. 8 Topic-Analyst-based process (TA: Topic Analyst; Use Case 1: Monitoring; Use Case 2: Deep Dive). Roles: Topic Analyst, Analyst. Steps: News provided by crawler • Quick dashboard monitoring (Use Case 1) • Selection of topics for the Deep Dive • Doing the Deep Dive (Use Case 2) • Presenting the results. Notes: TA supports Deep Dive • Supervised Topics support monitoring • Comments and tagging of articles in TA
  62. 62. 9 Monitoring – A challenge through the TA-process Tags are very useful for collaborative work, but they need to be predefined and used consistently!
  63. 63. 10 Deep Dive – A suggested new feature: Documents, once in the corpus, lose their links. Improvement: ability to import external documents. [Diagram: before vs. with Topic Analyst; link; Topic Analyst; original (if found via Google, e.g.)]
  64. 64. 11 Student Bonus Q&A - 2 Bernauer Patrick How can a customer analyse the effectiveness of text analytics? What are typical measurements/key figures?
  65. 65. 12 Can text analysis effectively assess the quality of «ideas» submitted to online idea contests?* A “typical” instance of an ideation contest: • 430 users submitted • 725 ideas • or 42,094 words • or 113 pages of plain text (in Arial, 10 pt). Source: MySQL dump from a crowdsourcing platform. * PhD thesis Thomas Walter 2013
  66. 66. 13 How to define ideation quality? Idea quality usually consists of four distinct dimensions, but the most important one in crowdsourcing (according to the literature) is novelty: Novelty • Feasibility • Elaboration • Strategic relevance. Ideas should be… • unique and rare, • original and not yet expressed by anybody, • not related to others, • revolutionary and radical, • able to surprise, • imaginative and unexpected.
  67. 67. 14 Text Mining Methodology: Text mining is the semi-automated process of extracting patterns (useful information and knowledge) from large amounts of unstructured data. Text mining works by converting words and phrases in unstructured data, such as submissions to crowdsourcing websites, into numerical values (pre-processing), which can then be analyzed with data mining techniques. Pre-Processing • Tokenization • Stemming • Part-of-speech tagging • Stop-word removal • Term Document Matrix. Text Mining (novelty) • Frequency analysis • Categorization • Language recognition • Clustering • Sentiment Analysis. Visualization • Tag Clouds • Tree Maps • Association graphs • Theme River • Coloring
  68. 68. 15 The corpus is deconstructed during pre-processing. Example: “I like computing. But I hate computing on old computers.” • Tokenization: breaking a stream of text up into words, phrases (here) or symbols: 1: I like computing. 2: But I hate computing on old computers. • Stemming: reducing inflected words to their stem: 1: i like comput 2: but i hate comput on old comput • Stop-word cleaning: words on a predefined list of so-called stop words are deleted: 1: like comput 2: hate comput old comput • Part-of-speech tagging: marking up each word with its part of speech, based on both its definition and its context: 1: like (verb) comput (verb) 2: hate (verb) comput (verb) old (adjective) comput (verb)
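
The pre-processing steps on the slide above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `toy_stem` is a crude suffix stripper and `STOP_WORDS` is a hand-picked list, both invented here so that the output matches the example sentence. Real projects would use an established stemmer and stop-word list. Part-of-speech tagging is left out because it needs a trained tagger.

```python
import re

# Hypothetical stop-word list, just large enough for the example sentence.
STOP_WORDS = {"i", "but", "on"}

def tokenize(text):
    """Split the text into sentences, then each sentence into lowercase word tokens."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    return [re.findall(r"[a-z]+", s.lower()) for s in sentences]

def toy_stem(word):
    """Crude suffix stripping (an assumption, not a real stemming algorithm)."""
    for suffix in ("ing", "ers", "er", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def preprocess(text):
    """Tokenize, stem, and remove stop words; returns one token list per sentence."""
    result = []
    for tokens in tokenize(text):
        stems = [toy_stem(t) for t in tokens]
        result.append([s for s in stems if s not in STOP_WORDS])
    return result

docs = preprocess("I like computing. But I hate computing on old computers.")
print(docs)
```

For the example sentence, `docs` holds the two stop-word-cleaned token lists from the slide: `like comput` and `hate comput old comput`.
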
  69. 69. 16 TDM is the basis of text mining 1. “I like computing. But I hate computing on old computers.” 2. Term Document Matrix (TDM): describes the frequency of terms that occur in a collection of texts (the corpus). In a TDM, rows correspond to documents (D) in the collection and columns correspond to terms (T). 3. The TDM is the basis for applying text mining algorithms, e.g.: – Word-frequency lists – Text categorization – Clustering/pattern search – Language recognition – Sentiment analysis
      TDM | like | comput | hate | old
      pos | verb | verb | verb | adj
      d1 | 1 | 1 | 0 | 0
      d2 | 0 | 2 | 1 | 1
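
As a minimal sketch of how the Term Document Matrix above could be computed, the following Python snippet counts each term per document. The function name `term_document_matrix` is made up for this example, and the input is assumed to be already tokenized, stemmed and cleaned of stop words.

```python
from collections import Counter

def term_document_matrix(docs):
    """Return (terms, rows): one row of term counts per document."""
    terms = []
    for doc in docs:
        for term in doc:
            if term not in terms:
                terms.append(term)  # keep first-seen order for readable columns
    rows = []
    for doc in docs:
        counts = Counter(doc)
        rows.append([counts[term] for term in terms])
    return terms, rows

# Pre-processed tokens of the two example sentences (stemmed, stop words removed).
d1 = ["like", "comput"]
d2 = ["hate", "comput", "old", "comput"]

terms, tdm = term_document_matrix([d1, d2])
print(terms)
for name, row in zip(("d1", "d2"), tdm):
    print(name, row)
```

Rows are documents and columns are terms, as on the slide. Once the text is in this numerical form, frequency lists, clustering or classification can operate on the rows.
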
  70. 70. 17 Automatic selection / recommendation. Idea 1: Select only single-case clusters
  71. 71. 18 Automatic selection / recommendation • Example: Contest of a clothing manufacturer looking for self-cooling cloth for the Olympics. • Single-item cluster defined by an idea suggesting “superabsorbent polymers (called SAPs)”.
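
The idea of selecting only single-item clusters can be illustrated with a small Python sketch. The slides do not say which similarity measure or clustering algorithm was used, so this example swaps in a simple stand-in: Jaccard similarity on token sets with an arbitrary threshold. The idea texts are invented for illustration and are not data from the contest.

```python
def jaccard(a, b):
    """Jaccard similarity of two token collections: |a ∩ b| / |a ∪ b|."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

def single_item_ideas(ideas, threshold=0.2):
    """Return the ids of ideas that are not similar to any other idea."""
    singles = []
    for idea_id, tokens in ideas.items():
        best = max(
            (jaccard(tokens, other)
             for other_id, other in ideas.items() if other_id != idea_id),
            default=0.0,
        )
        if best < threshold:  # no neighbour is close enough: a cluster of one
            singles.append(idea_id)
    return singles

# Invented, already pre-processed idea texts (hypothetical data).
ideas = {
    "A": ["cool", "fabric", "mesh", "vent"],
    "B": ["mesh", "vent", "fabric", "light"],
    "C": ["cool", "fabric", "vest", "vent"],
    "D": ["super", "absorb", "polymer", "sap"],
}
print(single_item_ideas(ideas))
```

In this toy data set only idea D shares no vocabulary with the others, so it is the one flagged as a candidate for novelty.
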
  72. 72. 19 Student Bonus Q&A - 3 Sukula Heini Elina On average, which source of information (e.g. web sites, social media, enterprise software) is the most used one for text analysis?
  73. 73. 20
  74. 74. 21
  75. 75. 22 A Model to help define relevant social media Practice Project 2011, using IFA as an example … – … - description – CI relevance
  76. 76. 23 Student Bonus Q&A - 4 Balzer Lukas How is the legal situation with web scraping? How do you deal with it?
  77. 77. 24
  78. 78. 25
  79. 79. 26 It is «complicated» – and (Swiss) law-in-progress. Details and current legal cases would be worth a BA or MA thesis • No dedicated copyright law for databases • Protection under the law for «Sammelwerke» (collective works) is difficult • Prosecuting parasitic exploitation of others’ databases is very difficult, and such lawsuits are very risky • Courts give great weight to the freedom to «copy/imitate» • Whether the law against unfair competition/practices (UWG) is effective needs verification • Contractual solutions are encouraged
  80. 80. 27 Student Bonus Q&A - 5 Cantù Rossana Violante Where do you get the data from? Does Corpus need to buy any (additional) data packages? ABack @SPau: «Could you provide, e.g., 5 data sources that you use for FREE (e.g. Wikipedia), and 5 data sources that you have to pay for, and how much (e.g. a French dataset that is Wikipedia-like)?» SPau: «Sure, I will prepare a list.»
  81. 81. 28 Student Bonus Q&A - 6 Högdahl Louise Maria Irene What do CID’s customer relationships look like? Is Topic Analysis something your clients use during a short time period or do they get more interested as they start to use it?
  82. 82. 29 IFA 2011 pilot - turned into a regular customer. Thrust of the 2011 pilot for IFA: Competitive Analysis • Number of articles in media (traditional and social media) • Associations with vendors and current topics. Trend Analysis • Current trends in consumer electronics. Ad hoc Analysis • Media response to a press conference. 2016 update May 2011
  83. 83. 30 Student Bonus Q&A - 7 Fontugne Louise What are the issues with the law for companies to analyse data? Are there data that companies are not allowed to have/ analyse? What could be the sanctions if companies have data that they are not supposed to own?
  84. 84. 31 Monitoring Employees’ Email: The legal situation in Germany and Switzerland** (details and current practices would be worth a BA or MA thesis). In general, systematic monitoring of the entire email traffic is not allowed. It is considered undue. For detecting abuse, for example, milder methods like sampling are a similarly effective approach. Switzerland • Monitoring employees’ private email is absolutely forbidden. Even if the employer forbids private emails, the employer is not allowed to “read” emails that are labeled as private. To enforce the prohibition, only the subject line of emails may be read. • Monitoring/reading business email is allowed if it is reasonable, and employees must be informed beforehand. Germany • If private email is not allowed, the employer may monitor all email, unless it is marked as private. • If private email is allowed, the employer is treated like a telecom provider and may not monitor email (Telekommunikationsgesetz, Fernmeldegeheimnis). ** Quick assessment by an HSG law student, and what we could find. HSG expert: Prof. Dr. Roland Müller, Titularprofessor für Arbeitsrecht
  85. 85. 32 Student Bonus Q&A - 8 Rivera Caballero Jaime Alejandro You also have a product called Topic Analyst for Mobile. This app is only available for iOS. Is that because it fits better with your target market, or are you also considering including Android systems in the future? ABack @SPau: «What criteria are relevant for the decision of which app store to use?» SPau: «Providing an app is still a competitive advantage, as it helps to differentiate our solution from other vendors’. Regarding Android, we follow customer demand. As soon as more users demand it, we will provide it.»
  86. 86. 33 Student Bonus Q&A - 9 Jüllig Hanna Kristina Elisabeth According to your website, with Topic Analyst companies can collect information from a huge number of channels, but at the same time only get "the relevant" information. Who/what is it that really decides what information is relevant? And how to ensure your clients' technological maturity? «See the above-mentioned model to define relevant social media channels. The choices are usually made in a consultative process with the customer.» Another interesting example of how to define «relevant information» is our project to use text mining to help filter relevant feedback given via the Lufthansa app.
  87. 87. 34 Customer Feedback via Lufthansa Mobile App: Can text mining help to detect relevant customer feedback automatically? *
  88. 88. 35 Project Result – Text Mining Dashboard
      Task | Precision | Recall* | F1*
      Categorization | 38 % | 61 % | 46 %
      Speech Recognition | 54 % | 92 % | 68 %
      Sentiment Recognition | 51 % | 83 % | 63 %
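
The F1 column can be related to the other two columns: F1 is the harmonic mean of precision and recall, F1 = 2 · P · R / (P + R). The following sketch recomputes it from the rounded percentages on the slide. Because the inputs are already rounded, the recomputed values can differ from the printed ones by up to about one percentage point.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Precision, recall and F1 as printed on the slide (rounded to whole percent).
reported = {
    "Categorization": (0.38, 0.61, 0.46),
    "Speech Recognition": (0.54, 0.92, 0.68),
    "Sentiment Recognition": (0.51, 0.83, 0.63),
}

for task, (p, r, f1_on_slide) in reported.items():
    print(f"{task}: computed F1 = {f1_score(p, r):.3f}, slide F1 = {f1_on_slide:.2f}")
```

The harmonic mean is pulled toward the smaller of the two values, which is why high recall cannot compensate for low precision in the first row.
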
  89. 89. 36 Student Bonus Q&A - 10 Cyriax Jörg Stephan Do you work together with computer scientists from universities to improve your product? SPau: «Yes, we have a research partner, Prof. Dr. Gerhard Heyer of the University of Leipzig. http://asv.informatik.uni-leipzig.de/staff/Gerhard_Heyer They do not develop software components. In my guest lecture, I will give insights into a current project.»
  90. 90. 7,814 Enterprise Applications of Text Intelligence Autumn Term ‘15 – Prof. Dr. Andrea Back, IWI-HSG
  91. 91. More Use Cases
  92. 92. Use Cases: Clustering by application area*. Customer service and improved productivity are the most widely distributed. Distribution of application areas (#): Competitive Intelligence 17 • Customer Service 28 • Improved User Productivity 31 • Operation Excellence 17 • Risk and Fraud Management 6. * Based on: Yuen, D., Linden, A., & Koehler-Kruener, H. (2014). Technology Overview for Text Analytics. Gartner Inc.
  93. 93. Competitive/ Market Intelligence
  94. 94. 5 „Wiesenhof“ Lawyers find „Wiesengüggel“ and start a brand protection battle http://www.bauernzeitung.ch/sda-archiv/2015/streit-um-marke-wiesengueggel-beigelegt/
  95. 95. 6 mapegy The Innovation Graph - mapping global innovation for everyone Tobias Wagner - mapegy GmbH wagner@mapegy.com +49 (0)30 430 2212 0 www.mapegy.com
  96. 96. 7 Innovation Graph: Retrieving, mapping and evaluating the global technology dynamics ● more than 5 million institutions (companies, universities, more than 100K startups) ● more than 50 million experts ● billions of networks, cooperations and clusters ● billions of topics and technology fields ● more than 20 years of trends ● more than 200K locations. On the basis of ● more than 100 million patents from >150 patent offices ● more than 100 million scientific publications ● more than 10K technical standards ● more than 1 million press, product releases and social media publications daily ● and millions of websites. Solution behind solution
  97. 97. 8 USE CASES: Customer success stories from diverse industries
      Use Case | Topic | Industry | Project type
      Technology Scouting | Various questions related to innovation management (see use cases below) | Automobile supplier | mapegy.scout (users: 2)
      Stakeholder monitoring | Analysis of competitors’, suppliers’, customers’ know-how & R&D activities in autonomous driving technologies | Automobile/ICT industry | mapegy.radar (users: 5)
      Trend analysis | Identification of technology trends & new technology fields in the field of clinical management systems | Health care industry | mapegy.radar (users: 3)
      M&A, Headhunting | Identification of potential acquisition candidates & domain experts in display technologies | Consumer electronics | mapegy.radar (users: 3)
      Portfolio analysis | Evaluation of client’s fuel cell IP portfolio (strengths/weaknesses) in order to buy missing know-how | Automobile | mapegy.radar (users: 2)
      New business field exploration | 360° analysis of technologies, trends, stakeholders in pipeline technologies | Oil & gas industry | mapegy.radar (users: 10)
      Strategic Product Management | Monitoring all environment-relevant features of competitors’ cars | Automobile | mapegy.radar (users: 10)
  98. 98. 9 USERS: Everyone pushing technology • Technology, innovation & portfolio managers who want to develop their technology stack • Investors, M&A & business analysts who want to identify & evaluate new technology business opportunities • Researchers & developers who want to explore new technology frontiers • IP professionals who want to make the best of their IP • Government organizations that support regional development • HR managers who are in search of the best technology experts • Sales & market researchers or product developers who want to better understand technologies relevant to their markets or products
  99. 99. 10 BENEFITS: Weeks of laborious research are condensed into minutes or even seconds • Having a finger on the pulse of time & staying ahead of competition • Know market, competition, dynamics • Map technologies with products • Identify right people & partners • Faster to market • Identify threats & opportunities • Save costs
  100. 100. 13 10 Locate institutions and experts globally/locally
  101. 101. 14 Explore their cooperation 11
  102. 102. 15 12 Analyze rankings for various KPIs
  103. 103. Customer Service – Voice of the Customer «B2C, B2B»
  104. 104. Text Analytics at Swisscom: Initial situation. Signalling: Clients call Swisscom and describe a problem. Recording: Agents record issues as “service requests”. Reporting: Monthly manual analysis (based on a cutoff date) of the service requests using tally sheets
  105. 105. Text Analytics at Swisscom: Initial situation • Unreliable • Resource-intensive • Non-standardized • Incomplete • Time-delayed
  106. 106. Text Analytics at Swisscom: Business Requirements. Stakeholder needs: 1. Short periodicity & regular analyses 2. Detailed & granular problem clusters 3. Little effort 4. Automation. Solution: Initiation of a project to analyze service requests automatically and promptly (near real time). Text Analytics provides the technical foundation for the project
  107. 107. Text Analytics at Swisscom: What is possible through technology ... SAS Text Analytics enables end-to-end automation of the analysis process for service requests, daily and automatically: 1. Data load (daily basis) 2. Language recognition (DE, FR, IT, EN?) 3. Block building (units of meaning) 4. Synonym recognition (Remote = Zapper) 5. Categorization (Remote belongs to TV) 6. Evaluation (trending topics / comparison with previous day or period)
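
Steps 4 to 6 of this pipeline can be sketched in plain Python. This is not how SAS Text Analytics works internally; it is a toy illustration under stated assumptions. Only "Zapper = Remote" and "Remote belongs to TV" come from the slide. All other dictionary entries, the function names and the sample requests are invented. Language recognition and block building are omitted.

```python
from collections import Counter

# Hypothetical lookup tables; only "zapper -> remote" and "remote -> TV"
# are taken from the slide, every other entry is invented for illustration.
SYNONYMS = {"zapper": "remote", "wlan": "wifi"}
CATEGORIES = {"remote": "TV", "channel": "TV", "router": "Internet", "wifi": "Internet"}

def categorize(request):
    """Map one service-request text to the set of categories it mentions."""
    found = set()
    for word in request.lower().split():
        word = word.strip(".,!?")
        word = SYNONYMS.get(word, word)   # step 4: synonym recognition
        if word in CATEGORIES:            # step 5: categorization
            found.add(CATEGORIES[word])
    return found

def daily_counts(requests):
    """Count how many requests of one day fall into each category."""
    counts = Counter()
    for request in requests:
        counts.update(categorize(request))
    return counts

def trending(today, yesterday):
    """Step 6: change in request volume per category versus the previous day."""
    return {c: today[c] - yesterday[c] for c in sorted(set(today) | set(yesterday))}

yesterday = daily_counts(["Router keeps rebooting", "Zapper does not respond"])
today = daily_counts(["Zapper broken", "Remote lost", "Channel order is wrong", "Wlan is slow"])
print(trending(today, yesterday))
```

With these sample requests the TV category rises by two compared with the previous day while Internet stays flat, which is the kind of signal the daily evaluation is meant to surface.
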
  108. 108. Text Analytics at Swisscom: What changes ... New reporting enables fast detection of issues and trends. [Bar chart: number of messages (axis 0 to 12,000) per issue: order of stations (Senderreihenfolge), image interference (Bildstörung), Internet speed (Internetgeschwindigkeit), Internet failure (Internetausfall), router.]
  109. 109. Customer Service «Social Media Monitoring and Sentiment»
  110. 110. 23 Some 101 of Social Media Monitoring: Brand Monitoring. Easy. Slide: Courtesy of Dr. Thomas Walter, HSG alumnus (currently Namics AG)
  111. 111. 24 … Some 101 of Social Media Monitoring: Brand Monitoring. Mediocre. Slide: Courtesy of Dr. Thomas Walter, HSG alumnus (currently Namics AG)
  112. 112. 25 Some 101 of Social Media Monitoring: Brand Monitoring. Hard. Slide: Courtesy of Dr. Thomas Walter, HSG alumnus (currently Namics AG)
  113. 113. Challenges of Monitoring Social Media • Posts are distributed via various channels (there is no standard way of complaining, commenting and so on). • Posts are heavily connected (“retweets”, “likes” etc. cause redundancy). • Often responses take place on other channels. • Social media platforms undergo continuous change. • Significance of posts is context specific. • Language can be highly complex to measure (irony, sarcasm, dialect, wordplay, language barriers, etc.). • But most important: Do you know what you are looking for? Slide: Courtesy of Dr. Thomas Walter, HSG alumnus (currently Namics AG)
  114. 114. Social Media Monitoring Tradeoff: open terms with low boundaries (e.g. “Post”) versus explicit terms with high classification (e.g. “US postal service, correspondence”: description used in relevant mentions?!). Posts found: “The service of the post sucks so much, …” “Read nice blog post from…” “Guess what was in my post box today?” Slide: Courtesy of Dr. Thomas Walter, HSG alumnus (currently Namics AG)
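
The tradeoff on this slide can be made concrete with a small Python sketch that runs an open query and a more explicit query over the three example posts. The `context` word list in `explicit_query` is a made-up assumption. A real monitoring tool would offer its own query syntax.

```python
import re

# The three example posts shown on the slide (ellipses written as "...").
posts = [
    "The service of the post sucks so much, ...",
    "Read nice blog post from...",
    "Guess what was in my post box today?",
]

def open_query(post):
    """Open term: any mention of the word 'post'."""
    return re.search(r"\bpost\b", post.lower()) is not None

def explicit_query(post):
    """Explicit terms: 'post' plus a context word (hypothetical list)."""
    context = ("service", "delivery", "parcel")
    text = post.lower()
    return open_query(post) and any(word in text for word in context)

print([p for p in posts if open_query(p)])
print([p for p in posts if explicit_query(p)])
```

The open query returns all three posts, including the irrelevant blog post. The explicit query returns only the first one and drops the "post box" message, which is arguably about postal service too. Narrower terms buy precision at the cost of missed mentions.
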
  115. 115. Classification and sentiment analysis are key text mining techniques to leverage social media monitoring. Sentiment140 – Twitter Sentiment Analysis. Slide: Courtesy of Dr. Thomas Walter, HSG alumnus (currently Namics AG)
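
To give a feeling for what sentiment analysis does, here is a deliberately simple lexicon-based scorer in Python. It is a sketch only and says nothing about how Sentiment140 or any other tool works internally. The `LEXICON` is a tiny invented word list, and trained classifiers are generally used in practice.

```python
# Tiny hypothetical sentiment lexicon; real lexicons contain thousands of entries.
LEXICON = {"nice": 1, "great": 1, "love": 1, "sucks": -1, "hate": -1, "slow": -1}

def sentiment(post):
    """Sum the polarity of known words: above 0 positive, below 0 negative, else neutral."""
    words = [w.strip(".,!?") for w in post.lower().split()]
    score = sum(LEXICON.get(w, 0) for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

for post in [
    "The service of the post sucks so much",
    "Read nice blog post",
    "Guess what was in my post box today?",
]:
    print(sentiment(post), "-", post)
```

A word-counting approach like this cannot handle irony, sarcasm or dialect, which is one reason why sentiment figures from monitoring tools should be read with care.
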
  116. 116. Final thoughts • Again, it’s not that much about the technology you use, but about how you address the topic. • After a detailed analysis of how social media talks about your brand, write down a plan for you and your staff, including how to respond to: direct questions, positive comments, negative comments, shitstorms, incorrect information, feedback. • Respond • Reply • Reach out • Retweet. Slide: Courtesy of Dr. Thomas Walter, HSG alumnus (currently Namics AG)
  117. 117. Improved User Productivity
  118. 118. Topic Analyst® Companion for Word and Outlook
  119. 119. 3M 360 Encompass System
  120. 120. 3M 360 Encompass System: The 3M 360 Encompass System integrates • computer-assisted coding (CAC), • clinical documentation improvement (CDI), • concurrent quality metrics and analytics into one application to • capture, • analyze • and advance patient information across the care continuum
  121. 121. Operation Excellence
  122. 122. Text Analytics at Swisscom: Advantages. Until now: 1. Analysis only once a month 2. Time-consuming and manual process 3. Analyzed for a single reference date 4. Error-proneness and subjectivity 5. Only rough analyses possible. Using Text Analytics: • Daily evaluation • Fast automated process • Representative analysis • Standardized reports with clear criteria • Early detection of problems & trends • Detailed analysis of problem clusters
  123. 123. Text Analytics at Swisscom: Outlook. The introduction of the solution to reduce the number of tasks done by hand was just the beginning. More ideas are ready to be brought to life. Yesterday • Analysis by hand. Today • Automated analysis. And tomorrow? Further developments • Speech to text • Real-time analysis • Competitive intelligence • CRM – enrichment of customer profiles
  124. 124. Risk and Fraud Management (our guest lecture today)
  125. 125. More technical basics
  126. 126. Text mining steps: Recapitulation of online idea contest example. Pre-processing • Tokenization • Stemming • Part-of-speech tagging • Stop-word removal • Term Document Matrix. Text Mining (novelty) • Frequency analysis • Categorization • Language recognition • Clustering • Sentiment Analysis. Visualization • Tag Clouds • Tree Maps • Association graphs • Theme River • Coloring
  127. 127. 40 First step: text acquisition, e.g. social data. Gnip - The World's Largest and Most Trusted Provider of Social Data
  128. 128. 41 Excursus: Role of text data in «Big Data»?
  129. 129. 42 Further analytics step: Insights
  130. 130. Linguistic techniques in disambiguation of texts
  131. 131. 46 Linguistic techniques in disambiguation of texts: Qualified vocabularies, lexicons and synonyms • Are there recognized dictionaries of accepted terms? • Are there alternate spellings for a given word? • Is there a thesaurus of synonyms? Example: “Drill”?
  132. 132. 47 Linguistic techniques in disambiguation of texts: Grammar and syntax • Each language has its own system, structure and rules of grammar. • The syntactic relationship of words to their surroundings must be deciphered and interpreted. • Even simple word order changes may significantly change the meaning of a sentence.
  133. 133. 48 Linguistic techniques in disambiguation of texts: Standardization. What provision is there to resolve known variations of a term to a single, agreed “core” value or to recognized hierarchies?
  134. 134. 49 Linguistic techniques in disambiguation of texts: Character and tone • What is the underlying intended pitch or inflection of the statement? • What intonation and attitude are being conveyed?
  135. 135. 50 Linguistic techniques in disambiguation of texts: Literal and figurative language • Is the phrase precise and direct in its meaning, or is there some form of indirect connotation?
  136. 136. Tools for Social Media Monitoring & Text Mining
  137. 137. Tools: Exemplary enterprise solution: Adobe Social
  138. 138. Tools: There are many, many more. Try them! Text mining within a digital marketing system: • E.g. Adobe Social http://www.adobe.com/ch_de/products/social.html Special software/projects: • GATE http://gate.ac.uk/ • Rapid Miner http://rapid-i.com/content/view/181/190/ • QDA Miner http://provalisresearch.com/ • Megaputer http://www.megaputer.com/site/index.php • List of tools http://www.kdnuggets.com/software/text.html Web tools with text/data mining analysis (short list): • Weblyzard http://www.weblyzard.com/ • Brandwatch http://www.brandwatch.com/ • Sysomos http://www.sysomos.com/ • Sentiment140 http://www.sentiment140.com/ • Sentiment viz http://www.csc.ncsu.edu/faculty/healey/tweet_viz/tweet_app/ • Wordle http://www.wordle.net/ Slide: Courtesy of Dr. Thomas Walter, HSG alumnus (currently Namics AG)
  139. 139. Guest Lecture K. Schwenke, KPMG, Technical Forensics (see separate slideset on StudyNet)
  140. 140. GROUPWORK Procedure & Tasks
  141. 141. Threefold task for each workgroup 2 1 - PREPARATION - questions/tasks see following slides - Submit 1-2 p. to andrea.back@unisg.ch 2 - ON-SITE AT EXCURSION - topic ad-hoc defined by host @ excursion - «submission» is your oral participation 3 - FOLLOW UP CREDIT-TASK - topic and format defined @ excursion - submit/email (ppt, doc, …) to ABack only
  142. 142. Recap: Planning your time (3 credits – 90 hours) Our estimate (columns: Task, Estimated hrs per person, Actual hrs) • 6/7/8/9 – Enterprise Application Visit (I. of 4): 13 ** • 6/7/8/9 – Enterprise Application Visit (II. of 4): 13 ** • 10 – Guest lecture Audi AG, From TA-Idea to Adoption, Guest lecture SwissRe, Course wrap up and final steps: 6 • Groupwork completion and finalizing excursion reports: 24 (in other words, 3 days) 3 ** about 4-6 hrs for the three preparatory questions
  143. 143. 1 – Preparation: What to submit To document your preparatory work, please submit a max. 1-2 page text document describing: • HOW you found your results • WHICH of your findings you found most interesting • GLOSSARY TERMS that you suggest to add, including your short definition 4
  144. 144. 2 – On-site at excursion – bring your notebook Every host was asked to include a 45-60 min. interactive part, in which (especially) the groups make a contribution. Expect designs like these: • Discussion groups with presenting • using a flipchart • using moderation cards • using a mind-mapping tool on your notebook • … • Interactive Q&A session – blocked or intermittent • … but be open, since this is up to the host and beyond my control 5
  145. 145. 3 – Follow-up CREDIT-TASK To document your follow-up work, please submit your documentation in a format that suits the defined topic and what was agreed upon at the excursion. • TIME-TO-INVEST: As a guideline, use the budgeted time per person that I gave in Unit 1 • WHAT THE HOST GETS: You will submit your work to ABack only. She will decide whether the work will be forwarded to the host company. If she forwards it to the host, all authors will be included in mail-CC, so you get notified about it. • SUITABLE FORMAT: The format should suit the topic as well as the style/wishes/culture of the host. Some might like PPT slides with annotations, others might like a text document. Even Prezis or videos can be an option. Make your choice. 6
  146. 146. Both Google groups, each please prepare 7 1. What companies does the newly formed Alphabet Holding comprise? What are their missions? 2. What improvement did the acquisition of Metaweb bring to Google? And how is it related to text analytics/analysis? Here is a good video for that 3. What terms (and short explanations) do you suggest to add to our Glossary? 14:15 – 18:00 Google Office, Zürich, Brandschenke-Str. 110 Patrick Warnking, Yves Brunschwiler, Philipp Probst
  147. 147. I - Google Search: «Define xxtermxx» Assessment 8 1. Goal: The Google search function «Define xxtermxx» will be tested and assessed. The characteristics of the results will be described, and the quality of the results will be evaluated against criteria chosen by the group. 2. All glossary terms (the initial ones and those suggested in the preparatory work until the end of Nov.) will be «google-define-searched». 3. A glossary for use in the next course will be curated; the student group defines the format (it need not be a *.doc or *.pdf). 4. Members: HSG and Guest Students
  148. 148. II – Brand Competitive Analysis with Google Tools A Comparison of CEMS Partner Universities 9 1. Brand competitive analysis of CEMS universities: University of St Gallen compared with the following 6 universities: Copenhagen Business School (CBS), ESADE, HEC Paris, London School of Economics (LSE), Rotterdam School of Management (RSM) and Bocconi. 2. Analysis using the tools presented at the Google excursion, Nov. 9th, by Philipp Probst <pprobst@google.com> 3. Members: HSG and Guest Students
  149. 149. Both Swisscom groups, each please prepare 10 1. Find out what API economy means. Is it related to text analysis technology, methodology or business? 2. What questions would you ask Swisscom to find out whether they provide API services and what their plans are to identify and use API economy opportunities (for their own data)? 3. What terms (and short explanations) do you suggest to add to our Glossary? 14:15 – 18:00 Swisscom Office, 8005 Zürich, Pfingstweidstrasse 51, room 0.05 Dr. Falk Kohlmann, Lukas Peter, Kay Lummitsch, (N.N. Squirro)
  150. 150. 11 Swisscom Team I Everybody is talking about "Digital Transformation" (DT). Find out how DT relates to APIs and which API products Swisscom should offer to its Swiss customers & partners to make them & Swisscom successful in the digital world. (Questions, if needed, to Kay Lummitsch)
  151. 151. 12 Swisscom Team II Financial Industry. Digitalization is transforming the financial industry. With the e-foresight Think Tank, we at Swisscom identify new trends, analyze their impact on Swiss Retail Banking and accompany our customer banks on this path with high-class research. Find out how we can use a Big Data / Text Analysis Tool, such as Squirro, to provide new products/offerings regarding trend research for our customer banks. (Questions, if needed, to Dr. Falk Kohlmann)
  152. 152. Both AXA groups, each please prepare 13 1. AXA uses teams in India for its CI research purposes. Find out several companies in India (or other overseas countries) that offer such services. How do these services differ? 2. AXA is present in several social media platforms. Find out in which and what AXA does there (last 6 months) 3. What terms (and short explanations) do you suggest to add to our Glossary? 14:00 – 18:00 AXA Office, Winterthur, Superblock, Pionierstrasse 3 Dr. Florian Hamel, Gaetano Mecenero, Lukas Wille, Andreas Wendt
  153. 153. 14 Both teams same task, but different firms Your preparatory task was: “AXA is present in several social media platforms. Find out in which and what AXA does there (last 6 months)”. Now expand your analyses by including other insurance companies’ social media activities on: LinkedIn, Xing – Facebook, Google+ – YouTube – Instagram – Pinterest. Your comparative study with AXA should include quantitative as well as qualitative assessments. • Group 1: Insurance firms USA: Progressive – Geico – Prudential • Group 2: Insurance firms CH: Mobiliar – Zürich – Allianz The two groups should agree on and use common analysis criteria so that the results can be combined, but write your discussion/conclusions independently. PS: You may explore the online text mining tool http://www.minemytext.com and decide whether it adds insights to your analysis (voluntary).
  154. 154. SIKA group, please prepare 15 1. Find out what “Predictive Analytics” means. Is it related to text analysis technology, methods or business? If you think yes, please elaborate. 2. What questions would you ask Sika to find out whether they use Predictive Analytics, and what their plans are to identify and use Predictive Analytics opportunities? 3. What terms (and short explanations) do you suggest to add to our Glossary? 14:15 – 18:00 Sika Office, Zürich, Tüffenwies 16 Christian Frey, Jacqueline Vo
  155. 155. 16 SIKA-Task: Visualizations in Market Monitoring What would be the added value of a) visualizations and b) a continuous monitoring mode for typical Sika market intelligence reports? As an example, use the Euroconstruct Summary Report. • Consider what is visually possible, especially with software solutions like CID, Squirro and Mapegy. • Present your ideas as a kind of “mock-up” dashboard, so that your presentation itself is visual in style.
  156. 156. 7,814 Enterprise Applications of Text Intelligence Autumn Term ‘15 – Prof. Dr. Andrea Back, IWI-HSG
  157. 157. 10 Adoption
  158. 158. Agenda 14:15 – 14:30 – Course Wrap Up • Welcome Bruno Zanvit • Credits and Grading: Written Exam – Bonus Points/Prep Work – Credit Group Task • Evaluation Forms emailed to You 14:30 – 15:30 – Guest Lecture Dr. S. Scholer, Strategic Corporate Planning (Strategische Unternehmensplanung), Audi Text Mining to Support Strategy Work 15:30 – 16:00; – 16:15 – Break Activities; Project venue CID • Break, including filling in Evaluation Forms • Implementation of a Project with Topic Analyst 16:15 – 17:15 – Guest Lecture Chr. Smiela, Life & Health Information Architecture & Integration, Swiss Reinsurance Company Text Analytics in Underwriting – Use Cases and Next Gen Platform 17:15 – 17:45 – Last but not Least & Goodbye 2
  159. 159. Course Wrap Up
  160. 160. Q&As about - Bruno Zanvit - Credits & Grading Questions - Course Evaluation 4
  161. 161. Guest Lecture Dr. S. Scholer Audi (see separate slideset on StudyNet)
  162. 162. Guest Lecture Chr. Smiela Swiss Re (see separate slideset on StudyNet)
  163. 163. 7 Implementation of a project with Topic Analyst® Overview of the main steps December 2015
  164. 164. 8 Definition of use cases and scope (1/2) CID provides a checklist to the customer to define and prioritize use cases and the scope.
  165. 165. 9 Definition of use cases and scope (2/2)
  166. 166. 10 Organizational aspects and communication * Topic Analyst power users ** Recipients of push-alerts, dashboards, newsletters etc. provided by the analyst team
  167. 167. 11 Project implementation plan 1. Order and scope validation (customer) 2. Configuration of sources (crawlers, importers) (CID) 3. Adaptation of the Knowledge Base (CID) 4. Initial crawling (CID) 5. Tuning (CID) 6. First milestone presentation to power users (both) • Match with the major use cases (both) 7. Fine-tuning (both) 8. Initial dashboard preparation for end-users & training (CID/customer) 9. Jour fixe to discuss questions, tool features, new use cases etc. (both)
  168. 168. 12 Stephanie Paulutt CID GmbH Struthweg 1 D-63594 Hasselroth Germany www.cid.com Tel.: +49 6051-8846-111 E.: s.paulutt@cid.de twitter: @CIDGermany http://cid.com/products/topic-analyst-ecosystem
  169. 169. Last but not Least
  170. 170. What else? - CID: Project steps with customers - MA-theses - Getting hired: Christian Lohri, Swiss Re: Video Self-Interview (German) 14
  171. 171. Merry Christmas & Happy New Year 15
