Text Mining Summit 2009 V4 (For General Presentations)

Presentation made by R. Scott Evans at the 2009 TA Summit - contains slide notes

  • Modified version of the presentation at the Text Analytics Summit, June 2009, Boston, MA. Original presented by R. Scott Evans, PhD, VP, Harris Interactive.
  • Text analytics provides a much-needed bridge between the richness and prescriptive qualities of qualitative methodologies and the more rigorous validation and predictive extrapolations derived from quantitative methodologies. By its very nature, the thematic structure generated by text analytics permits the integration and synthesis of insights from disparate data sources.
  • Text analytics provides a much-needed validation of quantitative analysis and can be used to help understand what respondents mean when they provide a scaled response. Additionally, it provides a more contextualized analysis, where emerging themes are less driven by a priori research categories and concepts. In this context, classification can be more sophisticated and a less taxing method of building consistent thematic structures than traditional qualitative approaches. Catalogued text can be easily organized in hierarchies and interlocking relationships, which more accurately reflect how respondents organize their conceptual world. Sentiment can be associated with thematically organized data and provide insight into dispositions that influence stated or observed behavior.
  • Successful application of text analytics is not “plug and play”. It requires a front-loaded commitment to designing thematic structures that address business issues, while ensuring that themes are consistent with what is found in the data. This requires a combination of rules-based learning and linguistically based machine learning. Because the ontologies derived from building a suitable thematic framework and associated training set can be repurposed, replicating or updating the analysis with new data is efficient and treats the data consistently.
  • The three general types of measures resulting from text analytics are themes, counts within themes, and average sentiment scores within themes. The key to successful analysis is to develop a thematic structure at the bottom nodes that is sufficiently rich to provide a detailed understanding of market discourse, while keeping a hierarchical structure that permits natural roll-ups at higher levels. In other words, having parent, children and grandchildren nodes simplifies generalization. The children and grandchildren nodes provide a rich description of the nodes above them, so reporting can take place for parent nodes while the analysis incorporates the insights derived from the sub-nodes.
  • The original case study was a mix of online and telephone interviews. The structured questions were extensive and the original design relied on quantitative measures for understanding account relationships. The open comments were to provide an anecdotal backdrop to support or add color to the quantitative findings. The original battery of performance questions forms a single factor in Principal Components analysis. This means that distinguishing the importance of one question over another is extremely difficult. There is little variation in how respondents answer individual performance questions, with all questions skewed to the high end of the scale. Structured Data Problem: The singularity of the data makes it very difficult to identify salient performance issues and evaluate impact on likelihood to repurchase or recommend. Text Analytics: Provides an alternative way of assessing relative salience of performance by aligning themes with structured data and examining volume of comments and sentiment associated with themes that align with performance questions. This method also provides an independent alternative to the quantitative analysis by looking at sentence volume and average sentiment associated with themes derived independently from the textual data.
  • These were the issues that prompted focus on this particular data source.
  • Validation Needed: The reason why secondary validation of quantitative findings is so important is demonstrated in the above coefficient graph. Here we see some of the dangers of simplistic quantitative analysis, as well as the potential risks of using more advanced techniques. The tight clustering and size of the simple correlation coefficients in blue suggest strong relationships across all variables. They have the added quality of being relatively consistent year over year. Conversely, simple correlations using optimal scaled data (to reduce multicollinearity and potential spuriousness), and to a greater extent categorical regression using optimal scaled data, show greater differentiation in the relative importance of each variable. Even though the scaled correlations seem to show a select few variables that are both stable and high value, their bivariate nature makes it impossible to rule out spuriousness in such a multicollinear environment. While the scaled categorical regression handles the problem of spuriousness, the increased volatility in year over year comparisons renders its conclusion somewhat suspect. The results of the quantitative analysis create an important interpretive dilemma. On the one hand, the more advanced techniques offer a more effective method of identifying the most important factors contributing to account development. On the other, the inconsistency in the year over year results suggests that a more advanced model may be highly volatile and may not be accurately assessing the relative importance of each variable influencing customer loyalty. While such volatility could be a characteristic of the population, it is unlikely given that the accounts this study represents involve key client relationships where instability should be the exception.
  • A careful comparison of the thematic structure created using text analytics with previous attempts at analysis demonstrates the more methodical and exhaustive approach offered by a text analytics platform. In the prior analysis, a theme was identified as non-significant or redundant when it had either few assigned comments or few characteristics distinguishing it from another theme.
  • This slide is a graphical depiction of the redundancy and gaps in prior attempts at building a basic thematic structure, compared with the 65 nodes created by the text analytics methodology. The yellow cells represent themes from prior analysis, where redundancies are captured in the repeated yellow cells in columns to the right.
  • With 65 nodes there is a need to synthesize the essential story for business leaders. The pyramid depicts the method whereby the more general themes are distilled and presented, followed by subsequent inclusion of more detail.
  • While the rules incorporated in this example suggest inordinate complexity, the platform used in this case study makes building these rules relatively simple. The tool interface offers immediate feedback on word frequencies, different ways of interrogating word patterns and context, and drag and drop functionality, so that repetitive typing of chosen word groups is not required.
  • Heat maps offer a simple way to summarize the results of the text analytics. The dimensions of volume and sentiment can be easily organized into cells that suggest particular tactical or strategic considerations.
  • View in slide show mode. This will enable the hyperlinks embedded in the heat map.
  • Co-occurrences are instances where a statement includes more than one theme. This pattern indicates some fundamental relationship between the themes. The relationship can be as simple as concomitant “top of mind” or a more sophisticated interweaving of multiple themes that together express something more than is possible with a single theme. Co-occurrences are critical to understanding broad concepts like value and quality. The extent to which a theme is intermingled indicates the complexity and richness of the concept and the role that it plays in the articulation of customer experience. Identifying the linkages between themes is instrumental for the formulation of effective messaging and positioning. Such linkages are the points of resonance that can amplify or reinforce the relationship between a product position and the way in which the audience experiences the problem or context in which that product is relevant. In this study, value is used in an exclusive or discrete manner. Conversely, quality is more inclusive and is often intermingled with themes that cross domains. For example, when value is coterminous with software or pricing there is minimal overlap with other themes. Even when there are overlaps, they seem to enhance specificity. For consulting value the linkage is with offerings. For hardware value it is with servers. For contract value it is with “ease of doing business – flexibility and responsiveness”. Continued (1 of 2)
  • Continued (2 of 2) Unlike the concept of value, quality plays a more holistic integrative role. Account management quality co-occurs with solution quality and service and technical support quality. Hardware and software quality demonstrate strong interconnectedness and additional connections with solution quality and support quality. The exceptions to this tendency towards quality inclusivity are where software quality shows co-occurrence with software improvement and where hardware quality co-occurs with servers. A marketing and positioning strategy can use these insights by building a word web and thematic bridges around its messaging or sales track. The reporting environment within the text analytics platform makes such a task both feasible and effective. Drilling down into themes and examining the actual way in which participants articulate quality or value becomes a fecund environment for messaging content. Moreover, understanding the inclusivity or exclusivity of a concept is important when determining the most appropriate symbols to integrate into messaging content. For example, if quality is the primary focus of the message then broad inclusive symbols that encapsulate the corporate portfolio may be more effective than focusing on a single area. Conversely, focusing on value would be better served with crisp messages that target a particular product or service.
  • The themes associated with text can be linked to structured data. In the following examples there are structured variables for which it is important to view specific heat maps for each category or for particular parts of a scaled response. The next two heat maps cover participants who responded at the extreme ends of a scale asking how likely they would be to repurchase the products or services offered by the vendor.
  • Placement in a cell is based on the rate of change from year one to year two.
  • The themes associated with text can be linked to structured data. In the following examples there are structured variables for which it is important to view specific heat maps for each category or for particular parts of a scaled response.
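
The three measures named in the notes (themes, counts within themes, and average sentiment within themes) and the roll-up from child nodes to parent nodes can be sketched in a few lines of Python. This is a minimal illustration rather than the method of the platform used in the case study: the sentence records, theme paths and sentiment scores are invented, and the " / " path separator is an assumption.

```python
from collections import defaultdict

# Hypothetical classified sentences: (theme path, sentiment score).
# Paths, scores and the " / " separator are illustrative assumptions.
SENTENCES = [
    ("Hardware / Quality & Reliability", 0.8),
    ("Hardware / Quality & Reliability", 0.6),
    ("Hardware / Server", 0.2),
    ("Software / Improvements", -0.7),
    ("Software / Value", -0.1),
]


def roll_up(sentences, sep=" / "):
    """Return {node: (count, mean sentiment)} for every node and ancestor."""
    totals = defaultdict(lambda: [0, 0.0])  # node -> [count, sentiment sum]
    for path, score in sentences:
        parts = path.split(sep)
        # Credit the sentence to the leaf node and to each ancestor node.
        for depth in range(1, len(parts) + 1):
            node = sep.join(parts[:depth])
            totals[node][0] += 1
            totals[node][1] += score
    return {node: (n, total / n) for node, (n, total) in totals.items()}


summary = roll_up(SENTENCES)
for node in sorted(summary):
    count, mean = summary[node]
    print(f"{node}: count={count}, mean sentiment={mean:+.2f}")
```

Because every sentence is credited to its leaf and to each ancestor, a parent node can be reported on its own while the child rows remain available for drill-down.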
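
The co-occurrence idea in the notes (a statement that carries more than one theme) can be illustrated with a simple pair count. The sketch below is an assumption-laden example, not the platform's implementation: the tagged statements are invented, and counting the distinct partner themes is only one possible way to express how "inclusive" or "exclusive" a theme is.

```python
from collections import Counter
from itertools import combinations

# Hypothetical statements, each tagged with the set of themes it mentions.
STATEMENTS = [
    {"Hardware - Quality", "Software - Quality", "Solutions - Quality"},
    {"Hardware - Quality", "Hardware - Server"},
    {"Hardware - Value", "Hardware - Server"},
    {"Software - Value"},
    {"Hardware - Quality", "Software - Quality"},
]


def co_occurrences(statements):
    """Count how often each unordered pair of themes shares a statement."""
    pairs = Counter()
    for themes in statements:
        # Sorting gives every pair one canonical (a, b) ordering.
        for a, b in combinations(sorted(themes), 2):
            pairs[(a, b)] += 1
    return pairs


def distinct_partners(pairs, theme):
    """Number of distinct themes that co-occur with `theme` at least once."""
    return len({a if b == theme else b for a, b in pairs if theme in (a, b)})


pairs = co_occurrences(STATEMENTS)
for (a, b), n in pairs.most_common():
    print(f"{a} + {b}: {n}")
```

In this toy data a quality theme links to several other themes while a value theme links to one or none, which mirrors the inclusive versus exclusive contrast described in the notes.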

    1. Harris Interactive Text Analytics: Strategic Insight and Text Analytics
    2. Harris Interactive History • Harris Poll – longest running US proprietary poll (since 1963) • Leading pioneer for online panels and online surveys in 1990s • 2005 – ranked 13th largest market research firm worldwide (ESOMAR)
    3. Qualitative and Quantitative Streams • Actionable prescription needs rich detail • Projectable generalizations require methodical validation and significance testing.
    4. Why Text Analytics is becoming Critical • Provides a much-needed validation of quantitative analysis and can be used to help understand what respondents mean when they provide a scaled response • Offers a more contextualized analysis where emerging themes are less driven by a priori research categories and concepts • Classification can be more sophisticated and a less taxing method of building consistent thematic structures • Catalogued text can be easily organized in hierarchies and interlocking relationships, which more accurately reflect how respondents organize their conceptual world • Sentiment can be associated with thematically organized data and provide insight into dispositions that influence stated or observed behavior
    5. Scalability and Replication • Text analytics software offers lower costs for handling large volumes of content rich data • Natural language processing, and domain specific training sets give reliable replication and updating of analysis • Iterative testing of thematic structures ensures systematic validation of themes and their analytical relevance.
    6. Text Analytics Output • Classification: – Thematic Structure – Links to detailed comments • Volume: – Count for Documents, Sentences and Authors • Sentiment: – Positive and Negative Content
    7. Case Study: B2B Survey with Top Global Accounts
    8. Context • Data Source – ~5,000 B2B interviews annually (multiple interviews per account) – ~120 structured questions & ~5 open comments per interview • Prior analysis: – Main exploration and analysis dependent on structured variables – Statistical approach limited to simple correlations and comparison of average scores – Use comments as anecdotes to support a priori hypotheses about account issues
    9. Key Issues • Structured variables not providing actionable insights • Unstructured comments too difficult to theme effectively
    10. Issues with Structured Variables: Coefficient Graph • Comparing Multivariate and Bivariate Analysis of Drivers of Loyalty [Scatter plot: Coefficients for 2007 against Coefficients for 2008, both axes running from -0.3 to 0.8; series: ZeroOrder CORR, CATREG Optimal Scale, CR Scaled Zero Order Corr]
    11. Issue of Thematic Complexity • No consistent story emerged in original analysis, despite having fewer themes • Comments were presented anecdotally and loosely mapped to a priori groups of structured questions • Table (From Original Harris Reports | Classification): Number of themes identified 44 | 65 • Number of insignificant or redundant themes 16 | 0 • Number of unique themes 28 | 65
    12. Overcoming Classification Redundancy • Legend: Text Analytics Child Themes; Unique and Redundant Themes from Original Analysis • Rows: Text Analytics Child Category | themes from the original report (Sections 1 to 4) in the same row
        Account Mgmt - Communication
        Account Mgmt - Knowledge & Expertise | A/BR: Account Teams Viewed As Trusted Advisors | Adding Strategic Value | AM: Acct Mgr Adds Strategic Value
        Account Mgmt - Local
        Account Mgmt - Responsiveness
        Account Mgmt - Quality
        Account Mgmt - Support | RV: Empower Acct Mgr / Acct Team Support
        Account Mgmt - Problem Resolution
        Consulting Srvc - Value
        Consulting Srvc - Offerings
        Consulting Srvc - Knowledge & Expertise
        Contracts - Post-Sale
        Contracts - Commitment
        Contracts - Quality
        Contracts - Value
        Fulfillment - Commitment
        Fulfillment - Speed
        Hardware - Computers / Laptops
        Hardware - Improvements
        Hardware - Innovation
        Hardware - Printers
        Hardware - Quality & Reliability | PQR: Hardware Quality / Reliability
        Hardware - Server
        Hardware - Value
        Hardware - OS
        Hardware - Virtualization
        Relationship Value - Executive Engagement | RV: Add Value / More Executive Engagement | ABR: Effective Executive Engagement
        Relationship Value - Local
        Relationship Value - Partnering
        Relationship Value - Communicate
        Relationship Value - Long-term View
        Service & Technical Support - Flexibility & Responsiveness | S&S: Efficiency & Accessibility | S&S: Prompt
        Service & Technical Support - Quality
        Service & Technical Support - Proactive
        Service & Technical Support - Local
        Service & Technical Support - Price Matching
        Software - Capabilities
        Software - Quality & Reliability
        Software - Improvements
        Software - Value
        Solutions - Quality & Reliability | PQR: Software Quality / Solutions
        Solutions - Value
        Solutions - Comprehensive
        Solutions - Innovation
        Solutions - Adaptive
        Solutions - Price
        Gen - Pricing | Improve TCO / Reduce Prices | Helping Manage TCO
        Gen - Value | PCV: Solutions Offer Compelling ROI | PCV: Services Deliver Measurable Value
        Integrated Solutions - Hardware
        Integrated Solutions - Software
        Integrated Solutions - Hardware & Software
        Integrated Solutions - General | IS: Integrated Solutions Reliability | IS: Deliver Promised Benefits | IS: More Integrated Solutions | IS: Seamless Integration of Solutions
        Ease of Doing Business - Fewer Silos / Less Bureaucracy | EODB: Fewer Silos / Less Bureaucracy | EODB: Act Cohesively | EODB: Global Consistency
        Ease of Doing Business - Ownership / Follow-through | EODB: Ownership / Follow-through | RV: Proactively Propose Solutions
        Ease of Doing Business - Flexibility & Responsiveness | EODB: Flexibility and Responsiveness
        Ease of Doing Business - Simplify Processes (Pricing, Quoting, Ordering, Invoicing, Contracts, Terms) | EODB: Simplify Processes (Pricing, Quoting, Ordering, Invoicing) | EODB: T&Cs; Quote TAT; Invoice Clarity | EODB: Contract Process
        Understanding Business Needs | UBN: Business Understanding | UBN: Ability to Understand Critical Success Factors / Priorities
        Customer Communications & Education - Provide Roadmaps | Communications: Roadmaps Enable Planning
        Customer Communications & Education - Provide Training, Seminars, Education
        Customer Communications & Education - General | Communications: Effective Communications
        Depth & Breadth of Technology Portfolio | TPK: Breadth & Depth of Technology Portfolio
        Technology Experience & Expertise | TPK: Knowledge, Expertise & Experience
        Brand, Presence & Credibility | TPP: Brand, Presence and Credibility
        Global Coverage - Account Mgmt
        Global Coverage - Solutions | GC: Address Global Needs | GC: Global Capabilities
        Global Coverage - Support & Technical Service
    13. Analysis Plan to Address Issues • Big Picture • Use Heat Maps • Identify Leading Themes • Linkages between Themes • Co-occurrence • Narrative Structure of Themes • CHAID • Statistical Validation
    14. Step One: Effective Classification • Creating a thematic structure from case study data is first step
    15. Key to Success: Detailed Classification • Classification Tree: Case Study Example • Nodes: Understand Specific Business Needs; Brand Credibility; Ease of Doing Business; Breadth of Technology Portfolio; Experience and Expertise; Integrated Solutions • See Training Set Example
    16. Rules for Training Sets: Integrated Solutions - Breakdown
    17. Rules for Training Sets: Ease of Doing Business - Breakdown
    18. Step Two: Heat Maps on Volume and Sentiment • Building a Market Story from a Thematic Structure
    19. Why Heat Maps? • Heat maps provide a simple method of identifying actions that are directly related to the thematic structure of customer discourse. • Text analytics provides a method of measuring both the current state of sentiment and volume of mentions. • The thematic integrity of the classification scheme permits year over year analysis of changes in sentiment and volume. • Combining Current (e.g., 2008) and Emerging (i.e., year over year changes) patterns suggests ways to address marketing, brand and product positioning. [Two heat map grids, Current and Emerging (note: year over year change), each plotting Sentiment (Low, Med, Hi) against Volume (Low, Med, Hi)]
    20. How to interpret current state Heat Maps • Hi sentiment, Low volume: Latent Core Themes, requires top of mind uplift • Hi sentiment, Hi volume: Brand/Position Attractors, Leverage – continue to build to maintain market position • Low sentiment, Low volume: Brand Appendages, requires top of mind and sentiment uplift • Low sentiment, Hi volume: Brand/Position Detractors – need to address negative spin
    21. Heat Map Criteria • Allocation of Themes (S = sentiment, V = volume): Hi sentiment row S>.85 with V<.15, V~.15-.85, V>.85; Med sentiment row S~.15-.85 with V<.15, V~.15-.85, V>.85; Low sentiment row S<.15 with V<.15, V~.15-.85, V>.85 • Box Names: Hi sentiment row 1, 2, 3; Med sentiment row 4, 5, 6; Low sentiment row 7, 8, 9 (columns run Low, Med, Hi volume) • Criteria for theme assignment use the 15th and 85th percentiles as parameters.
    22. Interpretation: Emerging Problems • Emerging Pattern (E) – Sentiment (Low) means large downward change in average – Volume (Hi) means large increase in number of mentions • Current State (C) – Sentiment (Med) means mix of positive and negative dispositions – Volume (Med) average number of mentions • Interpretation – What are currently moderate issues may become critical issues by next year if trend continues – There is a frequent top of mind mention that requires remedial action not messaging [Example heat map with C and E markers: Sentiment (Low, Med, Hi) by Volume (Low, Med, Hi)]
    23. Interpretation: New Positioning • Develop New Positioning • Emerging Pattern (E) – Sentiment (Hi) means large upward change in average – Volume (Hi) means large increase in number of mentions • Current State (C) – Sentiment (Med-Hi) means mix of positive and negative dispositions – Volume (Med-Hi) above average number of mentions • Interpretation – Upward trends in cells 1, 5 & 6 mean new positive themes are emerging that tactical messaging can reinforce and connect to current brand anchors [Example heat map with C and E markers and cells 1, 5 and 6 labelled: Sentiment (Low, Med, Hi) by Volume (Low, Med, Hi)]
    24. Interpretation: Diminishing Relevance • Emerging Pattern (E) – Sentiment (Lo) means large downward change in average – Volume (Lo) means large decrease in number of mentions • Current State (C) – Sentiment (Hi) highly positive dispositions – Volume (Med) average number of mentions • Interpretation – Downward trend in both average sentiment and number of mentions, which means that while the positive disposition is eroding, the relevance of the theme may become less significant over time – Shoring up this characteristic may have a low ROI, look for positive changes in themes from 1, 4, 5 & 6. [Example heat map with C and E markers and cells 1, 4, 5 and 6 labelled: Sentiment (Low, Med, Hi) by Volume (Low, Med, Hi)]
    25. Heat Maps: Case Study Story
    26. Total Market: Hyperlink Heat Map • Current State 2008 • Big Picture • Safe and reliable but stodgy and not effectively improving products and services • Has effective account management which represents value but can be seen as maintaining the status quo • Strategic partnering is hampered by a service orientation that does not effectively communicate an improvement-consultative model [Heat map: cells 1 to 9, Sentiment (Low, Med, Hi) by Volume (Low, Med, Hi)]
    27. Current Overall Client Experience • Strengths: • Brand – broad portfolio & consistent credibility • Strong account management – Responsive, brings value & expertise • Delivering quality and reliability – Hardware and software – Solutions – Service and Technical Support • Threats: • Weak value proposition – Software, Consulting & Contracts • Failure to improve through innovation and product evolution – Lagging developments in software and hardware – Weak business follow-through – Problems communicating visionary roadmaps and functionality
    28. Understanding Interleafing Strengths • Quality and Reliability: • Co-occurrence Analysis – Quality for solutions, support, SW and HW are integrally linked – Product quality is reinforced by strong account management – Product quality has concrete underpinnings e.g., HW quality shows links to OS issues, which in turn are tied to servers and emerging virtualization issues • Account Management: • Co-occurrence Analysis – Account responsiveness is a function of communication and problem resolution – Account management quality is tied to support and responsiveness – Account capabilities has connections to local account support and service support
    29. Buffered by Account Management • Current State 2008 • Account Mgt Themes: Responsiveness; Quality; Knowledge & Expertise; Local Relationship Value; Support; Communication; Local; Problem Resolution [Heat map: cells 1 to 9, Sentiment (Low, Med, Hi) by Volume (Low, Med, Hi)]
    30. Understanding Interleafing Threats • Building a Sense of Value: • Co-occurrences – Value disassociated from quality and reliability – More narrowly focused or linked to specific products and services – HW value co-occurs with SW value suggesting mutual reinforcement – HW value aligns with discussion of servers and HW improvements – Positive server position does not offset poor position of HW improvement, which undermines HW value • Ongoing Improvement: • Co-occurrences – Disconnect between innovation and improvement in SW – Co-occurrence with integrated solutions and SW capabilities is not improving view of SW improvement – Both improvement and innovation dimensions are poorly positioned – HW improvement is linked to a highly commoditized technology (i.e., printers) limiting uplift
    31. Stunted Improvement/Innovation Issues • Current State 2008 • Themes: HW Innovation; Solution Innovation; SW Improvement; HW Improvement [Heat map: cells 1 to 9, Sentiment (Low, Med, Hi) by Volume (Low, Med, Hi)]
    32. Structured Data: Repurchase • Likelihood to repurchase – scaled response
    33. Repurchase: Top Box and Bottom Box • New Insights • High volume positive themes of top box clients (similar to general market) – Very positive reaction to account management: responsiveness, support, quality, expertise – Very positive perceptions of quality and reliability • High volume negative themes of bottom box clients (very different pain points) – Lack simplified processes and follow-through (EOB) – Reactive service model – Lack executive engagement and consultative value – Fail to emphasize improvement
    34. Hyperlink Heat Map: Repurchase Top Box • Current State 2008 [Heat map: Sentiment (Low, Med, Hi) by Volume (Low, Med, Hi)]
    35. Hyperlink Heat Map: Repurchase Bottom Box • Current State 2008 [Heat map: Sentiment (Low, Med, Hi) by Volume (Low, Med, Hi)]
    36. Identifying Emerging Trends • Year over year changes in volume and sentiment
    37. Themes with Positive Uplift in Volume and Sentiment • Large increases in volume and positive sentiment – Brand presence and global solutions – SW quality – HW virtualization • Large increases in volume, but large decreases in positive sentiment – Integrated Solutions - Software – Ease of Doing Business - Simplify Processes (Pricing, Quoting, Ordering, Invoicing, Contracts, Terms) [Diagram labels: Increasingly Positive; Increasing Chatter; Increasingly Negative]
    38. Hyperlink Heat Map: Total Market Year Over Year Changes • Emerging Trends [Heat map: Sentiment (Low, Med, Hi) by Volume (Low, Med, Hi)]
    39. Linking Current and Emerging Patterns
    40. Enhancement and Positioning Opportunities • Themes: Integrated Solutions – HW; S&T Support - Proactive; HW - Virtualization; Acct Mgt - Local • Trend Diagnostics • Local account management has latent potential on small scale • Positive development for core strength in S&T support • Look to key market drivers like virtualization to strengthen HW and reinforce positive shifts in integrated solutions [Heat map: cells 1 to 9 with markers A and H&V, Sentiment (Low, Med, Hi) by Volume (Low, Med, Hi)]
    41. Immediate Remedial Action Required • Themes: HW – Innovation (1); HW – Improvements (2); Integrated Solutions – SW (1); SW – Improvement (2); EDB – Simplify Processes • Trend Diagnostics • SW (improvement) is not showing signs of emerging from the red zone • EDB (simplify) and Integrated Solutions (SW) show potential entrenchment in the red zone • HW (innovation and improvement) see erosion in their moderate position, trending to the red zone [Heat map: cells 1 to 9 with markers E, H1, H2, S1 and S2, Sentiment (Low, Med, Hi) by Volume (Low, Med, Hi)]
    42. Heat Maps: 2008 Account Type • Few Clients AND Many Clients
    43. Hyperlink Heat Map: 2008 Account Type: Linking Themes to Structured Variables • One to Few [Heat map: Sentiment (Low, Med, Hi) by Volume (Low, Med, Hi)]
    44. Hyperlink Heat Map: 2008 Account Type: Linking Themes to Structured Variables • One to Many [Heat map: Sentiment (Low, Med, Hi) by Volume (Low, Med, Hi)]
    45. Heat Maps for Relationship • Strategic Partner – Solutions Provider – Transactional
    46. Hyperlink Heat Map for Relationship 2008: Linking Themes to Structured Variables • Strategic Partner [Heat map: Sentiment (Low, Med, Hi) by Volume (Low, Med, Hi)]
    47. Hyperlink Heat Map for Relationship 2008: Linking Themes to Structured Variables • Solution Provider [Heat map: Sentiment (Low, Med, Hi) by Volume (Low, Med, Hi)]
    48. Hyperlink Heat Map for Relationship 2008: Linking Themes to Structured Variables • Transactional [Heat map: Sentiment (Low, Med, Hi) by Volume (Low, Med, Hi)]
    49. Predictive Trees • Statistical validation of core product and service thematic relationships to sentiment for total market
    50. CHAID Tree for Sentiment • Validating thematic patterns and sentiment • Each node gives counts by sentiment band (1 - Low | 2 - Mid | 3 - High), the node total and its share of all 19170 records; categories merged into one node are listed together
        Root: Low 4377 (22.8%) | Mid 5142 (26.8%) | High 9651 (50.3%) | Total 19170
        Split of Root on CATG15 - 15 Software (P=0.000000; CHI=187.835417; DF=2)
          Software = 0: Low 4121 (22.1%) | Mid 5014 (26.9%) | High 9485 (50.9%) | Total 18620 (97.1%)
          Software = 1: Low 256 (46.5%) | Mid 128 (23.3%) | High 166 (30.2%) | Total 550 (2.9%)
        Split of Software = 1 on CATEGORY (P=0.000000; CHI=166.054857; DF=6)
          Software - Capabilities (117): Low 34 (29.1%) | Mid 43 (36.8%) | High 40 (34.2%) | Total 117 (0.6%)
          Software - Improvements (145): Low 119 (82.1%) | Mid 15 (10.3%) | High 11 (7.6%) | Total 145 (0.8%)
          Software - Quality & Reliability (117): Low 17 (14.5%) | Mid 25 (21.4%) | High 75 (64.1%) | Total 117 (0.6%)
          Software - Value (171): Low 86 (50.3%) | Mid 45 (26.3%) | High 40 (23.4%) | Total 171 (0.9%)
        Split of Software = 0 on CATG10 - 10 Hardware (P=0.000476; CHI=19.905137; DF=2)
          Hardware = 0: Low 3636 (22.0%) | Mid 4379 (26.5%) | High 8509 (51.5%) | Total 16524 (86.2%)
          Hardware = 1: Low 485 (23.1%) | Mid 635 (30.3%) | High 976 (46.6%) | Total 2096 (10.9%)
        Split of Hardware = 1 on CATEGORY (P=0.000000; CHI=608.919718; DF=6)
          Hardware - Computers/Laptops (216), OS (27), Virtualization (94): Low 74 (22.0%) | Mid 153 (45.4%) | High 110 (32.6%) | Total 337 (1.8%)
          Hardware - Improvements (249): Low 188 (75.5%) | Mid 37 (14.9%) | High 24 (9.6%) | Total 249 (1.3%)
          Hardware - Innovation (30), Value (333), Printers (70), Server (559): Low 180 (18.1%) | Mid 354 (35.7%) | High 458 (46.2%) | Total 992 (5.2%)
          Hardware - Quality & Reliability (518): Low 43 (8.3%) | Mid 91 (17.6%) | High 384 (74.1%) | Total 518 (2.7%)
        Split of Hardware = 0 on CATG16 - 16 Solutions (P=0.000000; CHI=238.887402; DF=2)
          Solutions = 0: Low 3029 (22.8%) | Mid 3801 (28.6%) | High 6466 (48.6%) | Total 13296 (69.4%)
          Solutions = 1: Low 607 (18.8%) | Mid 578 (17.9%) | High 2043 (63.3%) | Total 3228 (16.8%)
        Split of Solutions = 1 on CATEGORY (P=0.000000; CHI=266.880808; DF=4)
          Solutions - Adaptive (179), Price (619), Value (281): Low 336 (31.1%) | Mid 236 (21.9%) | High 507 (47.0%) | Total 1079 (5.6%)
          Solutions - Comprehensive (325), Innovation (134): Low 110 (24.0%) | Mid 76 (16.6%) | High 273 (59.5%) | Total 459 (2.4%)
          Solutions - Quality & Reliability (1690): Low 161 (9.5%) | Mid 266 (15.7%) | High 1263 (74.7%) | Total 1690 (8.8%)
        Split of Solutions = 0 on CATG6 - 6 Depth & Breadth of Technology Portfolio (P=0.000000; CHI=122.248266; DF=2)
          Portfolio = 0: Low 3014 (23.1%) | Mid 3770 (28.9%) | High 6251 (48.0%) | Total 13035 (68.0%)
          Portfolio = 1: Low 15 (5.7%) | Mid 31 (11.9%) | High 215 (82.4%) | Total 261 (1.4%)
        Split of Portfolio = 0 on CATG14 - 14 Service & Technical Support (P=0.000000; CHI=64.536308; DF=2)
          Support = 0: Low 2565 (23.9%) | Mid 3198 (29.8%) | High 4975 (46.3%) | Total 10738 (56.0%)
          Support = 1: Low 449 (19.5%) | Mid 572 (24.9%) | High 1276 (55.6%) | Total 2297 (12.0%)
        Split of Support = 1 on CATEGORY (P=0.000000; CHI=147.782652; DF=4)
          Service & Technical Support - Flexibility & Responsiveness (648), Proactive (66): Low 202 (28.3%) | Mid 214 (30.0%) | High 298 (41.7%) | Total 714 (3.7%)
          Service & Technical Support - Local (92), Price (392): Low 98 (20.2%) | Mid 157 (32.4%) | High 229 (47.3%) | Total 484 (2.5%)
          Service & Technical Support - Quality (1099): Low 149 (13.6%) | Mid 201 (18.3%) | High 749 (68.2%) | Total 1099 (5.7%)
        Split of Support = 0 on CATG4 - 4 Cost/Price/Value - General (P=0.000000; CHI=155.188571; DF=2)
          Cost/Price/Value = 0: Low 2377 (25.4%) | Mid 2611 (27.9%) | High 4354 (46.6%) | Total 9342 (48.7%)
          Cost/Price/Value = 1: Low 188 (13.5%) | Mid 587 (42.0%) | High 621 (44.5%) | Total 1396 (7.3%)
        Split of Cost/Price/Value = 0 on CATG3 - 3 Contracts (P=0.000000; CHI=122.920354; DF=2)
          Contracts = 0: Low 2175 (24.4%) | Mid 2515 (28.2%) | High 4233 (47.4%) | Total 8923 (46.5%)
          Contracts = 1: Low 202 (48.2%) | Mid 96 (22.9%) | High 121 (28.9%) | Total 419 (2.2%)
        Split of Contracts = 1 on CATBIG14 - 14 Contracts Value (P=0.005479; CHI=15.018692; DF=2)
          Contracts Value = 0: Low 62 (39.0%) | Mid 34 (21.4%) | High 63 (39.6%) | Total 159 (0.8%)
          Contracts Value = 1: Low 140 (53.8%) | Mid 62 (23.8%) | High 58 (22.3%) | Total 260 (1.4%)
        Split of Contracts = 0 on CATG9 - 9 Global Coverage (P=0.000286; CHI=20.921843; DF=2)
          Global Coverage = 0: Low 2097 (24.8%) | Mid 2350 (27.8%) | High 3999 (47.3%) | Total 8446 (44.1%)
          Global Coverage = 1: Low 78 (16.4%) | Mid 165 (34.6%) | High 234 (49.1%) | Total 477 (2.5%)
        Split of Global Coverage = 0 on CATG5 - 5 Customer Communications & Education (P=0.000000; CHI=44.619311; DF=2)
          Communications = 0: Low 1889 (24.2%) | Mid 2152 (27.5%) | High 3780 (48.3%) | Total 7821 (40.8%)
          Communications = 1: Low 208 (33.3%) | Mid 198 (31.7%) | High 219 (35.0%) | Total 625 (3.3%)
        Split of Communications = 1 on CATEGORY (P=0.000127; CHI=40.295372; DF=2)
          Customer Communications & Education - General (353): Low 83 (23.5%) | Mid 140 (39.7%) | High 130 (36.8%) | Total 353 (1.8%)
          Customer Communications & Education - Provide Roadmaps (135), Provide Training, Seminars, Education (137): Low 125 (46.0%) | Mid 58 (21.3%) | High 89 (32.7%) | Total 272 (1.4%)
        Split of Communications = 0 on CATG1 - 1 Account Mgmt (P=0.000000; CHI=196.377570; DF=2)
          Account Mgmt = 0: Low 1769 (25.4%) | Mid 2023 (29.0%) | High 3173 (45.6%) | Total 6965 (36.3%)
          Account Mgmt = 1: Low 120 (14.0%) | Mid 129 (15.1%) | High 607 (70.9%) | Total 856 (4.5%)
        Split of Account Mgmt = 0 on CATG2 - 2 Consulting Services (P=0.000000; CHI=77.011450; DF=2)
          Consulting Services = 0: Low 1681 (24.7%) | Mid 2003 (29.4%) | High 3120 (45.9%) | Total 6804 (35.5%)
          Consulting Services = 1: Low 88 (54.7%) | Mid 20 (12.4%) | High 53 (32.9%) | Total 161 (0.8%)
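
Each split in the CHAID tree on slide 50 is reported with a chi-square statistic and degrees of freedom. As a rough check on how such a figure arises, the sketch below computes a plain Pearson chi-square for the first split (Software theme absent or present, by sentiment band) from the counts printed on the slide. It is an illustration only: CHAID software also merges categories and adjusts p-values, which this sketch does not attempt, and the function name is hypothetical.

```python
# Observed counts for the first split (Software theme absent / present)
# by sentiment band, as printed on the CHAID slide.
OBSERVED = {
    "SW=0": {"Low": 4121, "Mid": 5014, "High": 9485},
    "SW=1": {"Low": 256, "Mid": 128, "High": 166},
}


def pearson_chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for a 2-D table."""
    rows = list(table)
    cols = list(next(iter(table.values())))
    row_totals = {r: sum(table[r].values()) for r in rows}
    col_totals = {c: sum(table[r][c] for r in rows) for c in cols}
    grand_total = sum(row_totals.values())
    chi2 = 0.0
    for r in rows:
        for c in cols:
            # Expected count if sentiment were independent of the split.
            expected = row_totals[r] * col_totals[c] / grand_total
            chi2 += (table[r][c] - expected) ** 2 / expected
    dof = (len(rows) - 1) * (len(cols) - 1)
    return chi2, dof


chi2, dof = pearson_chi_square(OBSERVED)
print(f"chi-square = {chi2:.3f}, df = {dof}")
```

The result should land close to the CHI=187.835417 and DF=2 shown for the Software split, which suggests the slide reports an unadjusted Pearson statistic for that node.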

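
The heat map criteria on slide 21 (15th and 85th percentile cut-offs on sentiment and volume, boxes numbered 1 to 9) can be sketched as follows. This is a minimal example under stated assumptions: the theme figures are invented, the percentile interpolation method is a guess because the slides do not specify one, and the helper names are hypothetical.

```python
from statistics import quantiles

# Hypothetical per-theme measures: (sentence volume, average sentiment).
THEMES = {
    "Account Mgmt - Responsiveness": (900, 0.70),
    "Hardware - Quality & Reliability": (520, 0.55),
    "Solutions - Value": (280, 0.30),
    "Hardware - Server": (560, 0.25),
    "Software - Value": (170, -0.20),
    "Software - Improvements": (150, -0.60),
    "Hardware - OS": (30, -0.65),
}

# Box numbers as laid out on the slide: rows run Hi to Low sentiment,
# columns run Low to Hi volume.
ROWS = {"Hi": 0, "Med": 1, "Low": 2}
COLS = {"Low": 0, "Med": 1, "Hi": 2}


def band(value, low_cut, high_cut):
    """Label a value Low, Med or Hi relative to the two cut-offs."""
    if value < low_cut:
        return "Low"
    if value > high_cut:
        return "Hi"
    return "Med"


def cut_offs(values):
    """15th and 85th percentile cut-offs (inclusive interpolation assumed)."""
    cuts = quantiles(values, n=100, method="inclusive")
    return cuts[14], cuts[84]


def assign_cells(themes):
    """Map each theme to its heat map box number (1 to 9)."""
    v_low, v_high = cut_offs([v for v, _ in themes.values()])
    s_low, s_high = cut_offs([s for _, s in themes.values()])
    cells = {}
    for name, (volume, sentiment) in themes.items():
        row = ROWS[band(sentiment, s_low, s_high)]
        col = COLS[band(volume, v_low, v_high)]
        cells[name] = 3 * row + col + 1
    return cells


cells = assign_cells(THEMES)
for name, cell in cells.items():
    print(f"box {cell}: {name}")
```

Feeding year over year changes in volume and sentiment into the same function, instead of current levels, would give the "Emerging" version of the map described in the notes.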