Know Your Market - Know Your Customer: What Web Data Reveals if You Know Where and How to


In this presentation, Connotate will share expertise gained from years of experience extracting data from the Web and making it usable. Connotate’s experts will explain why certain Web data sources are easy to tap into, why others aren’t, and what to consider when scoping out a project.

Published in: Technology, Business

  1. 1. Know Your Market - Know Your Customer: What Web data reveals if you know where & how to look
      Presenters: Christian Giaretta, VP of Sales Engineering, Connotate; Dennis Clark, Chief Strategy Officer, Luminoso
      Moderator: Gina Cerami, VP of Marketing, Connotate
      Date: November 1, 2012
  2. 2. Presenters: Chris Giaretta, Vice President of Sales Engineering; Dennis Clark, Chief Strategy Officer
  3. 3. Today’s Discussion
      • What Web Data Reveals: The Fundamentals
      • The business case
      • Where to start? Best practices and the automation process
      • Know Your Market
      • Use cases: market transparency, digital strategy, PDF extraction
      • Differences in data sources
      • Know Your Customer: Part 1
      • Use case: online advertising - aggregating customer response to ads
      • Manual versus automated approaches
      • Know Your Customer: Part 2
      • Text analysis: overview of options
      • Concept-based text analysis
      • Use case: consumer packaged goods
      • Other considerations
      • Q&A
  4. 4. What Web Data Reveals: The Fundamentals
  5. 5. The Business Case: news, data points, public notices; trillions of URLs; online conversations
  6. 6. IDC Research, October 2012
      • CEOs are looking at Big Data on the Web to understand their markets and customers
      • The number of sites with valuable content continues to expand at a tremendous rate
      • Factors to consider when collecting Web data: timeliness, legitimacy, aggregation
  7. 7. Can I Trust Web Data for Market Research???
      Good question! You may have to… Factors to consider:
      • It’s harder and harder to get people to answer surveys
      • Focus groups take time, which you may not have
      • Proprietary data sources may not answer all of your important questions
      • Organizations and government agencies are moving more and more data, content and forms onto the Web
  8. 8. Can I Trust Web Data for Market Research???
      • Timely? YES!! Refresh primary research; expose new trends or questions rapidly
      • Aggregate? YES!! Volumes of data reveal insights; the longer you retain it, the more valuable it gets
      • Legitimate? Uhh… Be vigilant about spam and bias in Web data; some sites are better than others
  9. 9. Polling Question: Web Data Collection
      Are you currently collecting data from the Web?
      • Yes, we are doing this using an automated process
      • Yes, however, we are collecting Web data using a manual process
      • No, we are not collecting Web data
  10. 10. Where to Start? Follow Proven Best Practices
      Work with experts with deep experience evaluating Web sources for data extraction to help you…
      • Clarify “What do you really want to do with this data?”
      • Decide which sites to target
      • Identify how easy or difficult it will be to extract data from target sites
      • Outline the scope of the project
      • Estimate long-term maintenance costs (and how to minimize them)
  11. 11. Best Practices (cont’d)
      • Narrow your search
      • Scope the project
      • Think about the long term
  12. 12. An Overview of the Automation Process
      • Collect Data: Internal Sources (Database, Market Basket, Inventory, etc.); External Sources (Social Media, Surface Web, Hidden Web, Secured Sites)
      • Transform: Structure, Classify, Prep for Analysis
      • Deliver: Reports, Dashboards, Workflow, BI Plug-ins
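The collect, transform, deliver flow on this slide can be sketched in a few lines of Python. This is only a minimal illustration: the names `Record`, `collect`, `transform` and `deliver` and the two sample records are hypothetical and do not represent Connotate's product or API.

```python
import csv
import io
from dataclasses import dataclass


@dataclass
class Record:
    source: str  # illustrative label, e.g. "surface_web" or "database"
    text: str    # raw text as collected


def collect() -> list[Record]:
    """Stand-in for collection; a real system would read sites or databases."""
    return [
        Record("surface_web", "  Hiring: 12 new engineering roles  "),
        Record("database", "Inventory: 340 units"),
    ]


def transform(records: list[Record]) -> list[dict]:
    """Structure and classify: trim whitespace, split 'label: value' text."""
    rows = []
    for rec in records:
        label, _, value = rec.text.strip().partition(":")
        rows.append({
            "source": rec.source,
            "category": label.strip().lower(),
            "value": value.strip(),
        })
    return rows


def deliver(rows: list[dict]) -> str:
    """Deliver as CSV text that a spreadsheet or BI tool could load."""
    buf = io.StringIO()
    writer = csv.DictWriter(
        buf, fieldnames=["source", "category", "value"], lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


report = deliver(transform(collect()))
```

Keeping the three stages as separate functions mirrors the slide: a new source only touches `collect`, and a new output format only touches `deliver`.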
  13. 13. Know Your Market: Use Cases
  14. 14. Know Your Market: Use Cases
      • Market Transparency: Job postings, etc. on company Web sites may offer indicators of performance before quarterly results are reported
      • Digital Strategy: Paid ads, search term rankings on Google trended over time reveal insights about competitors’ digital strategies
      • Government Regulatory Site Updates (PDFs): Insurance coverage, building permits, etc. posted as PDFs can reveal insight into market trends and product sales
      Automated, precise data collection is key to success
  15. 15. Know Your Market: What Job Postings Reveal
  16. 16. Know Your Market: Competitor’s Digital Strategies
  17. 17. Building Permits Reveal Construction Activity (PDF to Excel)
      • AP_Title: Mr & Mrs
      • AP_Forename: Samuel John
      • AP_Surname: MacNaughton
      • AP_CompanyName: (blank)
      • AP_Building: Orana
      • AP_AddressLine1: Easter Kinkell
      • AP_AddressLine2: Dingwall
      • AP_Town: Ross-Shire
      • AP_Postcode: IV7 8HY
      • AG_RefNo: (blank)
      • AG_Forename: Sarah
      • AG_Surname: Bryden
      • AG_CompanyName: (blank)
      • AG_Building: 12
      • AG_AddressLine1: Southside Road
      • AG_AddressLine2: (blank)
      • AG_Town: Inverness
      • AG_Postcode: IV2 3AU
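The PDF-to-Excel step on this slide amounts to turning labelled fields into one spreadsheet row. The sketch below shows the idea with the standard `csv` module. The field names and values come from the slide; the one-field-per-line input layout and the helper names `parse_fields` and `to_csv` are assumptions made for illustration.

```python
import csv
import io

# Field names and values are taken from the slide; the one-field-per-line
# layout of the extracted text is an assumption made for this sketch.
EXTRACTED_TEXT = """\
AP_Building Orana
AP_AddressLine1 Easter Kinkell
AP_Town Ross-Shire
AP_Postcode IV7 8HY
AG_CompanyName
AG_Town Inverness
AG_Postcode IV2 3AU
"""


def parse_fields(text: str) -> dict[str, str]:
    """Split each line into a field name and its (possibly empty) value."""
    fields = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        name, _, value = line.strip().partition(" ")
        fields[name] = value.strip()
    return fields


def to_csv(fields: dict[str, str]) -> str:
    """Write one header row and one data row, as a spreadsheet would show."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(fields.keys())
    writer.writerow(fields.values())
    return buf.getvalue()


permit = parse_fields(EXTRACTED_TEXT)
sheet = to_csv(permit)
```

Empty fields such as `AG_CompanyName` are kept as empty cells so that every permit produces the same columns.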
  18. 18. Insurance Coverage Predicts Drug Sales (PDF Document to Excel File)
      Drug Name: Tier
      • A/b otic: 2
      • Abilify: 4
      • Accolate: 4
      • Accupril: 4
      • Accuretic: 4
      • Accutane: 4
      • Acebutolol HCL: 2
      • Aceon: 4 (1/2)
      • Acetaminophen w/ codeine: 2
      • Acetasol HC: 2
      • Acetazolamide: 2
      • Aciphex: X
      • Aclovate ointment: 4
      • Acticin: 2
      • Activella: 4
      • Actonel: 4
      • Actoplus met: 3
      • Actos: 3
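A formulary list like this one can be split into name and tier columns with a single regular expression. The sample lines below are copied from the slide; the pattern, the group names and the `parse_row` helper are illustrative assumptions, not the method used in the presentation.

```python
import re

# Sample lines copied from the slide; the pattern and group names
# are illustrative assumptions.
LINES = [
    "Abilify 4",
    "Acebutolol HCL 2",
    "Aceon 4 (1/2)",
    "Acetaminophen w/ codeine 2",
    "Aciphex X",
]

# Drug name, then a tier (digits or the letter X), then an optional
# note in parentheses.
ROW = re.compile(r"^(?P<name>.+?)\s+(?P<tier>\d+|X)(?:\s+\((?P<note>[^)]*)\))?$")


def parse_row(line: str) -> dict:
    """Return the name, tier and optional note for one formulary line."""
    match = ROW.match(line.strip())
    if match is None:
        raise ValueError(f"unrecognized formulary line: {line!r}")
    return match.groupdict()


rows = [parse_row(line) for line in LINES]
tier_2 = [row["name"] for row in rows if row["tier"] == "2"]
```

Raising on unrecognized lines, rather than skipping them, makes layout changes in the source PDF visible instead of silently dropping drugs from the output.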
  19. 19. Benefits of Using Automation to Understand Markets and Market-Moving Events
      • Reduce costs associated with manual processes
      • Speed up processes by doing this continually instead of sporadically
      • Improve accuracy
      • Repurpose data for new uses by converting PDFs and other unstructured data into Excel, XML or other usable formats
  20. 20. Differences in Web Sources
  21. 21. Automation Opens Access to Deep Web and Secured Sites
  22. 22. Know Your Customer: Buyer Behavior
  23. 23. Altitude Digital: Buyer Behavior in Real Time
      • Push the boundaries of “Big Data” in interactive advertising
      • Use Connotate to collect real-time Web data
      • Increase clients’ ad revenues by 30% - 300%
      • Continually display aggregated dynamic ad exchange data
      • Publishers view real-time, side-by-side comparisons of online ad traffic
      • They can instantaneously optimize ad placement
      Many of these sites are password-protected… not a problem!
  24. 24. 24
  25. 25. Manual versus Automated Approaches
      Your data needs: To automate or not?
      • Complex product-matching tasks: may want to consider crowdsourcing
      • Small amount of data, needed a few times per year: a manual approach may suffice
      • Specific external data (under $5K/year): purchase from 3rd party
      • High volume data monitoring: automate
      • Variety of sources: automate
      • Frequent updates and/or monitoring: automate
      • Need for data post-processing: automate
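The table on this slide reads as a small set of decision rules, which could be written down as follows. The `DataNeed` fields and the `recommend` function are hypothetical names, and the order in which the rules are checked is an assumption, since the slide does not say which need wins when several apply.

```python
from dataclasses import dataclass


@dataclass
class DataNeed:
    complex_product_matching: bool = False
    small_and_infrequent: bool = False       # small amount, a few times per year
    specific_external_under_5k: bool = False
    high_volume: bool = False
    many_sources: bool = False
    frequent_updates: bool = False
    needs_post_processing: bool = False


def recommend(need: DataNeed) -> str:
    """Return the slide's suggestion; the order of the checks is an assumption."""
    if (need.high_volume or need.many_sources
            or need.frequent_updates or need.needs_post_processing):
        return "automate"
    if need.complex_product_matching:
        return "consider crowdsourcing"
    if need.small_and_infrequent:
        return "manual approach may suffice"
    if need.specific_external_under_5k:
        return "purchase from a third party"
    return "no recommendation"
```

For example, `recommend(DataNeed(frequent_updates=True))` returns `"automate"`, matching the slide's row for frequent updates and/or monitoring.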
  26. 26. A Closer Look at Different Approaches
      Approach: Considerations
      • Manual offshore: No economies of scale; human error compromises quality.
      • Crowdsourcing: A viable approach for complex tasks like product matching of apparel for one-shot projects; may be less reliable for ongoing monitoring and long-term projects.
      • In-house or low-cost Web scrapers: Not resilient; scrapers break when Web page HTML changes, creating a maintenance headache; scrapers may not monitor well or support scheduling.
      • Robust automation installed on-premise: High degree of control; better resiliency to change but should consider project complexity and future need to add new Web sources on short notice.
      • Robust solution hosted by vendor: Highest resiliency; no maintenance burden; 24/7 follow-the-sun support; infinitely scalable and no capital expenditures for hardware or IT resources.
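The point that scrapers break when Web page HTML changes can be made concrete with a small example. The sketch below uses Python's standard `html.parser`; the two page snippets and the function names are hypothetical, and the label-based lookup is just one simple way to be less brittle, not a description of any vendor's technique.

```python
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Collect the non-empty text nodes of a page in document order."""

    def __init__(self) -> None:
        super().__init__()
        self.texts: list[str] = []

    def handle_data(self, data: str) -> None:
        if data.strip():
            self.texts.append(data.strip())


def page_texts(html: str) -> list[str]:
    parser = TextCollector()
    parser.feed(html)
    parser.close()
    return parser.texts


def price_by_position(html: str) -> str:
    """Brittle: assumes the price is always the second text node."""
    return page_texts(html)[1]


def price_by_label(html: str) -> str:
    """Less brittle: takes the text that follows the 'Price' label."""
    texts = page_texts(html)
    return texts[texts.index("Price") + 1]


# Hypothetical page before and after a redesign that adds a banner.
OLD_PAGE = "<div><span>Price</span><span>19.99</span></div>"
NEW_PAGE = "<div><p>Sale!</p><span>Price</span><span>19.99</span></div>"
```

On `OLD_PAGE` both functions return the price. On `NEW_PAGE` the position-based version returns the label text instead, while the label-based version still finds the price; this is the kind of silent failure that creates the maintenance burden the slide describes.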
  27. 27. Polling Question: Data Analysis
      What type of data analysis tools do you use?
      • Only basic tools (Excel spreadsheets, etc.)
      • Text analysis and basic tools
      • Applications built in-house and basic tools
      • None
  28. 28. Know Your Customer: Sentiment Analysis
  29. 29. Text Analysis Options: Main ‘Schools’ of Text Analytics
      • Machine Learners: Understanding through Data. Learn meaning through correlations
      • Ontologists: Understanding through Instruction. People tell computers what words mean
      • Luminoso Approach: Concept-based text analysis. Know the “Common Sense” about the world; add new connections from datasets
  30. 30. Language is Creative
      • It was really stuffy.
      • It smelled terrible.
      • It was like it had been shut away for a long time.
      • Smells like an old house.
      • Smelled really musty.
      • Was like a wet dog.
      • Reminds me of a dusty closet.
      • Really stale.
  31. 31. Concept-based analytics has…
      • Shown how reaction to product scent changes with price point
      • Determined the customer segments for a sports Web site
      • Discovered if customers notice unannounced in-store policy changes
      • Matched those who should connect at a large enterprise software company’s user conference
  32. 32. Digital Intuition: We boil down the meaning of text into actionable, mathematically justifiable insights.
  33. 33. Speed and Scale: Big Data, Small Data, Streaming, Dynamic
  34. 34. Case Study: Swiffer SweeperVac
      Consumer product design example: Swiffer SweeperVac
      • Idea: Use social data on Twitter to understand customer reactions to product design
      • Result: Failure. Twitter lacks depth.
      • Better Idea: Product Reviews
  35. 35. Obtaining Customer Sentiment from YouTube
      • Manually search YouTube for <“product name”> <“review”>
      • Use the Connotate automation package to follow links to individual video reviews and more results
      • Use Connotate to extract comment text
      • Feed input into analytical engine to reveal sentiment
      • Graphical User Interface/Presentation of Insights
  36. 36. Swiffer Dataset
  37. 37. Swiffer Features:
  38. 38. The Value of the Data is in the Delivery
  39. 39. Another Look at the Automation Process
      • Collect Data: Internal Sources (Database, Market Basket, Inventory, etc.); External Sources (Social Media, Surface Web, Hidden Web, Secured Sites)
      • Transform: Classify, Structure, Prep for Analysis
      • Deliver: Reports, Dashboards, Workflow, BI Plug-ins
      • Labels on the stages: Connotate, Partners
      • Connotate provides precise quality data, structured for delivery to your analysis and presentation tools.
      • Connotate maximizes the value of your investment in business intelligence, text analytics and semantic analysis tools.
  40. 40. Web Data Can Reveal Insights of Tremendous Value
      • Valid insights require precise, quality data
      • Automation is the key to extracting precise, quality data
      • Automation reduces the cost of monitoring Web sites for updates
      • Automation makes it easier to collect data for trending
  41. 41. Web Data Can Reveal Insights of Tremendous Value
      • Spot market trends faster
      • Detect changes to regulatory sites, download PDFs and extract data
      • Detect shifts in competitor’s digital strategy
      • Monitor buyer behavior online and in aggregate
      • Obtain new insights into customer preferences
  42. 42. Q & A
      Connotate will email a link to this presentation as well as a copy of the slides to you within 2 business days.
      If you have an immediate need and would like us to contact you about a forthcoming project, please check the appropriate box in the last polling question or call (+1) 732-296-8844.
      For more information, you may also visit www.connotate.com or
  43. 43. Thank You
      If you have an immediate need and would like us to contact you about a forthcoming project, please check the appropriate box in the last polling question or call (+1) 732-296-8844.
      For more information, visit www.connotate.com or