Content Analytics
Insights from Unstructured Data
Mayank Tyagi
April 09, 2015

A basic overview of content analytics, with an industry overview and the competitive dynamics of the industry

Published in: Data & Analytics
Content analytics

  1. Content Analytics: Insights from Unstructured Data. Mayank Tyagi, April 09, 2015
  2. Content analytics unlocks business value from unstructured content, delivering answers to important questions via semantic technologies.
  3. Business Need
     A large percentage (estimated at 80% or more) of the information in a company is maintained as unstructured content. This includes valuable assets such as emails, customer correspondence, free-form fields on applications, wikis, blobs of text in a database, content in enterprise content repositories, social media posts, and messages of all kinds. Because this content lacks structure, it is difficult to search and analyze without extensive effort and automation.
  4. Structured vs Unstructured Data
     Structured data: a high degree of organization, such as a relational database. Example (column: value):
     • Patient: Joe Brown
     • Date of Birth: 02/13/1972
     • Date Admitted: 02/05/2014
     Unstructured data: information that is difficult to organize using traditional mechanisms. Example: “The patient came in complaining of chest pain, shortness of breath, and lingering headaches…smokes 2 packs a day… family history of heart disease…has been experiencing similar symptoms for the past 12 hours….”
  5. Big Content
     • Beyond conventional Big Data, there exists a tsunami of information in the big data universe that has largely remained untapped.
     • Big Data has morphed into a world of unstructured machine-generated data and human-generated content that is referred to as ‘Big Content’: for example, chat logs, emails, documents, sales and service notes, CRM case notes, support tickets, weblogs, social media feeds, and more.
     Content Analytics
     • Content analytics is the act of applying business intelligence and business analytics practices to this Big Content.
     • Companies use content analytics software to provide visibility into the amount of content that is being created, the nature of that content and how it is used. This contextual, value-adding information has remained under-used due to lack of recognition and inadequate technologies.
  6. Content Analytics Solutions
     The Content Analytics approach leverages multiple algorithms to draw patterns and identify insights from unstructured data.
     • Analyze unstructured content: A content analytics solution processes textual data in ways that help to search, discover, and perform the same analytics on textual data that is currently performed on structured data in a business intelligence style of application. With Content Analytics solutions, unstructured data can be used in ways that were previously attainable only from structured data sets.
     • Better business understanding and visibility: Content Analytics delivers new business understanding and visibility from the content and context of textual information. For example, it can identify patterns, view trends and deviations over time, and reveal unusual correlations or anomalies. It can explain why events are occurring and find new opportunities by aggregating the voices of customers, suppliers, and the market.
     • Tool for reporting statistics and deriving actionable insights: With Content Analytics solutions, you can define many facets (or aspects) of your data, with each facet potentially leading to valuable insights for various users. Content Analytics brings the power of business intelligence to all enterprise information, not just structured information (which is less than 20% of the entire enterprise repository).
  7. Evolution of Content Analytics
     Traditional Approach – Text Analytics
     • Text Analytics, or Natural Language Processing, is a set of linguistic, statistical, and machine learning techniques that allow text to be analysed and key information to be extracted for business integration. However, it answered only the who, what, where and when of a subject. The why was left to subjective assessment.
     Contemporary Solution – Content Analytics
     • Content Analytics (Text Analytics + Mining) refers to the text analytics process plus the ability to visually identify and explore trends, patterns, and statistically relevant facts found in various types of content spread across internal and external content sources.
     • Content analytics distinctively adds the why and the how, and provides a comprehensive understanding of the world around the subject.
  8. Key Benefits of Content Analytics
     Content Analytics complements business intelligence to provide a more detailed and accurate understanding of market and customer needs.
     • Identify meaning, trends, patterns, preferences, and tastes from text for better business decision making.
     • Understand customers on a granular level, primarily through semantic and sentiment analysis.
     • Extract more value from your social media community by building a richer profile of each person in the customer database.
     • Quickly identify trends amongst the customer base by filtering and giving structure to the data.
     • Reuse and curate content by analysing content from partner organisations and external sources that is pertinent to the target market.
     • Customer-centric marketing: because content analytics can determine the interests of individual customers and prospects, the content most relevant to each person can be customized and personalised propositions can be delivered.
  9. Key Challenges of Content Analytics
     Beyond Volume, Variety and Velocity is the issue of Big Data Veracity.
     Velocity
     • 90% of the world’s data was created in the last two years
     • 5 million trade events per second
     Volume
     • 1 trillion connected devices generate 2.5 quintillion bytes of data per day
     • 12 terabytes of Tweets created daily
     Veracity
     • With big data there is a tendency for errors to snowball, e.g. user entry errors, redundancy and corruption all add uncertainty and ambiguity to the quality of data
     Variety
     • Structured, unstructured, multimedia, text; varied content creation
     • 80% of the world’s data today is unstructured
  10. Usage of Content Analytics Solutions*
     Content Analytics is used in many verticals and for various applications solving varied business needs.
     Industry solutions:
     • Financial Services: market intelligence, case management, compliance, risk scoring
     • Healthcare and Life Sciences: scientific discovery, bio-surveillance, clinical trials
     • Media and Advertising: digital asset management, content mining, contextual advertising
     • Education and Govt.: security, intelligence, digital library services, e-learning
     Examples of business problems that can be addressed:
     • “What features of our Banking Services are most liked/hated by our customers?”
     • “What caused this recent drop in sales for Product X?”
     • “Give me a media profile of Mr. X including Trends, Quotes, Roles, Contacts etc.”
     • “Which regulatory causes and sentences from the past have hindered the objective of universal education?”
     Note: *This is just a representative list to showcase the capabilities of content analytics and is not exhaustive.
  11. Content Analytics Solutions - Industry Overview
  12. Industry Overview
     › Content Analytics solutions are usually evolutionary products of Enterprise Content Management (ECM) solution providers. These solutions enable the management of business information throughout the content lifecycle, from creation to disposition. As a technical architecture, ECM consists of a platform or a set of applications that interoperate but that can be sold and used separately.
     › The Content Analytics and ECM market is projected to grow from $5.1 billion in 2013 to over $9.3 billion in 2017, a CAGR of about 16% over the period.
     › Leading providers of content analytics solutions are IBM, Open Text, EMC, Perceptive Software, Hyland, Microsoft and Oracle. Several newer entrants such as Xerox, Alfresco and Newgen Software have also developed solutions which are rated highly by industry experts and labeled as visionaries by IT research firms such as Gartner.
  13. Key Players
     The Content Analytics market includes key players that provide purpose-built and job-aligned offerings, including case management, composite content applications and customer communications management. Key assessments of leading players in the Content Analytics market are detailed below.
     IBM
     • Strengths: wide variety of content management and related capabilities, from content ingestion to archiving; deep analytics and business intelligence tools
     • Weaknesses: IBM's greatest strength also poses its greatest challenge: the breadth of its products may make it hard for customers to understand where to start or how to extend their current offerings
     Open Text
     • Strengths: Open Text's relationship with SAP provides a firm foundation for expansion and has enabled it to command a strong position in markets where SAP is strong
     • Weaknesses: complicated architecture; high pricing; poor after-sales support
     EMC
     • Strengths: extensive content management stack that includes most ECM elements; customized industry solutions, specifically for the healthcare, life sciences, energy and engineering sectors
     • Weaknesses: only a limited and tactical solution in applicability
     Perceptive Software
     • Strengths: strong product and solution capabilities; deep focus on vertical markets, with specialized solutions for the healthcare and higher education sectors
     • Weaknesses: increasing fragmentation of its product architecture and a lack of clarity about its road map; lack of interoperability
     Hyland
     • Strengths: long and extensive experience in developing content-enabled applications; solution capability for mobile and cloud deployment
     • Weaknesses: limited global footprint, with 85% of sales coming from NA; limited capabilities to manage sophisticated digital asset management requirements
  14. Major Trends in Content Analytics
     Trends
     • Increased focus on social media text analytics, as social media is creating a huge amount of unstructured data.
     • Large-scale changes in system architecture as new data-centric models and solutions emerge. Large data will live in persistent memory, and many CPUs/clients will use a shallow hierarchy.
     • Significant benefits from Content Analytics are likely to continue for at least 5-10 more years before it reaches the “Plateau of Productivity”.
     Implications
     • The outlook for growth in the Content Analytics space remains bright as businesses continue to look to these solutions to enhance their operational efficiency and better understand their current and prospective customers.
  15. Annexure
  16. How does Content Analytics work? An example
  17. Analyzing Unstructured Content – Text Analytics
     Answering complex natural language questions requires more than keyword evidence. This evidence suggests “Gary” is the answer, BUT the system must learn that keyword matching may be weak relative to other types of evidence.
  18. Analyzing Unstructured Content – Content Analytics
     The CA approach leverages multiple algorithms to draw patterns and identify insights. Stronger evidence can be much harder to find and score, and the evidence is still not 100% certain.
     • Search far and wide
     • Explore many hypotheses
     • Find and judge evidence
     • Many inference algorithms
  19. Thank You
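
Slide 4 contrasts a structured patient record with a free-text clinical note. As a minimal sketch of what "giving structure" to such a note could look like, the Python below pulls a few fields out of the note with regular expressions. The field names (`packs_per_day`, `symptom_duration_hours`, `family_history_heart_disease`) and the patterns are assumptions made for this illustration; they are not from the deck, and a real content analytics product would rely on far richer linguistic processing.

```python
import re

# Free-text note quoted on slide 4 (ellipses kept as plain dots).
NOTE = (
    "The patient came in complaining of chest pain, shortness of breath, "
    "and lingering headaches... smokes 2 packs a day... family history of "
    "heart disease... has been experiencing similar symptoms for the past "
    "12 hours..."
)


def extract_fields(text: str) -> dict:
    """Turn a few phrases of a clinical note into structured fields.

    The field names and patterns are hypothetical, chosen for illustration.
    """
    fields = {}

    smoking = re.search(r"smokes (\d+) packs? a day", text)
    if smoking:
        fields["packs_per_day"] = int(smoking.group(1))

    duration = re.search(r"past (\d+) hours", text)
    if duration:
        fields["symptom_duration_hours"] = int(duration.group(1))

    # Simple phrase check; no negation handling ("no family history ...").
    fields["family_history_heart_disease"] = (
        "family history of heart disease" in text
    )
    return fields


record = extract_fields(NOTE)
print(record)
```

Once the note is reduced to fields like these, it can be queried alongside the structured columns (patient, date of birth, date admitted) shown on the same slide.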
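
Slide 12 quotes market growth from $5.1 billion in 2013 to over $9.3 billion in 2017 at a CAGR of 16%. The short Python check below recomputes the compound annual growth rate from those two endpoints, assuming four compounding years (2013 to 2017); the function name `cagr` is just a label chosen for this sketch.

```python
def cagr(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate: (end / start) ** (1 / years) - 1."""
    return (end_value / start_value) ** (1 / years) - 1


# Figures from slide 12, in billions of US dollars.
rate = cagr(start_value=5.1, end_value=9.3, years=2017 - 2013)
print(f"Implied CAGR: {rate:.1%}")
```

The result comes out at roughly 16%, so the quoted growth rate is consistent with the quoted endpoints.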
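
Slides 17 and 18 make the point that keyword evidence alone can favour a wrong answer ("Gary") and that several kinds of evidence have to be found, scored and combined across many hypotheses. The Python below is a minimal sketch of that idea under stated assumptions: the evidence types `semantic` and `temporal`, all scores, the second candidate, and the hand-set weights are invented for illustration. The deck does not describe a specific scoring formula, and the slide says the system must learn how much each evidence type counts, whereas here the weights are fixed by hand.

```python
from dataclasses import dataclass
from typing import Dict, List

# Hypothetical weights: keyword overlap counts for less than other evidence.
# A real system would learn these rather than hard-code them.
WEIGHTS: Dict[str, float] = {"keyword": 0.2, "semantic": 0.5, "temporal": 0.3}


@dataclass
class Hypothesis:
    answer: str
    evidence: Dict[str, float]  # evidence type -> score in [0, 1]


def confidence(hypothesis: Hypothesis) -> float:
    """Weighted sum of evidence scores; missing evidence counts as 0."""
    return sum(
        weight * hypothesis.evidence.get(kind, 0.0)
        for kind, weight in WEIGHTS.items()
    )


def rank(hypotheses: List[Hypothesis]) -> List[Hypothesis]:
    """Order candidate answers from most to least supported."""
    return sorted(hypotheses, key=confidence, reverse=True)


# Illustrative scores: "Gary" wins on keywords but loses overall.
hypotheses = [
    Hypothesis("Gary", {"keyword": 0.9, "semantic": 0.2, "temporal": 0.1}),
    Hypothesis("other candidate", {"keyword": 0.3, "semantic": 0.8, "temporal": 0.7}),
]
ranked = rank(hypotheses)
for h in ranked:
    print(f"{h.answer}: {confidence(h):.2f}")
```

The combined score stays well below 1.0 for every candidate, which mirrors the slide's remark that the evidence is still not 100% certain.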
