The document discusses content analytics and its ability to extract insights from unstructured data. It defines content analytics as applying business intelligence practices to "Big Content" such as emails, documents, and social media posts. Content analytics solutions can analyze unstructured content and deliver better business understanding and visibility through patterns, trends, and correlations. They bring the power of business intelligence to all of an organization's information, not just structured data. The document also discusses key benefits, challenges, industries where content analytics is used, and leading providers in the market.
3. Business Need
A large percentage (estimated at 80% or more) of the information in a company
is maintained as unstructured content, which includes valuable assets such as
emails, customer correspondence, free-form fields on applications, wikis, blobs
of text in a database, content in enterprise content repositories, social media
posts, and messages of all kinds. Because this content lacks structure, it is
difficult to search and analyze without extensive effort and automation.
4. Structured vs Unstructured Data
Structured Data
High degree of organization, such as a relational database
Column          Value
Patient         Joe Brown
Date of Birth   02/13/1972
Date Admitted   02/05/2014
Unstructured Data
Information that is difficult to organize using traditional mechanisms
“The patient came in complaining of chest pain, shortness of breath, and
lingering headaches…smokes 2 packs a day… family history of heart
disease…has been experiencing similar symptoms for the past 12 hours….”
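The contrast above can be made concrete with a minimal sketch that turns the free-text note into a few structured fields. It assumes simple keyword and regular-expression matching; the `SYMPTOM_TERMS` list, the `extract_fields` function, and the field names are hypothetical and chosen only for illustration. A production system would rely on a clinical vocabulary and proper language processing rather than hand-written patterns.

```python
import re

# The note from the slide, with the ellipses written in ASCII.
NOTE = (
    "The patient came in complaining of chest pain, shortness of breath, "
    "and lingering headaches... smokes 2 packs a day... family history of "
    "heart disease... has been experiencing similar symptoms for the past "
    "12 hours..."
)

# Hypothetical symptom vocabulary (an assumption, not from the source).
SYMPTOM_TERMS = ["chest pain", "shortness of breath", "headaches"]


def extract_fields(note: str) -> dict:
    """Pull a few structured fields out of a free-text clinical note."""
    text = note.lower()
    symptoms = [term for term in SYMPTOM_TERMS if term in text]
    packs = re.search(r"smokes (\d+) packs? a day", text)
    hours = re.search(r"past\s+(\d+) hours?", text)
    return {
        "symptoms": symptoms,
        "packs_per_day": int(packs.group(1)) if packs else None,
        "symptom_duration_hours": int(hours.group(1)) if hours else None,
        "family_history_heart_disease": "family history of heart disease" in text,
    }


record = extract_fields(NOTE)
print(record)
```

Once the note is reduced to fields like these, it can sit next to the Patient, Date of Birth, and Date Admitted columns and be queried in the same way.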
5. Big Content
• Beyond conventional Big Data, there exists a
tsunami of information in the big data
universe that has largely remained untapped
• Big Data has morphed into a world of
unstructured machine-generated data and
human-generated content that is referred to
as ‘Big Content’: for example, chat logs,
emails, documents, sales and service notes,
CRM case notes, support tickets, weblogs,
social media feeds, and more.
Content Analytics
Content analytics is the act of applying
business intelligence and business analytics
practices to this Big Content
Companies use content analytics software to
provide visibility into the amount of content
that is being created, the nature of that
content and how it is used. This contextual
value-adding information has remained
under-used due to lack of recognition and
inadequate technologies
6. Content Analytics approach leverages multiple algorithms to draw patterns and
identify insights from unstructured data
Content Analytics Solutions
(1) Analyze unstructured content
A Content Analytics solution processes textual data in ways that help to search,
discover, and perform the same analytics on textual data that are currently
performed on structured data in a business intelligence style of application.
With Content Analytics solutions, unstructured data can be used in ways that
were previously attainable only from structured data sets.
(2) Better business understanding & visibility
Content Analytics delivers new business understanding and visibility from the
content and context of textual information. For example, it can identify
patterns, view trends and deviations over time, and reveal unusual correlations
or anomalies. It can explain why events are occurring and find new
opportunities by aggregating the voices of customers, suppliers, and the market.
(3) Tool for reporting statistics and deriving actionable insights
With Content Analytics solutions, we can define many facets (or aspects) of
your data, with each facet potentially leading to valuable insights for various
users. Content Analytics brings the power of business intelligence to all of the
enterprise's information, not just structured information (which is less than
20% of the entire enterprise repository).
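The idea of facets can be sketched in a few lines, assuming each document has already been tagged with facet values by an upstream analysis step. The sample records, the facet names, and the `facet_counts` and `drill_down` helpers are invented for illustration and do not describe any particular product.

```python
from collections import Counter, defaultdict

# Invented sample documents. In practice the facet values would come from
# a text-analytics step, not from hand labelling.
DOCS = [
    {"product": "card", "sentiment": "negative", "channel": "email"},
    {"product": "card", "sentiment": "positive", "channel": "chat"},
    {"product": "loan", "sentiment": "negative", "channel": "email"},
    {"product": "card", "sentiment": "negative", "channel": "chat"},
]


def facet_counts(docs):
    """Count how often each value occurs within each facet."""
    counts = defaultdict(Counter)
    for doc in docs:
        for facet, value in doc.items():
            counts[facet][value] += 1
    return counts


def drill_down(docs, **filters):
    """Keep only the documents that match every facet=value filter."""
    return [d for d in docs if all(d.get(k) == v for k, v in filters.items())]


counts = facet_counts(DOCS)
negative_card = drill_down(DOCS, product="card", sentiment="negative")
print(dict(counts["product"]), len(negative_card))
```

Counting gives the reporting view, and filtering on a combination of facets gives the drill-down that different users would apply to the same content.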
7. Evolution of Content Analytics
Traditional Approach – Text Analytics
Text Analytics, or Natural Language Processing, is a set of linguistic,
statistical, and machine learning techniques that allow text to be analysed and
key information to be extracted for business integration. However, it answers
only the who, what, where, and when of a subject. The why is left to
subjective assessment.
Contemporary Solution – Content Analytics
• Content Analytics (Text Analytics + Mining) refers to
the text analytics process plus the ability to visually
identify and explore trends, patterns, and
statistically relevant facts found in various types of
content spread across internal and external
content sources.
• Content analytics distinctively adds the why and
the how and provides a comprehensive
understanding of the world around the subject
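Exploring trends and deviations over time can be illustrated with a small sketch. It assumes the text analytics step has already produced a monthly count of documents mentioning some topic; the counts, the `flag_deviations` function, and the threshold of two standard deviations are all assumptions made for this example.

```python
from statistics import mean, pstdev

# Invented monthly counts of documents that mention a topic.
MONTHLY_MENTIONS = {
    "2014-01": 40, "2014-02": 42, "2014-03": 38,
    "2014-04": 41, "2014-05": 39, "2014-06": 95,
}


def flag_deviations(series, threshold=2.0):
    """Return the periods whose count is far from the overall mean."""
    values = list(series.values())
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return []  # A flat series has no deviations to report.
    return [
        period for period, value in series.items()
        if abs(value - mu) / sigma > threshold
    ]


print(flag_deviations(MONTHLY_MENTIONS))
```

A flagged month is only a prompt for investigation; reading the underlying documents for that period is what supplies the why.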
8. Key Benefits of Content Analytics
Content Analytics complements business intelligence to provide a more detailed
and accurate understanding of market and customer needs
• Identify meaning, trends, patterns, preferences, and tastes from text for
better business decision making
• Understand customers on a granular level, primarily through semantic and
sentiment analysis
• Extract more value from your social media community by building a richer
profile of each person in the customer database
• Quickly identify trends amongst the customer base by filtering and giving
structure to the data
• Reuse and curate content by analysing material from partner organisations
and external sources that is pertinent to the target market
• Customer-centric marketing: because content analytics can determine the
interests of individual customers and prospects, the most relevant content can
be customized for each person and personalised propositions can be delivered
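Sentiment analysis and per-customer profiling can be sketched with a simple word-list approach. This is a deliberately minimal illustration: the `POSITIVE` and `NEGATIVE` word sets, the sample feedback, and the function names are invented, and real solutions use far richer linguistic models than counting words.

```python
import re
from collections import defaultdict

# Tiny hypothetical sentiment lexicon (an assumption for this sketch).
POSITIVE = {"great", "love", "fast", "helpful"}
NEGATIVE = {"slow", "hate", "confusing", "broken"}

# Invented customer feedback as (customer, text) pairs.
FEEDBACK = [
    ("alice", "Love the mobile app, support was helpful"),
    ("bob", "Transfers are slow and the menu is confusing"),
    ("alice", "Login is broken again"),
]


def sentiment_score(text):
    """Positive word count minus negative word count."""
    words = re.findall(r"[a-z']+", text.lower())
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)


def profile_customers(feedback):
    """Aggregate sentiment per customer to build a simple profile."""
    totals = defaultdict(int)
    for customer, text in feedback:
        totals[customer] += sentiment_score(text)
    return dict(totals)


profiles = profile_customers(FEEDBACK)
print(profiles)
```

Aggregating by customer rather than by message is what turns individual comments into the richer per-person profile mentioned above.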
9. Key Challenges of Content Analytics
Beyond Volume, Variety and Velocity is the Issue of Big Data Veracity
Volume
• 1 trillion connected devices generate 2.5 quintillion bytes of data per day
• 12 terabytes of Tweets created daily
Velocity
• 90% of the world’s data was created in the last two years
• 5 million trade events per second
Variety
• Structured, unstructured, multimedia, text; varied content creation
• 80% of the world’s data today is unstructured
Veracity
• With big data there is a tendency for errors to snowball, e.g. user entry
errors, redundancy and corruption all add uncertainty and ambiguity to the
quality of data
10. Content Analytics is used in many verticals and for various applications solving
varied business needs
Industry Solutions and Usage of Content Analytics Solutions*
Financial Services: market intelligence, case management, compliance, risk scoring
Healthcare and Life Sciences: scientific discovery, bio-surveillance, clinical trials
Media and Advertising: digital asset management, content mining, contextual advertising
Education and Govt.: security, intelligence, digital library services, e-learning
Examples of Business Problems that can be addressed
“What features of our Banking Services are most liked/hated by our customers?”
“What caused this recent drop in sales for Product X?”
“Give me a media profile of Mr. X including Trends, Quotes, Roles, Contacts etc.”
“Which regulatory causes and sentences from the past have hindered the objective of
universal education?”
Note: *This is just a representative list to showcase the capabilities of content analytics and not exhaustive
12. Industry Overview
› Content Analytics solutions are usually evolutionary products
of Enterprise Content Management Solutions providers. These
solutions enable the management of business information
throughout the content lifecycle, from creation to disposition.
As a technical architecture, ECM consists of a platform or a set
of applications that interoperate but that can be sold and
used separately.
› The Content Analytics and ECM market will grow from $5.1 billion
in 2013 to over $9.3 billion in 2017, at a CAGR of 16% over the
period.
› Leading providers of content analytics solutions are IBM,
Open Text, EMC, Perceptive Software, Hyland, Microsoft and
Oracle. Several other new entrants such as Xerox, Alfresco and
Newgen Software have also developed solutions which are
rated highly by industry experts and labeled as visionaries by
IT research firms such as Gartner.
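The market growth figures above can be checked with the standard compound annual growth rate formula, CAGR = (end / start)^(1 / years) - 1. The sketch below applies it to the $5.1 billion and $9.3 billion figures over the four years from 2013 to 2017; the function names are illustrative. The implied rate comes out at roughly 16%, in line with the stated figure.

```python
def cagr(start_value, end_value, years):
    """Compound annual growth rate between two values."""
    return (end_value / start_value) ** (1 / years) - 1


def project(start_value, rate, years):
    """Value after compounding start_value at the given annual rate."""
    return start_value * (1 + rate) ** years


YEARS = 2017 - 2013  # four compounding periods

implied_rate = cagr(5.1, 9.3, YEARS)       # rate implied by the two endpoints
projected_2017 = project(5.1, 0.16, YEARS)  # 2013 value grown at the stated 16%

print(f"implied CAGR: {implied_rate:.1%}")
print(f"2017 value at 16%: ${projected_2017:.2f} billion")
```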
13. Key Players
• The Content Analytics market includes key players that provide purpose-built and
job-aligned offerings, including case management, composite content applications
and customer communications management. Key assessments of leading players
in the Content Analytics market are detailed below.
IBM
Strengths
• Wide variety of content management and related capabilities, from content
ingestion to archiving
• Deep analytics and business intelligence tools
Weaknesses
• IBM's greatest strength also poses its greatest challenge: the breadth of its
products may make it hard for customers to understand where to start or how to
extend their current offerings
Open Text
Strengths
• Open Text's relationship with SAP provides a firm foundation for expansion
and has enabled it to command a strong position in markets where SAP is strong
Weaknesses
• Complicated architecture
• High pricing
• Poor after-sales support
EMC
Strengths
• Extensive content management stack that includes most ECM elements
• Customized industry solutions, specifically for the healthcare, life sciences,
energy and engineering sectors
Weaknesses
• Only a limited and tactical solution in applicability
Perceptive Software
Strengths
• Strong product and solution capabilities
• Deep focus on vertical markets, with specialized solutions for the healthcare
and higher education sectors
Weaknesses
• Increasing fragmentation of its product architecture and a lack of clarity
about its road map
• Lack of interoperability
Hyland
Strengths
• Long and extensive experience in developing content-enabled applications
• Solution capability for mobile and cloud deployment
Weaknesses
• Limited global footprint, with 85% of sales coming from NA
• Limited capabilities to manage sophisticated digital asset management
requirements
14. Major Trends in Content Analytics
Trends
• Increased focus on social media text analytics, as social media is creating
huge amounts of unstructured data.
• Large-scale changes in system architecture as new data-centric models and
solutions emerge. Large data will live in persistent memory, and many
CPUs/clients will use a shallow hierarchy.
• Significant benefits from Content Analytics are likely to continue for at
least 5-10 more years before it reaches the “Plateau of Productivity”.
Implications
The outlook for growth in the Content Analytics space remains bright as
businesses continue to seek these solutions to enhance their operational
efficiency and better understand their current and prospective customers.
17. Analyzing Unstructured Content – Text Analytics
Answering complex natural language questions requires more than keyword evidence
This evidence suggests “Gary” is the answer, BUT the system must learn that
keyword matching may be weak relative to other types of evidence
18. Analyzing Unstructured Content – Content Analytics
CA approach leverages multiple algorithms to draw patterns and identify insights
Stronger evidence can be much harder to find and score … and the evidence is
still not 100% certain
• Search far and wide
• Explore many hypotheses
• Find and judge evidence
• Many inference algorithms