Dark Data
Alex Pongpech
• Even bigger than big
And so The dark
Those who
lead the fight
Turning Dark
• Useful data may become dark data once it is no longer relevant because it was
not processed fast enough. This is called "perishable insights" in "live flowing
data".
• Examples: geolocation of a customer, fraud detection.
• According to IBM, about 60 percent of data loses its value immediately.
• IBM estimates that roughly 90 percent of data generated by sensors and
analog-to-digital conversions never gets used.
• Not analysing data immediately and letting it go 'dark' can lead to
significant losses.
• Not only must the data be processed fast enough; the organization must also
act on it quickly enough.
Turning DARK
• Organizations retain dark data for a multitude
of reasons, and it is estimated that most
companies are only analyzing 1% of their data.
• A lot of dark data is unstructured, which means
that the information is in formats that may be
difficult to categorise, to read by computer,
and thus to analyse.
• Often the reason that businesses do not analyse
their dark data is the amount of
resources it would take and the difficulty of
having that data analysed.
• Because storage is inexpensive, storing data is
easy. However, storing and securing the data
usually entails greater expenses (or even risk)
than the potential return profit.
Why is dark data handled the way it is?
• It is surprising, because at the time of data collection companies
assume that the data is going to provide value. Companies invest a lot
in data collection, both monetarily and otherwise, so the data should be
considered important. Here are a few reasons why there is so much
dark data:
Why is dark data handled the way it is?
1. Lopsided priorities: some data gets attention while other data is
ignored, for example data on how the customer arrived at the
application page.
2. Disconnect among departments: data held by one department may not be
known to other departments ("This is the way we do it here").
3. Technology and tool constraints: if data collection is done by
separate technologies and tools in the same organization, the data may
be difficult to integrate, for example audio file contents from a call
center with click data from websites.
Shed some light on the DARK
• Gartner defines dark data as the information assets organizations
collect, process and store during regular business activities, but
generally fail to use for other purposes (for example, analytics,
business relationships and direct monetizing).
• In an industrial context, dark data can include information gathered by
sensors and telematics.
• Similar to dark matter in physics, dark data often comprises most
organizations’ universe of information assets.
• Thus, organizations often retain dark data for compliance purposes only.
Storing and securing data typically incurs more expense (and sometimes
greater risk) than value.
Dark Data Example:
IP Location
• A manufacturer of soft drinks that runs a popular website might
think that, of all the data it holds, only the data directly
relevant to the marketing and sales of its soft drink products
has any value. It also stores much other data, such as the IP
location of its users, but fails to see how this "dark" data
could also be valuable to the company.
• Yet suppose the data, properly cleansed and then analysed,
reveals that 7% of the website's users are accessing it from
outside the country where the company is located, even though
the product is only sold directly to retailers within that
country. That finding is valuable in itself, for instance to
those who target ads at soft drink consumers.
• This dark data could also be seen as an opportunity to think
about marketing the product elsewhere. For instance, if 40% of
users from outside the company's home country access the site
from India, according to the IP location data, while only 4%
come from the European Union, it would strongly suggest that a
marketing campaign within Europe would have considerably less
chance of success than one aimed at the Indian subcontinent.
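The arithmetic behind this example can be sketched in a few lines of Python. This is only an illustration: it assumes the IP addresses have already been resolved to country codes by some geolocation step, and the visit list, the home country "TH" and the function name visitor_shares are all made up for the sketch.

```python
from collections import Counter


def visitor_shares(countries, home_country):
    """Return (share of foreign visits, per-country share among foreign visits)."""
    total = len(countries)
    if total == 0:
        return 0.0, {}
    foreign = [c for c in countries if c != home_country]
    foreign_share = len(foreign) / total
    counts = Counter(foreign)
    shares = {c: n / len(foreign) for c, n in counts.items()}
    return foreign_share, shares


# Hypothetical, already-resolved country codes for ten site visits.
visits = ["TH"] * 6 + ["IN", "IN", "DE", "US"]

foreign_share, by_country = visitor_shares(visits, home_country="TH")
print(f"{foreign_share:.0%} of visits come from abroad")
for country, share in sorted(by_country.items(), key=lambda kv: -kv[1]):
    print(f"{country}: {share:.0%} of foreign visits")
```

With real traffic the same two numbers (share of foreign users, and the breakdown of that share by country) are what would feed the marketing decision described above.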
Other Dark Data Examples:
type of device
• Other typical examples of dark data, which
most websites store but fail to exploit,
include the type of device used to access
the Internet (typically a smartphone,
tablet or computer) and the web browser
used (e.g. Chrome, Mozilla Firefox, Opera,
Edge or IE, among others). Even more
obscure or dark is information such as the
number of times users reset their
password, which would be useful to a
company that specializes in Internet and
password security.
Other Dark Data Examples:
Customer Feedback
• A well-known example of dark data that
goes to waste is where companies have a
feedback form which allows users to give
feedback concerning their website or
service, but do not have the data
structures in place to allow this data
to be easily analysed. The result is a failure
to take on board and act on their users'
judgments and criticisms, whether positive
or negative (both of which have value).
How does data go DARK
Customer Feedback
Customer Information Systems
More dark data examples
• Customer Information
• Log Files
• Account Information
• Previous Employee Data
• Financial Statements
• Raw Survey Data
• Email Correspondences
• Notes or Presentations
• Old Versions of Relevant Documents
The dark bites
• "Maybe we can use all this data later?" This explains why many
organizations are reluctant to part with dark data, even if they have
no plans to put it to work on their behalf, either in the near term or
further down the planning horizon.
• The dark can bite: organizations must also be aware that the dark
data they possess can pose risks to their continued business health
and well-being.
• Perhaps more chillingly, so can the dark data about them, their
customers and their operations that is stored in the cloud, outside
their immediate control and management.
Problems from the dark
• Data that is stored but not used costs money (the NYT says 90% of the
energy used by data centers is wasted).
• Stored data costs money: according to Datamation, by 2020 unused
but stored data can add up to $891 billion.
• The more data is stored but not used, the higher the risk, especially to
privacy.
The risks
1. Legal and regulatory risk. If data covered by mandate or regulation
– such as confidential, financial information (credit card or other
account data) or patient records – appears anywhere in dark data
collections, its exposure could involve legal and financial liability.
2. Intelligence risk. If dark data encompasses proprietary or sensitive
information reflective of business operations, practices, competitive
advantages, important partnerships and joint ventures, and so
forth, inadvertent disclosure could adversely affect the bottom line
or compromise important business activities and relationships.
The risks
3. Reputation risk. Any kind of data breach reflects badly on the
organizations affected thereby. This applies as much to dark data
(especially in light of other risks) as to other kinds of breaches.
4. Opportunity costs. Given that the organization has decided not to invest
in analysis and mining of dark data by definition, concerted efforts by
third parties to exploit its value represent potential losses of intelligence
and value based upon its contents.
5. Open-ended exposure. By definition, dark data contains information
that's either too difficult or costly to extract to be mined, or that contains
unknown (and therefore unevaluated) sources of intelligence and
exposure to loss or harm. Dark data's secrets may be very dark and
damaging indeed, but one has no way of knowing for sure.
Mitigating Risks Posed by Dark Data
1. Know where the dark is: ongoing inventory and assessment.
2. Turn dark to light: drive ongoing research into new tools and
technologies.
3. Understand where dark data resides, how it's stored, how it's
protected and what kinds of access controls help maintain its security.
4. No man's land: ubiquitous encryption. No dark data should be readily
accessible to casual inspection, under any circumstances.
5. Don't stay in the dark too long: retention policies and safe disposal.
6. Audit dark data for security purposes.
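The inventory and retention steps above can start very small. The following Python sketch walks a directory tree and reports files that have not been modified within a retention window. The 365-day window, the function name find_stale_files and the two demo files are assumptions made for illustration; a real policy would also consider access times, data classification and legal holds.

```python
import os
import tempfile
import time
from pathlib import Path

RETENTION_DAYS = 365  # hypothetical policy value, not taken from the slides


def find_stale_files(root, max_age_days, now=None):
    """List (path, size_bytes, age_days) for files older than max_age_days."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    stale = []
    for path in Path(root).rglob("*"):
        if path.is_file():
            stat = path.stat()
            if stat.st_mtime < cutoff:
                age_days = (now - stat.st_mtime) / 86400
                stale.append((str(path), stat.st_size, age_days))
    # Oldest files first, so review starts with the darkest corners.
    return sorted(stale, key=lambda row: row[2], reverse=True)


# Demo on a throwaway directory: one fresh file, one backdated by two years.
with tempfile.TemporaryDirectory() as tmp:
    fresh = Path(tmp, "fresh.log")
    fresh.write_text("recent")
    old = Path(tmp, "old.log")
    old.write_text("forgotten")
    two_years_ago = time.time() - 2 * 365 * 86400
    os.utime(old, (two_years_ago, two_years_ago))

    report = find_stale_files(tmp, RETENTION_DAYS)
    for path, size, age in report:
        print(f"{path}: {size} bytes, about {age:.0f} days old")
```

A report like this only locates candidates. Whether a file is disposed of, encrypted or brought into analysis remains a policy decision.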
What are some other major areas in which dark
data is being underutilized besides underutilized
customer information?
• Education and Healthcare.
• The potential to serve students and patients in the manner in which
consumer and financial services pursue their target populations is
huge.
• So much paperwork is involved in both education and healthcare that
the data is there, and in the age of government incentives for
electronic health records, much of it in the healthcare space is now
digital.
• However, it needs to be mined and analyzed in order to lead to
opportunities that effect the change which usually results from the
strategic use of personal and behavioral data.
What kind of businesses can really benefit
from dark data extraction and processing?
• Any business that sells a product, service or idea (anyone who has
customers) can benefit.
• How many times a user resets their password
• IP address when a user logs into your website/app
• Last email communication date to your customers
• Mobile handset type, or web browser version
• Free text feedback on a hotel stay or recent flight
• Additional passengers or guest names on a ticket or hotel room
• These data points or features are often overlooked by marketing
teams as not serving any useful purpose, as there is a perception that
this type of information is only collected for compliance, fraud or
regulatory requirements.
How old is too old when it comes to dark
data?
• Nothing is ever too old unless it is too old
• That said, if you’re analyzing, say, customer sentiment in social media,
you simply won’t have relevant data that predates the advent of
social channels. So in that case, dark data from before those channels
existed could be considered “too old.”
How can you turn dark data into active,
revenue generating data?
• This is where data science, marketing, and business intelligence need to get their
heads together to find new ways of activating dark data to provide new
opportunities for the organization. While dark data can appear dull and
uninteresting on the surface, there are methods to turn it into highly granular,
rich customer insights.
• Here are a few key steps to get you started on the above examples:
• Log-ins to your website or mobile application: what city/country do the IP
addresses belong to? Are you logging each location a user visits and creating a
virtual map of their travels? This is particularly compelling when creating a
360-degree view of your customer.
• Additional passengers/guest names on a reservation. Not only does this give
insight into the homophily of the user and fuel your social network graph of which
users are centrally connected and influential, but it also provides rich insight into
their family and workplace. Link this data with social graphing, and you’ll quickly
obtain age, gender, and behavioral traits.
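As a rough sketch of the reservation idea above, the Python below links people who appear on the same booking and counts each person's distinct co-travellers, a simple degree measure of who is "centrally connected". The bookings, the names and the function co_travel_graph are invented for illustration; real data would need identity matching and appropriate consent.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical bookings: each inner list holds the names on one reservation.
bookings = [
    ["ann", "ben"],
    ["ann", "cal", "dee"],
    ["ben", "ann"],
    ["eve"],
]


def co_travel_graph(bookings):
    """Link every pair of people who share at least one reservation."""
    graph = defaultdict(set)
    for names in bookings:
        for a, b in combinations(sorted(set(names)), 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph


graph = co_travel_graph(bookings)
# Degree = number of distinct co-travellers; solo travellers do not appear.
degree = {name: len(neighbors) for name, neighbors in graph.items()}
most_connected = max(degree, key=degree.get)
print(most_connected, degree[most_connected])
```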
How can you turn dark data into active,
revenue generating data?
• Mobile phone data. This simple piece of data will illuminate an array of
new product and marketing opportunities, and provide an additional
segmentation layer to improve marketing effectiveness. From mobile
phone data, it’s possible to know which telco partners you should bring on
board (which will activate even MORE opportunities), where your users
are in the world in real time, whether they have recently purchased
tickets with another airline, and more.
• Free text input, such as feedback, can be passed through cognitive text
analysis tools to determine if the general sentiment of the feedback is
positive or negative. Linking the user profile to your internal database can
also determine if this user is sending mixed messages on social media
compared to surveys and feedback forms. THINK AIRLINE
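To make the free-text step concrete, here is a deliberately tiny Python sketch that labels feedback as positive, negative or neutral by counting words from two hand-made lists. The word lists and sample comments are invented, and a word count is only a stand-in for the cognitive text analysis tools mentioned above, which handle negation, context and sarcasm far better.

```python
import re

# Toy lexicons, assumed for illustration only.
POSITIVE = {"great", "friendly", "comfortable", "clean", "excellent"}
NEGATIVE = {"late", "rude", "dirty", "lost", "delayed"}


def sentiment(text):
    """Label text by comparing counts of positive and negative words."""
    words = re.findall(r"[a-z']+", text.lower())
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


# Hypothetical airline feedback comments.
feedback = [
    "Friendly crew and a comfortable seat",
    "Flight was delayed and my bag was lost",
    "Boarding started at gate 12",
]
labels = [sentiment(text) for text in feedback]
for text, label in zip(feedback, labels):
    print(f"{label:>8}: {text}")
```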
Four Ways to Use Dark Data
1. Networking machine data. As noted above, servers, firewalls, network
monitoring tools and other parts of your environment generate large
amounts of machine data related to network operations. Avoid dark
networking data by using this information to analyze network security, as
well as to monitor network activity patterns to ensure that your network
infrastructure is never under- or over-utilized.
2. Customer support logs. Most businesses maintain records of customer-
support interactions that include information such as when a customer
contacted the business, which type of communication channel was used,
how long the engagement lasted and so on. Don’t make the mistake of
leaving this data in the dark, or using it only when you need to research a
customer issue. Instead, build it into your analytics workflows by
leveraging it to help understand when your customers are most likely to
contact you, what their preferred methods of contact are and so on.
Four Ways to Use Dark Data
3. “Legacy” system logs. If you have mainframes or other older types of
systems running in your environment, you may think that there is no way
to use modern analytics tools to understand them. But you can. By
offloading system logs and other data from these systems into an
analytics platform like Hadoop, you can make sure you are not leaving
this “legacy” data in the dark.
4. Non-textual data. Most data analytics workflows are built around textual
data, which is easier to ingest. You can also make use of video, audio or
other non-textual files, however. You can analyze the metadata
associated with them, or, if appropriate, translate speech to text in order
to gain more insight into the content of the data itself. The effort
required in this regard may not be worth it in all cases, but the bigger
point worth keeping in mind is that your non-textual data doesn’t have
to be dark data. There are ways to make it actionable if you need it to be.
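For the customer support logs in item 2, the questions "when do customers contact us" and "through which channel" reduce to simple counting once the records are loaded. The Python sketch below assumes a hypothetical log of (timestamp, channel, duration) tuples; the records and the function name summarize are made up for the example.

```python
from collections import Counter
from datetime import datetime

# Hypothetical log records: (ISO timestamp, channel, duration in minutes).
support_log = [
    ("2017-05-01T09:15:00", "phone", 12),
    ("2017-05-01T09:40:00", "chat", 5),
    ("2017-05-01T14:05:00", "email", 3),
    ("2017-05-02T09:55:00", "phone", 20),
    ("2017-05-02T16:30:00", "chat", 7),
]


def summarize(log):
    """Count contacts per hour of day and per channel, plus mean duration."""
    by_hour = Counter(datetime.fromisoformat(ts).hour for ts, _, _ in log)
    by_channel = Counter(channel for _, channel, _ in log)
    avg_minutes = sum(minutes for _, _, minutes in log) / len(log)
    return by_hour, by_channel, avg_minutes


by_hour, by_channel, avg_minutes = summarize(support_log)
busiest_hour, contacts = by_hour.most_common(1)[0]
print(f"Busiest hour: {busiest_hour}:00 with {contacts} contacts")
print(f"Contacts per channel: {dict(by_channel)}")
print(f"Average engagement: {avg_minutes:.1f} minutes")
```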
LET THERE BE LIGHT: Dark Data Analytics
• Dark analytics efforts typically focus on three dimensions:
1. Untapped data already in your possession
2. Nontraditional unstructured data
3. Data in the deep web
• To be clear, the purpose of dark analytics is not to catalog vast volumes of
unstructured data. Casting a broader data net without a specific purpose in
mind will likely lead to failure. Indeed, dark analytics efforts that are
surgically precise in both intent and scope often deliver the greatest value.
Like every analytics journey, successful efforts begin with a series of
specific questions. What problem are you solving? What would we do
differently if we could solve that problem? Finally, what data sources and
analytics capabilities will help us answer the first two questions?
DeepDive
• http://deepdive.stanford.edu/quickstart
• DeepDive is a system to extract value from dark data. Like dark matter, dark data
is the great mass of data buried in text, tables, figures, and images, which lacks
structure and so is essentially unprocessable by existing software.
• DeepDive helps bring dark data to light by creating structured data (SQL tables)
from unstructured information (text documents) and integrating such data with
an existing structured database.
• DeepDive is used to extract sophisticated relationships between entities and
make inferences about facts involving those entities.
• DeepDive helps one process a wide variety of dark data and put the results into a
database. With the data in a database, one can use a variety of standard tools
that consume structured data; e.g., visualization tools like Tableau or analytics
tools like Excel.
• http://deepdive.stanford.edu/showcase/apps
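The general idea of turning text into SQL tables can be illustrated without DeepDive itself. The Python sketch below is not DeepDive's API and does none of its statistical inference; it swaps in a single hand-written regular expression and the standard sqlite3 module. The sample sentences, the "works at" pattern and the employment table are all assumptions made for the example.

```python
import re
import sqlite3

# Hypothetical unstructured documents.
documents = [
    "Alice Smith works at Acme. She joined in 2015.",
    "After a long search, Bob Jones works at Globex.",
    "No employment facts in this sentence.",
]

# Naive pattern: "<First Last> works at <Company>". Real extraction systems
# use learned models rather than one fixed rule.
PATTERN = re.compile(r"([A-Z][a-z]+ [A-Z][a-z]+) works at ([A-Z][a-z]+)")

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employment (person TEXT, company TEXT, doc_id INTEGER)")
for doc_id, text in enumerate(documents):
    for person, company in PATTERN.findall(text):
        conn.execute(
            "INSERT INTO employment VALUES (?, ?, ?)", (person, company, doc_id)
        )
conn.commit()

# Once the facts are rows, ordinary SQL and BI tools can consume them.
rows = conn.execute(
    "SELECT person, company FROM employment ORDER BY person"
).fetchall()
for person, company in rows:
    print(person, "->", company)
```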
Lessons from the front lines
• IU HEALTH’S RX FOR MINING DARK DATA
• Retailers make it personal
• Oil Company
IU HEALTH’S RX FOR MINING DARK DATA
• As part of a new model of care, Indiana
University Health (IU Health) is exploring
ways to use nontraditional and unstructured
data to personalize health care for
individual patients and improve overall
health outcomes for the broader
population.
• Traditional relationships between medical
care providers and patients are often
transactional in nature, focusing on
individual visits and specific outcomes
rather than providing holistic care services
on an ongoing basis. IU Health has
determined that incorporating insights from
additional data will help build patient
loyalty and provide more useful, seamless,
and cost-efficient care.
IU HEALTH’S RX FOR MINING DARK DATA
• “IU Health needs a 360-degree understanding of the
patients it serves in order to create the kind of care and
services that will keep them in the system.”
• For example, consider the voluminous free-form notes—
both written and verbal—that physicians generate
during patient consultations.
• Deploying voice recognition, deep learning, and text
analysis capabilities to these in-hand but previously
underutilized sources could potentially add more depth
and detail to patient medical records.
• These same capabilities might also be used to analyze
audio recordings of patient conversations with IU Health
call centers to further enhance a patient’s records. Such
insights could help IU Health develop a more thorough
understanding of the patient’s needs, and better
illuminate how those patients utilize the health system’s
services.
IU HEALTH’S RX FOR MINING DARK DATA
• Another opportunity involves using dark data to help predict need and manage care
across populations. IU Health is examining how cognitive computing, external data, and
patient data could help identify patterns of illness, health care access, and historical
outcomes in local populations. The approaches could make it possible to incorporate
socioeconomic factors that may affect patients’ engagement with health care providers.
• “There may be a correlation between high density per living unit and disengagement
from health,” says Mark Lantzy, senior vice president and chief information officer, IU
Health. “It is promising that we can augment patient data with external data to
determine how to better engage with people about their health. We are creating the
underlying platform to uncover those correlations and are trying to create something
more systemic.”
• “The destination for our journey is an improved patient experience,” he continues.
“Ultimately, we want it to drive better satisfaction and engagement. More than deliver
great health care to individual patients, we want to improve population health
throughout Indiana as well. To be able to impact that in some way, even incrementally,
would be hugely beneficial.”
Retailers make it personal
• Retailers almost universally recognize that digital has reshaped customer
behavior and shopping. In fact, $0.56 of every dollar spent in a store is
influenced by a digital interaction.
• Yet many retailers—particularly those with brick-and-mortar operations—
still struggle to deliver the digital experiences customers expect. Some
focus excessively on their competitors instead of their customers and rely
on the same old key performance indicators and data.
• In recent years, however, growing numbers of retailers have begun
exploring different approaches to developing digital experiences. Some are
analyzing previously dark data culled from customers’ digital lives and
using the resulting insights to develop merchandising, marketing, customer
service, and even product development strategies that offer shoppers a
targeted and individualized customer experience.
Retailers make it personal
• Stitch Fix, for example, is an online subscription shopping service that
uses images from social media and other sources to track emerging
fashion trends and evolving customer preferences.
• Its process begins with clients answering a detailed questionnaire
about their tastes in clothing. Then, with client permission, the
company’s team of 60 data scientists augments that information by
scanning images on customers’ Pinterest boards and other social
media sites, analyzing them, and using the resulting insights to
develop a deeper understanding of each customer’s sense of style.
• Company stylists and artificial intelligence algorithms use these
profiles to select style-appropriate items of clothing to be shipped to
individual customers at regular intervals.
Retailers make it personal
• Meanwhile, grocery supermarket chain Kroger Co. is taking a different
approach that leverages Internet of Things and advanced analytics
techniques. As part of a pilot program, the company is embedding a
network of sensors and analytics into store shelves that can interact
with the Kroger app and a digital shopping list on a customer’s phone.
• As the customer strolls down each aisle, the system—which contains
a digital history of the customer’s purchases and product
preferences—can spotlight specially priced products the customer
may want on 4-inch displays mounted in the aisles. This pilot, which
began in late 2016 with initial testing in 14 stores, is expected to
expand in 2017.
GREG POWERS, VICE PRESIDENT OF TECHNOLOGY,
HALLIBURTON
• Yet the sheer volume of information that we can and do collect goes way
beyond human cognitive bandwidth. Advances in sensor science are
delivering enormous troves of both dark data and what I think of as really
dark data.
• For example, we scan rocks electromagnetically to determine their
consistency. We use nuclear magnetic resonance to perform what amounts
to an MRI on oil wells. Neutron and gamma-ray analysis measures the
electrical permittivity and conductivity of rock. Downhole spectroscopy
measures fluids. Acoustic sensors collect 1–2 terabytes of data daily.
• All of this dark data helps us better understand in-well performance. In
fact, there’s so much potential value buried in this darkness that I flip the
frame and refer to it as “bright data” that we have yet to tap.
GREG POWERS, VICE PRESIDENT OF TECHNOLOGY,
HALLIBURTON
• In the next phase of Halliburton’s ongoing analytics program, we want to develop
the capacity to capture, mine, and use bright data insights to become more
predictive.
• Given the nature of our operations, this will be no small task. Identical events
driven by common circumstances are rare in the oil and gas industry. We have 30
years of retrospective data, but there are an infinite number of combinations of
rock, gas, oil, and other variables that affect outcomes.
• Unfortunately, there is no overarching constituent physics equation that can
describe the right action to take for any situation encountered. Yet, even if we
can’t explain what we’ve seen historically, we can explore what has happened
and let our refined appreciation of historic data serve as a road map to where we
can go.
• In other words, we plan to correlate data to things that statistically seem to
matter and, then, use this data to develop a confidence threshold to inform how
we should approach these issues.
GREG POWERS, VICE PRESIDENT OF TECHNOLOGY,
HALLIBURTON
• We believe that nontraditional data holds the key to creating advanced intelligent
response capabilities to solve problems, potentially without human intervention, before
they happen.
• At the lowest level, we’ll take measurements and tell someone after the fact that
something happened. At the next level, our goal will be to recognize that something has
happened and, then, understand why it happened. The following step will use real-time
monitoring to provide in-the-moment awareness of what is taking place and why. In the
next tier, predictive tools will help us discern what’s likely to happen next. The most
extreme offering will involve automating the response—removing human intervention
from the equation entirely.
• Drilling is complicated work. To make it more autonomous and efficient, and to free
humans from mundane decision making, we need to work smarter. Our industry is facing
a looming generational change. Experienced employees will soon retire and take with
them decades of hard-won expertise and knowledge. We can’t just tell our new hires,
“Hey, go read 300 terabytes of dark data to get up to speed.” We’re going to have to rely
on new approaches for developing, managing, and sharing data-driven wisdom.
Where do you start?
Ask the right questions:
• Rather than attempting to discover and inventory all of the dark data
hidden within and outside your organization, work with business teams to
identify specific questions they want answered. Work to identify potential
dark analytics sources and the untapped opportunities contained therein.
• Then focus your analytics efforts on those data streams and sources that
are particularly relevant.
• For example, if marketing wants to boost sales of sports equipment in a
certain region, analytics teams can focus their efforts on real-time sales
transaction streams, inventory, and product pricing data at select stores
within the target region. They could then supplement this data with
historic unstructured data—in-store video analysis of customer foot traffic,
social sentiment, influencer behavior, or even pictures of displays or
product placement across sites—to generate more nuanced insights.
Look outside your organization:
• You can augment your own data with publicly available demographic,
location, and statistical information. Not only can this help your
analytics teams generate more expansive, detailed reports—it can put
insights in a more useful context.
• For example, a physician makes recommendations to an asthma
patient based on her known health history and a current examination.
By reviewing local weather data, he can also provide short-term
solutions to help her through a flare-up during pollen season. In
another example, employers might analyze data from geospatial
tools, traffic patterns, and employee turnover to determine the
extent to which employee job satisfaction levels are being adversely
impacted by commute times.
Augment data talent:
• Data scientists are an increasingly valuable resource, especially those who
can artfully combine deep modeling and statistical techniques with
industry or function-specific insights and creative problem framing. Going
forward, those with demonstrable expertise in a few areas will likely be in
demand.
• For example, both machine learning and deep learning require
programmatic expertise—the ability to build established patterns to
determine the appropriate combination of data corpus and method to
uncover reasonable, defensible insights. Likewise, visual and graphic design
skills may be increasingly critical given that visually communicating results
and explaining rationales are essential for broad organizational adoption.
• Finally, traditional skills such as master data management and data
architecture will be as valuable as ever—particularly as more companies
begin laying the foundations they’ll need to meet the diverse, expansive,
and exploding data needs of tomorrow.
Explore advanced visualization tools:
• Not everyone in your organization will be able to digest a printout of
advanced Bayesian statistics and apply them to business practices.
• Most people need to understand the “so what” and the “why” of complex
analytical insights before they can turn insight into action. In many
situations, information can be more easily digested when presented as an
infographic, a dashboard, or another type of visual representation.
• Visual and design software packages can do more than generate eye-
catching graphics such as bubble charts, word clouds, and heat maps—they
can boost business intelligence by repackaging big data into smaller, more
meaningful chunks, delivering value to users much faster. Additionally, the
insights (and the tools) can be made accessible across the enterprise,
beyond the IT department, and to business users at all levels, to create
more agile, cross-functional teams.
View it as a business-driven effort:
• It’s time to recognize analytics as an overall business strategy rather than
as an IT function. To that end, work with C-suite colleagues to garner
support for your dark analytics approach.
• Many CEOs are making data a cornerstone of overall business strategy,
which mandates more sophisticated techniques and accountability for
more deliberate handling of the underlying assets.
• By understanding your organization’s agenda and goals, you can determine
the value that must be delivered, define the questions that should be
asked, and decide how to harness available data to generate answers.
• Data analytics then becomes an insight-driven advantage in the
marketplace. The best way to help ensure buy-in is to first pilot a project
that will demonstrate the tangible ROI that can be realized by the
organization with a businesswide analytics strategy.
Think broadly:
• As you develop new capabilities and strategies, think about how you
can extend them across the organization as well as to customers,
vendors, and business partners. Your new data strategy becomes part
of your reference architecture that others can use.
Thank you
and
May the light shine on you
References
1. http://www.kdnuggets.com/2015/11/importance-dark-data-big-data-world.html
2. http://www.kdnuggets.com/2015/01/shining-light-on-dark-data.html
3. http://www.kdnuggets.com/2016/03/rise-dark-data-how-harnessed.html
4. http://www.kdnuggets.com/solutions/fraud-detection.html
5. https://en.wikipedia.org/wiki/Operational_database
6. http://blog.syncsort.com/2017/05/big-data/4-dark-data-examples-use-cases/
7. Tracie Kambies, Paul Roma, Nitin Mittal, Sandeep Kumar Sharma,
https://dupress.deloitte.com/dup-us-en/focus/tech-trends/2017/dark-data-analyzing-unstructured-data.html
Alex all about data

More Related Content

What's hot

Data-Ed Webinar: Demystifying Big Data
Data-Ed Webinar: Demystifying Big Data Data-Ed Webinar: Demystifying Big Data
Data-Ed Webinar: Demystifying Big Data DATAVERSITY
 
Thwart Fraud Using Graph-Enhanced Machine Learning and AI
Dark data by Worapol Alex Pongpech

  • 2. Dark Data Alex Pongpech • Even bigger than big
  • 3. And so The dark
  • 11. Turning Dark • Useful data may become dark data once it becomes irrelevant because it was not processed fast enough. This is called "perishable insights" in "live flowing data". • Examples: the geolocation of a customer, fraud detection. • According to IBM, about 60 percent of data loses its value immediately. • IBM estimates that roughly 90 percent of data generated by sensors and analog-to-digital conversions never gets used. • Not analysing data immediately and letting it go 'dark' can lead to significant losses. • Data must not only be processed fast enough; it must also be acted on quickly enough.
  • 12. Turning DARK • Organizations retain dark data for a multitude of reasons, and it is estimated that most companies analyse only 1% of their data. • A lot of dark data is unstructured, which means that the information is in formats that may be difficult to categorise, to read by computer, and thus to analyse. • Businesses often do not analyse their dark data because of the amount of resources it would take and the difficulty of having that data analysed. • Because storage is inexpensive, storing data is easy. However, storing and securing the data usually entails greater expense (or even risk) than the potential return.
  • 13. Why is dark data handled the way it is? • It is surprising, because at the time of data collection companies assume that the data is going to provide value. Companies invest a lot in data collection, both monetarily and otherwise, so data should be considered important. Here are a few reasons why there is so much dark data.
  • 14. Why is dark data handled the way it is? 1. Lopsided priorities: a team focuses on some data (for example, customer details and eligibility) and pays no attention to the data on how the customer arrived at the application page. 2. Disconnect among departments: each department's data collection and storage processes may not be known to other departments ("This is the way we do it here"). 3. Technology and tool constraints: if data collection is done by separate technologies and tools in the same organization, it may be difficult to integrate, for example, audio file contents from a call center with click data from websites.
  • 15. Shed some light on the DARK • Gartner defines dark data as the information assets organizations collect, process and store during regular business activities, but generally fail to use for other purposes (for example, analytics, business relationships and direct monetizing). • In an industrial context, dark data can include information gathered by sensors and telematics. • Similar to dark matter in physics, dark data often comprises most organizations’ universe of information assets. • Thus, organizations often retain dark data for compliance purposes only. Storing and securing data typically incurs more expense (and sometimes greater risk) than value.
  • 16. Dark Data Example: IP Location • a manufacturer of soft drinks which runs a popular website might think that, of all the data that they have, only those that are directly relevant to the marketing and sales of their soft drink products have any value for them. While they also store many other data, such as the IP location of their users, they fail to see how these "dark" data can also have value to their company. • Yet if their data, properly cleansed to a high quality and then analysed, reveal that 7% of the users of their website are accessing their service from outside the country where they are located, in spite of the fact that the product is only directly sold to retailers within that country, these are in themselves valuable data, for instance, to those who target ads at users of soft drinks. • These dark data could also be seen as an opportunity to think about marketing their product elsewhere. For instance, if 40% of users from outside the country where the company was located access their site from India, according to the IP location data, while only 4% came from the European Union, it would strongly suggest that a marketing campaign within Europe would have considerably less chance of success than one aimed at the Indian sub-continent.
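The percentages in the IP location example come down to simple counting. A minimal sketch, assuming each visit has already been resolved from an IP address to a country code; the sample data, the home country "TH" and the function name `country_shares` are made up for illustration and are not from the slides:

```python
from collections import Counter


def country_shares(visits, home_country):
    """Return (share of visits from abroad, share of each country among foreign visits)."""
    total = len(visits)
    foreign = [c for c in visits if c != home_country]
    foreign_share = len(foreign) / total if total else 0.0
    counts = Counter(foreign)
    by_country = {c: n / len(foreign) for c, n in counts.items()} if foreign else {}
    return foreign_share, by_country


# Hypothetical sample: 100 visits, 7 of them from outside the home country.
visits = ["TH"] * 93 + ["IN"] * 3 + ["DE"] * 1 + ["US"] * 3
foreign_share, by_country = country_shares(visits, "TH")

print(f"{foreign_share:.0%} of visits come from abroad")
for country, share in sorted(by_country.items(), key=lambda kv: -kv[1]):
    print(f"  {country}: {share:.0%} of foreign visits")
```

In practice the country codes would come from a geolocation lookup on the stored IP addresses, which is outside the scope of this sketch.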
  • 17. Other Dark Data Examples: type of device • Other typical examples of dark data, which most websites store but fail to exploit, include the type of device used to access the Internet (typically a smartphone, tablet or computer); the web browser used (e.g. Chrome, Mozilla, Opera, Edge or IE, among others); and even more obscure or dark information such as the number of times users reset their password, which would be useful to a company that specializes in Internet and password security.
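Device type is usually recoverable from the User-Agent header that web servers already log. The sketch below is a rough heuristic under stated assumptions: the substring rules and the shortened sample strings are illustrative only, and a production system would more likely use a maintained user-agent parsing library.

```python
from collections import Counter


def device_type(user_agent):
    """Classify a User-Agent string as tablet, smartphone or computer (rough heuristic)."""
    ua = user_agent.lower()
    # Check tablets first: iPad user agents also contain "Mobile".
    if "ipad" in ua or "tablet" in ua:
        return "tablet"
    if "mobi" in ua or "iphone" in ua:
        return "smartphone"
    return "computer"


# Shortened, hypothetical User-Agent strings.
user_agents = [
    "Mozilla/5.0 (iPhone; CPU iPhone OS 10_3 like Mac OS X) Mobile/14E277",
    "Mozilla/5.0 (iPad; CPU OS 10_3 like Mac OS X) Mobile/14E277",
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) Chrome/58.0",
]
device_counts = Counter(device_type(ua) for ua in user_agents)
print(device_counts)
```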
  • 18. Other Dark Data Examples: Customer Feedback • A well-known example of dark data going to waste is a company that provides a feedback form for its website or service but lacks the data structures that would allow the responses to be easily analysed. The result is a failure to take on board and act on users' judgments and criticisms, whether positive or negative (both of which have value).
  • 19. How does data go DARK Customer Feedback
  • 22. More dark data examples • Customer Information • Log Files • Account Information • Previous Employee Data • Financial Statements • Raw Survey Data • Email Correspondences • Notes or Presentations • Old Versions of Relevant Documents
  • 23. The dark bites • Maybe we can use all this data later? This also explains why many organizations are reluctant to part with dark data, even if they have no plans to put it to work on their behalf, either in the near term or further down the planning horizon. • The dark can bite: organizations must also be aware that the dark data they possess (or, perhaps more chillingly, the dark data about them, their customers and their operations that's stored in the cloud, outside their immediate control and management) can pose risks to their continued business health and well-being.
  • 24. Problems from the dark • Data that is stored but not used costs money (the NYT says 90% of the energy used by data centers is wasted). • According to Datamation, by 2020 data that is stored but unused can add up to $891 billion in storage and management costs. • The more data is stored but not used, the higher the risk, especially to privacy.
  • 25. The risks 1. Legal and regulatory risk. If data covered by mandate or regulation – such as confidential, financial information (credit card or other account data) or patient records – appears anywhere in dark data collections, its exposure could involve legal and financial liability. 2. Intelligence risk. If dark data encompasses proprietary or sensitive information reflective of business operations, practices, competitive advantages, important partnerships and joint ventures, and so forth, inadvertent disclosure could adversely affect the bottom line or compromise important business activities and relationships.
  • 26. The risks 3. Reputation risk. Any kind of data breach reflects badly on the organizations affected thereby. This applies as much to dark data (especially in light of other risks) as to other kinds of breaches. 4. Opportunity costs. Given that the organization has decided not to invest in analysis and mining of dark data by definition, concerted efforts by third parties to exploit its value represent potential losses of intelligence and value based upon its contents. 5. Open-ended exposure. By definition, dark data contains information that's either too difficult or costly to extract to be mined, or that contains unknown (and therefore unevaluated) sources of intelligence and exposure to loss or harm. Dark data's secrets may be very dark and damaging indeed, but one has no way of knowing for sure.
  • 27. Mitigating Risks Posed by Dark Data 1. Know where the dark is: ongoing inventory and assessment. 2. Turn dark to light: drive ongoing research into new tools and technologies. 3. Understand where dark data resides, how it's stored, how it's protected and what kinds of access controls help maintain its security. 4. No man's land: ubiquitous encryption. No dark data should be readily accessible to casual inspection, under any circumstances. 5. Don’t stay in the dark too long: retention policies and safe disposal. 6. Audit dark data for security purposes.
  • 28. What are some other major areas in which dark data is being underutilized, besides customer information? • Education and Healthcare. • The potential to serve students and patients the way the consumer and financial services industries pursue their target populations is huge. • So much paperwork is involved in both education and academics, so the data is there, and in the age of government incentives for electronic health records, much of it in the healthcare space is now digital. • However, it needs to be mined and analyzed in order to lead to opportunities that effect the change which usually results from the strategic use of personal and behavioral data.
  • 29. What kind of businesses can really benefit from dark data extraction and processing? • Any business that sells a product, service or idea (anyone who has customers) can benefit. Examples of such data: • How many times a user resets their password • IP address when a user logs into your website/app • Last email communication date to your customers • Mobile handset type, or web browser version • Free text feedback on a hotel stay or recent flight • Additional passengers or guest names on a ticket or hotel room • These data points or features are often dismissed by marketing teams as serving no useful purpose, as there is a perception that this type of information is only collected for compliance, fraud or regulatory requirements.
  • 30. How old is too old when it comes to dark data? • Nothing is ever too old unless it is too old • That said, if you’re analyzing, say, customer sentiment in social media, you simply won’t have relevant data that predates the advent of social channels. So in that case, dark data from before those channels existed could be considered “too old.”
  • 31. How can you turn dark data into active, revenue generating data? • This is where data science, marketing, and business intelligence need to get their heads together to find new ways of activating dark data to provide new opportunities for the organization. While dark data can appear dull and uninteresting on the surface, there are methods to turn it into highly granular, rich customer insights. • Here are a few key steps to get you started on the above examples: • Log-ins to your website or mobile application: what city/country do the IP addresses map to? Are you logging each location a user visits and creating a virtual map of their travels? This is particularly compelling when creating a 360-degree view of your customer. • Additional passengers/guest names on a reservation. Not only does this give insight into the homophily of the user and fuel your social network graph of which users are centrally connected and influential, but it also provides rich insight into their family and workplace. Link this data with social graphing, and you’ll quickly obtain age, gender, and behavioral traits.
  • 32. How can you turn dark data into active, revenue generating data? • Mobile phone data. This simple piece of data will illuminate an array of new product and marketing opportunities, and provide an additional segmentation layer to improve marketing effectiveness. From mobile phone data, it’s possible to know which telco partners you should bring on board (which will activate even MORE opportunities), you’ll know where your users are in the world in real time, whether they have recently purchased tickets with another airline, and more. • Free text input, such as feedback, can be passed through cognitive text analysis tools to determine if the general sentiment of the feedback is positive or negative. Linking the user profile to your internal database can also determine if this user is sending mixed messages on social media compared to surveys and feedback forms. THINK AIRLINE
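To make the free-text step concrete, here is a deliberately tiny sketch of sentiment labelling. It is a toy word-list scorer standing in for the "cognitive text analysis tools" mentioned above, not one of them; the word lists, the sample feedback and the function name `sentiment` are assumptions made for illustration.

```python
import re

# Tiny illustrative word lists; a real system would use a trained model
# or a much larger lexicon.
POSITIVE = {"good", "great", "friendly", "comfortable", "excellent", "helpful"}
NEGATIVE = {"bad", "late", "rude", "dirty", "delayed", "lost"}


def sentiment(text):
    """Label text as positive, negative or neutral by counting listed words."""
    words = re.findall(r"[a-z']+", text.lower())
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


# Hypothetical airline feedback.
feedback = [
    "Friendly crew and a comfortable seat",
    "Flight delayed and my bag was lost",
    "Boarding was at gate 12",
]
labels = [sentiment(f) for f in feedback]
print(list(zip(feedback, labels)))
```

Word counting like this misses negation and sarcasm, which is one reason dedicated text analysis tools are used for real feedback.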
  • 33. Four Ways to Use Dark Data 1. Networking machine data. As noted above, servers, firewalls, network monitoring tools and other parts of your environment generate large amounts of machine data related to network operations. Avoid dark networking data by using this information to analyze network security, as well as to monitor network activity patterns to ensure that your network infrastructure is never under- or over-utilized. 2. Customer support logs. Most businesses maintain records of customer-support interactions that include information such as when a customer contacted the business, which type of communication channel was used, how long the engagement lasted and so on. Don’t make the mistake of leaving this data in the dark, or using it only when you need to research a customer issue. Instead, build it into your analytics workflows by leveraging it to help understand when your customers are most likely to contact you, what their preferred methods of contact are and so on.
  • 34. Four Ways to Use Dark Data 3. “Legacy” system logs. If you have mainframes or other older types of systems running in your environment, you may think that there is no way to use modern analytics tools to understand them. But you can. By offloading system logs and other data from these systems into an analytics platform like Hadoop, you can make sure you are not leaving this “legacy” data in the dark. 4. Non-textual data. Most data analytics workflows are built around textual data, which is easier to ingest. You can also make use of video, audio or other non-textual files, however. You can analyze the metadata associated with them, or, if appropriate, translate speech to text in order to gain more insight into the content of the data itself. The effort required in this regard may not be worth it in all cases, but the bigger point worth keeping in mind is that your non-textual data doesn’t have to be dark data. There are ways to make it actionable if you need it to be.
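The customer support log idea (item 2 of the four ways above) can be sketched in a few lines: count contacts by hour of day and by channel. The record layout and the sample values are hypothetical; real support systems will export different fields.

```python
from collections import Counter
from datetime import datetime

# Hypothetical support-log records: (ISO timestamp, contact channel).
records = [
    ("2017-05-02T09:15:00", "phone"),
    ("2017-05-02T09:40:00", "email"),
    ("2017-05-02T14:05:00", "chat"),
    ("2017-05-03T09:55:00", "phone"),
]

# When do customers contact us, and through which channel?
by_hour = Counter(datetime.fromisoformat(ts).hour for ts, _ in records)
by_channel = Counter(channel for _, channel in records)

busiest_hour, _ = by_hour.most_common(1)[0]
preferred_channel, _ = by_channel.most_common(1)[0]
print(f"Busiest hour: {busiest_hour}:00, preferred channel: {preferred_channel}")
```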
  • 35. LET THERE BE LIGHT: Dark Data Analytics • Dark analytics efforts typically focus on three dimensions: 1. Untapped data already in your possession 2. Nontraditional unstructured data 3. Data in the deep web • To be clear, the purpose of dark analytics is not to catalog vast volumes of unstructured data. Casting a broader data net without a specific purpose in mind will likely lead to failure. Indeed, dark analytics efforts that are surgically precise in both intent and scope often deliver the greatest value. Like every analytics journey, successful efforts begin with a series of specific questions. What problem are you solving? What would we do differently if we could solve that problem? Finally, what data sources and analytics capabilities will help us answer the first two questions?
  • 36. DeepDive • http://deepdive.stanford.edu/quickstart • DeepDive is a system to extract value from dark data. Like dark matter, dark data is the great mass of data buried in text, tables, figures, and images, which lacks structure and so is essentially unprocessable by existing software. • DeepDive helps bring dark data to light by creating structured data (SQL tables) from unstructured information (text documents) and integrating such data with an existing structured database. • DeepDive is used to extract sophisticated relationships between entities and make inferences about facts involving those entities. • DeepDive helps one process a wide variety of dark data and put the results into a database. With the data in a database, one can use a variety of standard tools that consume structured data; e.g., visualization tools like Tableau or analytics tools like Excel. • http://deepdive.stanford.edu/showcase/apps
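The general idea of turning unstructured text into SQL tables can be illustrated with a toy example. This is not DeepDive and does not use its API; it swaps in a hand-written regular expression and Python's built-in sqlite3 module, and the note format, table name and column names are all assumptions made for the sketch.

```python
import re
import sqlite3

# Hypothetical free-text call center notes.
notes = [
    "Customer Jane Doe called on 2017-03-14 about a refund.",
    "Customer John Smith called on 2017-03-15 about a late delivery.",
]

# Hand-written pattern for this one note format only.
pattern = re.compile(
    r"Customer (?P<name>[A-Z][a-z]+ [A-Z][a-z]+) "
    r"called on (?P<date>\d{4}-\d{2}-\d{2}) about (?P<topic>.+)\."
)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE calls (name TEXT, call_date TEXT, topic TEXT)")
for note in notes:
    m = pattern.search(note)
    if m:  # notes that do not match stay "dark"
        conn.execute(
            "INSERT INTO calls VALUES (?, ?, ?)",
            (m["name"], m["date"], m["topic"]),
        )

# Once structured, the data can be queried with ordinary SQL.
rows = conn.execute(
    "SELECT name, call_date, topic FROM calls ORDER BY call_date"
).fetchall()
print(rows)
```

A fixed pattern only covers text that follows one known template, which is exactly the limitation that systems built for dark data extraction aim to get past.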
  • 37. Lessons from the front lines • IU HEALTH’S RX FOR MINING DARK DATA • Retailers make it personal • Oil Company
  • 38. IU HEALTH’S RX FOR MINING DARK DATA • As part of a new model of care, Indiana University Health (IU Health) is exploring ways to use nontraditional and unstructured data to personalize health care for individual patients and improve overall health outcomes for the broader population. • Traditional relationships between medical care providers and patients are often transactional in nature, focusing on individual visits and specific outcomes rather than providing holistic care services on an ongoing basis. IU Health has determined that incorporating insights from additional data will help build patient loyalty and provide more useful, seamless, and cost-efficient care.
  • 39. IU HEALTH’S RX FOR MINING DARK DATA • “IU Health needs a 360-degree understanding of the patients it serves in order to create the kind of care and services that will keep them in the system.” • For example, consider the voluminous free-form notes (both written and verbal) that physicians generate during patient consultations. • Deploying voice recognition, deep learning, and text analysis capabilities to these in-hand but previously underutilized sources could potentially add more depth and detail to patient medical records. • These same capabilities might also be used to analyze audio recordings of patient conversations with IU Health call centers to further enhance a patient’s records. Such insights could help IU Health develop a more thorough understanding of the patient’s needs, and better illuminate how those patients utilize the health system’s services.
  • 40. IU HEALTH’S RX FOR MINING DARK DATA • Another opportunity involves using dark data to help predict need and manage care across populations. IU Health is examining how cognitive computing, external data, and patient data could help identify patterns of illness, health care access, and historical outcomes in local populations. The approaches could make it possible to incorporate socioeconomic factors that may affect patients’ engagement with health care providers. • “There may be a correlation between high density per living unit and disengagement from health,” says Mark Lantzy, senior vice president and chief information officer, IU Health. “It is promising that we can augment patient data with external data to determine how to better engage with people about their health. We are creating the underlying platform to uncover those correlations and are trying to create something more systemic. • “The destination for our journey is an improved patient experience,” he continues. “Ultimately, we want it to drive better satisfaction and engagement. More than deliver great health care to individual patients, we want to improve population health throughout Indiana as well. To be able to impact that in some way, even incrementally, would be hugely beneficial.”
  • 41. Retailers make it personal • Retailers almost universally recognize that digital has reshaped customer behavior and shopping. In fact, $0.56 of every dollar spent in a store is influenced by a digital interaction. • Yet many retailers, particularly those with brick-and-mortar operations, still struggle to deliver the digital experiences customers expect. Some focus excessively on their competitors instead of their customers and rely on the same old key performance indicators and data. • In recent years, however, growing numbers of retailers have begun exploring different approaches to developing digital experiences. Some are analyzing previously dark data culled from customers’ digital lives and using the resulting insights to develop merchandising, marketing, customer service, and even product development strategies that offer shoppers a targeted and individualized customer experience.
  • 42. Retailers make it personal • Stitch Fix, for example, is an online subscription shopping service that uses images from social media and other sources to track emerging fashion trends and evolving customer preferences. • Its process begins with clients answering a detailed questionnaire about their tastes in clothing. Then, with client permission, the company’s team of 60 data scientists augments that information by scanning images on customers’ Pinterest boards and other social media sites, analyzing them, and using the resulting insights to develop a deeper understanding of each customer’s sense of style. • Company stylists and artificial intelligence algorithms use these profiles to select style-appropriate items of clothing to be shipped to individual customers at regular intervals.
  • 43. Retailers make it personal • Meanwhile, grocery supermarket chain Kroger Co. is taking a different approach that leverages Internet of Things and advanced analytics techniques. As part of a pilot program, the company is embedding a network of sensors and analytics into store shelves that can interact with the Kroger app and a digital shopping list on a customer’s phone. • As the customer strolls down each aisle, the system—which contains a digital history of the customer’s purchases and product preferences—can spotlight specially priced products the customer may want on 4-inch displays mounted in the aisles. This pilot, which began in late 2016 with initial testing in 14 stores, is expected to expand in 2017.
  • 44. GREG POWERS, VICE PRESIDENT OF TECHNOLOGY, HALLIBURTON • Yet the sheer volume of information that we can and do collect goes way beyond human cognitive bandwidth. Advances in sensor science are delivering enormous troves of both dark data and what I think of as really dark data. • For example, we scan rocks electromagnetically to determine their consistency. We use nuclear magnetic resonance to perform what amounts to an MRI on oil wells. Neutron and gamma-ray analysis measures the electrical permittivity and conductivity of rock. Downhole spectroscopy measures fluids. Acoustic sensors collect 1–2 terabytes of data daily. • All of this dark data helps us better understand in-well performance. In fact, there’s so much potential value buried in this darkness that I flip the frame and refer to it as “bright data” that we have yet to tap.
  • 45. GREG POWERS, VICE PRESIDENT OF TECHNOLOGY, HALLIBURTON • In the next phase of Halliburton’s ongoing analytics program, we want to develop the capacity to capture, mine, and use bright data insights to become more predictive. • Given the nature of our operations, this will be no small task. Identical events driven by common circumstances are rare in the oil and gas industry. We have 30 years of retrospective data, but there are an infinite number of combinations of rock, gas, oil, and other variables that affect outcomes. • Unfortunately, there is no overarching constituent physics equation that can describe the right action to take for any situation encountered. Yet, even if we can’t explain what we’ve seen historically, we can explore what has happened and let our refined appreciation of historic data serve as a road map to where we can go. • In other words, we plan to correlate data to things that statistically seem to matter and, then, use this data to develop a confidence threshold to inform how we should approach these issues.
  • 46. GREG POWERS, VICE PRESIDENT OF TECHNOLOGY, HALLIBURTON • We believe that nontraditional data holds the key to creating advanced intelligent response capabilities to solve problems, potentially without human intervention, before they happen. • At the lowest level, we’ll take measurements and tell someone after the fact that something happened. At the next level, our goal will be to recognize that something has happened and, then, understand why it happened. The following step will use real-time monitoring to provide in-the-moment awareness of what is taking place and why. In the next tier, predictive tools will help us discern what’s likely to happen next. The most extreme offering will involve automating the response—removing human intervention from the equation entirely. • Drilling is complicated work. To make it more autonomous and efficient, and to free humans from mundane decision making, we need to work smarter. Our industry is facing a looming generational change. Experienced employees will soon retire and take with them decades of hard-won expertise and knowledge. We can’t just tell our new hires, “Hey, go read 300 terabytes of dark data to get up to speed.” We’re going to have to rely on new approaches for developing, managing, and sharing data-driven wisdom.
  • 47. Where do you start? Ask the right questions: • Rather than attempting to discover and inventory all of the dark data hidden within and outside your organization, work with business teams to identify specific questions they want answered. Work to identify potential dark analytics sources and the untapped opportunities contained therein. • Then focus your analytics efforts on those data streams and sources that are particularly relevant. • For example, if marketing wants to boost sales of sports equipment in a certain region, analytics teams can focus their efforts on real-time sales transaction streams, inventory, and product pricing data at select stores within the target region. They could then supplement this data with historic unstructured data—in-store video analysis of customer foot traffic, social sentiment, influencer behavior, or even pictures of displays or product placement across sites—to generate more nuanced insights.
  • 48. Look outside your organization: • You can augment your own data with publicly available demographic, location, and statistical information. Not only can this help your analytics teams generate more expansive, detailed reports—it can put insights in a more useful context. • For example, a physician makes recommendations to an asthma patient based on her known health history and a current examination. By reviewing local weather data, he can also provide short-term solutions to help her through a flare-up during pollen season. In another example, employers might analyze data from geospatial tools, traffic patterns, and employee turnover to determine the extent to which employee job satisfaction levels are being adversely impacted by commute times.
  • 49. Augment data talent: • Data scientists are an increasingly valuable resource, especially those who can artfully combine deep modeling and statistical techniques with industry or function-specific insights and creative problem framing. Going forward, those with demonstrable expertise in a few areas will likely be in demand. • For example, both machine learning and deep learning require programmatic expertise—the ability to build established patterns to determine the appropriate combination of data corpus and method to uncover reasonable, defensible insights. Likewise, visual and graphic design skills may be increasingly critical given that visually communicating results and explaining rationales are essential for broad organizational adoption. • Finally, traditional skills such as master data management and data architecture will be as valuable as ever—particularly as more companies begin laying the foundations they’ll need to meet the diverse, expansive, and exploding data needs of tomorrow.
  • 50. Explore advanced visualization tools: • Not everyone in your organization will be able to digest a printout of advanced Bayesian statistics and apply them to business practices. • Most people need to understand the “so what” and the “why” of complex analytical insights before they can turn insight into action. In many situations, information can be more easily digested when presented as an infographic, a dashboard, or another type of visual representation. • Visual and design software packages can do more than generate eye-catching graphics such as bubble charts, word clouds, and heat maps; they can boost business intelligence by repackaging big data into smaller, more meaningful chunks, delivering value to users much faster. Additionally, the insights (and the tools) can be made accessible across the enterprise, beyond the IT department, and to business users at all levels, to create more agile, cross-functional teams.
  • 51. View it as a business-driven effort: • It’s time to recognize analytics as an overall business strategy rather than as an IT function. To that end, work with C-suite colleagues to garner support for your dark analytics approach. • Many CEOs are making data a cornerstone of overall business strategy, which mandates more sophisticated techniques and accountability for more deliberate handling of the underlying assets. • By understanding your organization’s agenda and goals, you can determine the value that must be delivered, define the questions that should be asked, and decide how to harness available data to generate answers. • Data analytics then becomes an insight-driven advantage in the marketplace. The best way to help ensure buy-in is to first pilot a project that will demonstrate the tangible ROI that can be realized by the organization with a businesswide analytics strategy.
  • 52. Think broadly: • As you develop new capabilities and strategies, think about how you can extend them across the organization as well as to customers, vendors, and business partners. Your new data strategy becomes part of your reference architecture that others can use.
  • 53. Thank you, and may the light shine on you
  • 54. References 1. http://www.kdnuggets.com/2015/11/importance-dark-data-big-data-world.html 2. http://www.kdnuggets.com/2015/01/shining-light-on-dark-data.html 3. http://www.kdnuggets.com/2016/03/rise-dark-data-how-harnessed.html 4. http://www.kdnuggets.com/solutions/fraud-detection.html 5. https://en.wikipedia.org/wiki/Operational_database 6. http://blog.syncsort.com/2017/05/big-data/4-dark-data-examples-use-cases/ 7. Tracie Kambies, Paul Roma, Nitin Mittal, Sandeep Kumar Sharma, https://dupress.deloitte.com/dup-us-en/focus/tech-trends/2017/dark-data-analyzing-unstructured-data.html

Editor's Notes

  1. Lopsided priorities: Take the example of a bank analysing online applications for credit cards. The credit card marketing team is focused solely on customer details and eligibility but no attention is paid to the data on how the customer arrived at the application page. The unattended data could have provided valuable insights on the usability of the bank website and the application page. But there is no priority assigned to this aspect. Disconnect among departments: In large organizations, departments have their own data collection and storage processes which may not be known to other departments. So, data, even if relevant to other departments, lie unused. This is a process issue obviously. Technology and tool constraints: If data collection is done by separate technologies and tools in the same organization, there may be cases that these technologies and tools do not interact with each other because of technological constraints. This prevents bringing all the data together and creating a cohesive picture. This happens especially for companies that have different IT systems and formats. For example, it may be difficult to integrate audio file contents from call center with click data from websites. Companies that are at the early stages of a data analytics program face these problems.
  2. Perhaps this particular soft drinks manufacturer is too small to be thinking of expanding into an international market, but these same dark data could still be useful to a large multinational that is thinking of entering the soft drinks market with a very similar product and wondering which regions of the world to enter first. They would probably pay good money to receive reliable information indicating that India was a better market for them than the European Union. The soft drinks manufacturer could also sell its dark data to third parties interested in targeting ads at soft drinks consumers. IP location and the whole range of other factors the manufacturer holds in its dark data would be valuable to them, and they would be willing to pay for these data, always assuming that they were well scrubbed and easily retrievable and analysable.
  3. Having your dark data cleansed to a high data quality you can trust in and well-structured will allow your automatic processes and employees to easily read and analyse them before extracting business intelligence of real value, whether directly for the company involved or to be sold on to 3rd parties. This will allow you to stand out among your competitors, and make you one of the winners in 2017. Sometimes this means updating your website in order to make these data more accessible.
  4. Customer Information, Log Files, Account Information, Previous Employee Data, Financial Statements, Raw Survey Data, Email Correspondences, Notes or Presentations, Old Versions of Relevant Documents
  5. According to the New York Times, 90% of energy used by data centres is wasted. If data was not stored, energy costs could be saved. Furthermore, there are costs associated with the underutilisation of information and thus missed opportunities. According to Datamation, "the storage environments of EMEA organizations consist of 54 percent dark data, 32 percent Redundant, Obsolete and Trivial data and 14 percent business-critical data. By 2020, this can add up to $891 billion in storage and management costs that can otherwise be avoided." The continuous storage of dark data can put an organisation at risk, especially if this data is sensitive. In the case of a breach, this can result in serious repercussions. These can be financial, legal and can seriously hurt an organisation's reputation. For example, a breach of private records of customers could result in the stealing of sensitive information, which could result in identity theft. Another example could be the breach of the company's own sensitive information, for example relating to research and development. These risks can be mitigated by assessing and auditing whether this data is useful to the organisation, employing strong encryption and security and finally, if it is determined to be discarded, then it should be discarded in a way that it becomes unretrievable.
  6. Ongoing inventory and assessment. Dark data holdings should be recognized and subject to periodic reconnaissance. They should also drive ongoing research into new tools and technologies to help extract value from such data. Yesterday's dark data may become a shining source of insight, thanks to new tools or analytic techniques. Somebody needs to keep an eye on such things and be ready to put them to work when the benefits of their use outweigh their costs. In addition, performing a regular inventory requires understanding where dark data resides, how it's stored, how it's protected and what kinds of access controls help maintain its security. Ubiquitous encryption. Any digital asset with potential value and possible risk must be stored in encrypted form, whether on the organization's premises and equipment or elsewhere in the cloud. No dark data should be readily accessible to casual inspection, under any circumstances. Strong encryption should make it extremely difficult for those who do manage to obtain dark data to unlock its contents, and equally strong access controls and monitoring should make it obvious who can access such information, and who has accessed it, for any purpose whatsoever.
  7. For-profit businesses tend to see the value of dark data right away, whereas it’s less obvious or immediate for the nonprofit sector. But imagine if these two sectors were modeled on the customer service or financial service industries in the way they use data to make business decisions.
  8. Find relations between light and dark. Correlating traditionally dark data sources with light data sources is a valuable technique and is worth allocating the necessary resources to.
  9. There’s no such thing. As long as you have the space and capacity to store data, you should, regardless of age. You never know when those unstructured call center notes from 1993 will come in handy. Creating a data library is invaluable, and it will only grow as your business grows. Having historical data will allow you to analyze relevant shifts and trends in demand, as well as predict and plan for the future based on what you’ve observed in the past.
  10. Answering these questions makes it possible for dark analytics initiatives to illuminate specific insights that are relevant and valuable. Remember, most of the data universe is dark, and with its sheer size and variety, it should probably stay that way.
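The regular inventory described in note 6 (where dark data resides, how it is stored, and what access controls protect it) could be started with a small script. The following is only a minimal sketch under stated assumptions: it treats any file not modified for a chosen number of days as a candidate for dark data, and the names `InventoryEntry`, `inventory_dark_data` and `stale_after_days` are hypothetical, not taken from the slides or from any particular tool.

```python
import os
import stat
import time
from dataclasses import dataclass
from pathlib import Path


@dataclass
class InventoryEntry:
    path: str                   # where the file resides
    size_bytes: int             # how much storage it occupies
    days_since_modified: float  # how long it has been left untouched
    world_readable: bool        # crude access-control check (POSIX "other" read bit)


def inventory_dark_data(root, stale_after_days=365, now=None):
    """List files under `root` not modified for at least `stale_after_days` days.

    Assumption: staleness by modification time is used here as a stand-in
    for "dark"; a real audit would apply the organisation's own criteria.
    """
    now = time.time() if now is None else now
    entries = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full_path = Path(dirpath) / name
            try:
                info = full_path.stat()
            except OSError:
                continue  # skip files that vanish or cannot be inspected
            age_days = (now - info.st_mtime) / 86400
            if age_days >= stale_after_days:
                entries.append(InventoryEntry(
                    path=str(full_path),
                    size_bytes=info.st_size,
                    days_since_modified=age_days,
                    world_readable=bool(info.st_mode & stat.S_IROTH),
                ))
    # Largest files first, since they carry the highest storage cost.
    return sorted(entries, key=lambda entry: entry.size_bytes, reverse=True)
```

Calling `inventory_dark_data("/some/archive")` on a directory of your choosing returns the stale files sorted by size. Entries flagged `world_readable` are the first candidates for tighter access controls or encryption.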