Top Data Mining Techniques
And how to source data via web crawling services for maximum business value.
What is Data Mining?
The process of analyzing large-scale data to identify hidden patterns and trends for understanding, guiding and forecasting future behaviour.
Why is Data Mining Important?
With the advent of Big Data, data has become a valuable financial asset for any company.
Data mining can help companies understand consumer behaviour, predict trends and perform market analysis, leading to better business decisions, improved revenue and reduced cost.
The Data Mining Process
01 Data extraction
02 Transformation
03 Loading onto a data warehouse
04 Providing access
06 Analysis
07 Presentation
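The steps listed above can be sketched end to end on a toy scale. This is only an illustration under stated assumptions: the raw rows are invented, and an in-memory SQLite database stands in for a real data warehouse.

```python
import sqlite3

# Extraction: hypothetical raw records, e.g. lines pulled from a crawl or an export.
raw_rows = ["  Banana ,0.25", "milk,1.10", "BANANA,0.30"]

def transform(line):
    """Transformation: normalise the product name and parse the price."""
    name, price = line.split(",")
    return name.strip().lower(), float(price)

# Loading: an in-memory SQLite table stands in for the data warehouse.
warehouse = sqlite3.connect(":memory:")
warehouse.execute("CREATE TABLE sales (product TEXT, price REAL)")
warehouse.executemany(
    "INSERT INTO sales VALUES (?, ?)", [transform(row) for row in raw_rows]
)

# Access and analysis: query the loaded data for a per-product summary.
summary = warehouse.execute(
    "SELECT product, COUNT(*), AVG(price) FROM sales "
    "GROUP BY product ORDER BY product"
).fetchall()

# Presentation: print a small report.
for product, count, avg_price in summary:
    print(f"{product}: {count} sale(s), average price {avg_price:.3f}")
```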
Data extraction is the foundation of data mining.
While your organisation might have a large volume of internal data, there is another source of data that you must not miss: web data.
In essence, web data can augment your existing data to provide a holistic view of your business, which should be the basis of any data mining project.
How to Source Web Data
Leverage cloud-based, managed Data as a Service (DaaS) providers such as PromptCloud, which have already set up the Big Data infrastructure and technology stack required for custom web data extraction.
This is important because a DaaS provider can eliminate the need for engineering talent at your end and take away the pain of maintaining data feeds (considering the frequent structural changes in web pages).
All you need to do is focus on consuming the data for business growth.
Get started by providing the following information:
Websites you’re looking to crawl
Relevant data fields
Desired frequency of the crawl (daily/weekly/monthly)
Key Data Mining Techniques
Association
This technique is used for establishing correlations between two or more items to discover patterns.
Association
It can be useful for bundling products, in-store product placement and analysing imperfections. For example, you might find that when people buy bananas, they tend to buy milk as well, so you can suggest milk the next time they purchase bananas.
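The banana-and-milk example can be made concrete with the three usual association measures: support, confidence and lift. The sketch below is a minimal illustration; the baskets are invented, and the function name rule_stats is hypothetical.

```python
# Toy shopping baskets, invented for illustration only.
baskets = [
    {"banana", "milk", "bread"},
    {"banana", "milk"},
    {"banana", "eggs"},
    {"bread", "eggs"},
    {"banana", "milk", "eggs"},
]

def rule_stats(baskets, antecedent, consequent):
    """Return (support, confidence, lift) for the rule antecedent -> consequent.

    Assumes both items occur in at least one basket.
    """
    n = len(baskets)
    both = sum(1 for b in baskets if antecedent in b and consequent in b)
    with_antecedent = sum(1 for b in baskets if antecedent in b)
    with_consequent = sum(1 for b in baskets if consequent in b)
    support = both / n                        # share of baskets with both items
    confidence = both / with_antecedent       # P(consequent | antecedent)
    lift = confidence / (with_consequent / n) # > 1 suggests a positive association
    return support, confidence, lift

support, confidence, lift = rule_stats(baskets, "banana", "milk")
print(f"support={support:.2f} confidence={confidence:.2f} lift={lift:.2f}")
```

A lift above 1 means milk appears with bananas more often than its overall frequency would suggest, which is the kind of signal that supports a bundling or recommendation decision.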
Classification
This technique is used for identifying specific classes of customers or products by using the associated attributes.
Classification
It can be useful for identifying customers who are likely or unlikely to purchase, customers who are most valuable, customers who respond to a specific type of advertisement, etc.
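One simple way to assign a customer to a class from its attributes is a k-nearest-neighbours vote. The sketch below uses that technique purely as an example of classification; the attributes, labels and training data are hypothetical.

```python
from collections import Counter
from math import dist

# Hypothetical training data:
# (site visits per month, average basket value) -> purchase likelihood label.
training = [
    ((1, 10.0), "unlikely"),
    ((2, 15.0), "unlikely"),
    ((3, 12.0), "unlikely"),
    ((8, 60.0), "likely"),
    ((9, 55.0), "likely"),
    ((10, 70.0), "likely"),
]

def classify(point, training, k=3):
    """Label a point by majority vote among its k nearest training examples.

    Uses plain Euclidean distance on unscaled attributes, which is a
    simplification; real data usually needs feature scaling first.
    """
    nearest = sorted(training, key=lambda item: dist(point, item[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

print(classify((9, 65.0), training))
print(classify((2, 11.0), training))
```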
Clustering
This technique is used for exploring data and grouping records by one or more attributes so that members of a cluster share innate similarities.
Clustering
The applications of clustering lie in identifying new customer segments, grouping similar sites in search engines, recognising similarity in genetic data from population structure, and more.
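Customer segmentation can be illustrated with k-means, one common clustering algorithm. This is a minimal sketch, assuming invented customer data and fixed starting centroids so that the result is reproducible.

```python
from math import dist

# Hypothetical customers: (orders per year, average order value).
points = [(2, 20.0), (3, 25.0), (2, 22.0), (20, 200.0), (22, 210.0), (21, 190.0)]

def k_means(points, centroids, iterations=10):
    """Plain k-means: assign each point to its nearest centroid, then move
    each centroid to the mean of its assigned points, and repeat."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: dist(p, centroids[i]))
            clusters[nearest].append(p)
        # An empty cluster keeps its previous centroid.
        centroids = [
            tuple(sum(axis) / len(cluster) for axis in zip(*cluster)) if cluster else c
            for cluster, c in zip(clusters, centroids)
        ]
    # clusters holds the assignment made in the last pass.
    return centroids, clusters

centroids, clusters = k_means(points, centroids=[points[0], points[3]])
for centre, members in zip(centroids, clusters):
    print(centre, members)
```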
Outlier Detection
This technique is used for identifying unusual or suspicious cases that deviate from the projected pattern or expected norm.
Source: http://bit.ly/2sGcIum
Outlier Detection
The applications of outlier or anomaly detection lie in the identification of credit fraud, taxation fraud, etc.
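A simple way to flag cases that deviate from the expected norm is the modified z-score, which measures distance from the median in units of the median absolute deviation. The sketch below assumes invented transaction amounts and the commonly used rule-of-thumb threshold of 3.5; the function name find_outliers is hypothetical.

```python
from statistics import median

# Hypothetical card transaction amounts; one is clearly unusual.
amounts = [42.0, 38.5, 45.0, 40.0, 39.5, 41.0, 980.0, 43.5]

def find_outliers(values, threshold=3.5):
    """Return values whose modified z-score exceeds the threshold.

    The median and the median absolute deviation (MAD) are used instead of
    the mean and standard deviation because a single extreme value barely
    moves them.
    """
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        return []  # no spread to measure against
    return [v for v in values if 0.6745 * abs(v - med) / mad > threshold]

print(find_outliers(amounts))
```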
Regression Analysis
This technique is used for establishing the dependency between two variables so that the relationship can be used to predict the outcome of one variable from the other.
Regression Analysis
Examples of applications of regression analysis are the prediction of customer lifetime value resulting from loyalty, the effect of the real estate market on GDP, etc.
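The dependency between two variables can be estimated with ordinary least squares, the simplest form of regression. The sketch below fits a straight line to invented data (months in a loyalty programme against total spend) and uses it for a prediction; the function name fit_line is hypothetical.

```python
# Hypothetical data: months in a loyalty programme vs. total spend.
months = [1, 2, 3, 4, 5]
spend = [12.0, 22.0, 31.0, 42.0, 53.0]

def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

slope, intercept = fit_line(months, spend)
predicted = slope * 6 + intercept  # predicted spend after six months
print(f"slope={slope:.2f} intercept={intercept:.2f} predicted={predicted:.2f}")
```

A fitted line describes how the two variables move together in the sample; on its own it does not show that one causes the other.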
Attribute Importance
This technique is used for identifying the association strength of certain attributes with target attributes.
Attribute Importance
Examples of applications of attribute importance include finding the factors most highly associated with customers who respond to a certain promotion, and the factors most associated with high-performing employees.
Source: www.hrthatworks.com
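Association strength with a target can be measured in several ways. The sketch below uses the absolute Pearson correlation as one simple option and ranks attributes by it; the attribute names and values are hypothetical.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical customer attributes and whether each customer
# responded to a promotion (1) or not (0).
attributes = {
    "email_opens": [9, 8, 7, 2, 1, 3],
    "age": [25, 40, 31, 38, 27, 35],
    "prior_purchases": [5, 3, 4, 1, 0, 2],
}
responded = [1, 1, 1, 0, 0, 0]

# Rank attributes from most to least strongly associated with the target.
ranking = sorted(
    attributes,
    key=lambda name: abs(pearson(attributes[name], responded)),
    reverse=True,
)
for name in ranking:
    print(f"{name}: {pearson(attributes[name], responded):+.2f}")
```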
Feature Extraction
This technique is used for creating new attributes by performing linear combinations of existing attributes.
Feature Extraction
Examples of applications of feature extraction are latent semantic analysis, data compression, pattern recognition and more.
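The idea of building a new attribute as a linear combination of existing ones can be sketched with principal component analysis, used here only as one well-known example. The code finds the direction of greatest variance by power iteration and projects each record onto it; the data and the function name first_component are hypothetical.

```python
from math import sqrt

# Hypothetical records with two strongly correlated attributes.
rows = [(2.0, 1.9), (4.0, 4.2), (6.0, 5.8), (8.0, 8.1)]

def first_component(rows, iterations=100):
    """Return the column means and the unit-length direction of greatest
    variance, found by power iteration on the covariance matrix."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    cov = [
        [sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
        for i in range(d)
    ]
    vec = [1.0] * d
    for _ in range(iterations):
        vec = [sum(cov[i][j] * vec[j] for j in range(d)) for i in range(d)]
        norm = sqrt(sum(v * v for v in vec))
        vec = [v / norm for v in vec]
    return means, vec

means, weights = first_component(rows)

# The new attribute is a weighted sum (linear combination) of the
# mean-centred original attributes.
new_attribute = [
    sum(w * (x - m) for w, x, m in zip(weights, row, means)) for row in rows
]
print(weights)
print(new_attribute)
```

Two correlated columns are replaced by a single derived column here, which is the sense in which this kind of technique supports data compression.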
Parting Thoughts
More Data → Better Data Mining Models → Better Business Value
Get Started with Web Data Extraction!
Reach out to us at sales@promptcloud.com.
