Assignment 2: Cluster Analysis and Predictive Modelling
BUS5PA - 19139507
SIDDHANTH CHAURASIYA 19139507
PART A - SEGMENTATION BASED EXPLORATION OF CUSTOMERS
---------------------------------------------------------------------------------------------------------------------------
This section of the report contains the exploration and findings from the segmentation and
clustering analysis conducted on the CHURN_TELECOM dataset, using SAS Miner. Three types of
segmentation were carried out on the basis of distinct determinants: Demographics, Customer
Status and Customer Usage.
Demographics based Profiling:
After creating the project, library and diagram, we add the data source and set the roles of all
the variables to Input, except for Churn Flag (Target), the customer and subscription identifiers
(ID) and Subscriber name (Text).
We drag the data source into the diagram and connect it to the Cluster and Segment Profile nodes.
Age, Gender and Customer Value are selected as the variables for the Cluster node as well as the
Segment Profile node for this profiling activity. Customer Value contains 68% missing values, so
imputing those missing values with a synthetic value (mean, median, max, etc.) would create a very
skewed distribution, which isn’t desirable. Hence, Customer Value isn’t imputed.
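The effect of imputing such a large share of missing values can be illustrated with a small sketch. The data below is made up (100 hypothetical rows, 68 of them missing) and mean imputation is picked only as one of the options the report lists; it is not taken from the CHURN_TELECOM dataset.

```python
import random
import statistics

random.seed(0)

# Hypothetical Customer Value column: 100 rows, 68 of them missing (None).
observed = [random.uniform(10, 100) for _ in range(32)]
column = observed + [None] * 68

# Mean imputation: every missing entry receives the same synthetic value.
mean_value = statistics.mean(observed)
imputed = [mean_value if v is None else v for v in column]

# How much of the column now sits on one value, and how the spread changes.
share_at_mean = sum(1 for v in imputed if v == mean_value) / len(imputed)
spread_before = statistics.pstdev(observed)
spread_after = statistics.pstdev(imputed)

print(f"share of rows sitting exactly on the mean: {share_at_mean:.0%}")
print(f"standard deviation before: {spread_before:.1f}, after: {spread_after:.1f}")
```

In this toy column roughly two thirds of the rows collapse onto a single value and the spread shrinks, which is the kind of distortion that motivates leaving Customer Value unimputed.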
Figure 1: Process flow for Demographics based Profiling.
Since the measurement scales of the variables selected as inputs for Demographic profiling are
different, we set ‘Internal Standardization’ to ‘Standardization’ in the properties panel of the
Cluster node. All other properties of the Cluster and Segment Profile nodes are kept at their
defaults.
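Why standardization matters when the inputs are on different scales can be sketched as follows. This is a minimal z-score illustration with made-up ages and customer values; the exact formula SAS Miner applies for ‘Standardization’ is not stated in this report, so the population standard deviation used here is an assumption.

```python
import statistics

def standardize(values):
    """Return z-scores: subtract the mean, divide by the standard deviation."""
    mean = statistics.mean(values)
    std = statistics.pstdev(values)
    return [(v - mean) / std for v in values]

# Hypothetical rows: age in years, customer value in currency units.
ages = [22, 25, 31, 47, 63, 70]
customer_values = [1200.0, 950.0, 400.0, 380.0, 150.0, 2100.0]

z_ages = standardize(ages)
z_values = standardize(customer_values)

# After standardization both variables have mean 0 and unit spread,
# so neither dominates a distance calculation because of its units.
print([round(z, 2) for z in z_ages])
print([round(z, 2) for z in z_values])
```

Without this step, a distance-based clustering would be driven almost entirely by the variable with the largest numeric range.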
Figure 2: Cluster and Segment Profile results for Demographical segmentation.
We found a good combination of clusters, with a fair number of observations in each segment
(Figure 2), after setting the number of clusters to 4. The four segments can be broadly classified as:
Cluster 1 – Valuable Young Adults.
This segment can be described as a group of males who are just about to start their professional
careers and generate high customer value for the organization. Since this cluster shows a tendency
towards high customer value, the company should ensure retention of this segment.
Cluster 2 – Distressed Damsel.
This cluster can best be described as a segment of young females who account for a relatively
lower Customer Value. The lower customer value may be an indicator that these customers aren’t
satisfied with the services offered and may churn in the future. The company should devise plans,
offers and discounts to reduce the chances of churn in this cluster.
Figure 3: Results from Segment Profile node.
Cluster 3 – Stingy Seniors.
This group is characterized by senior males who generate low value for the Telecom company. As
such, customers belonging to this segment may need special attention, since their low customer
value generation indicates a high likelihood of churning.
Cluster 4 – Bankable Ladies.
This cluster consists of older women who produce high value for the company. The company
should look to maximize the value derived from this segment.
Figure 4: Variable significance for each cluster.
As observed from Figure 4, Gender was the most influential variable for the classification of
Distressed Damsel, Stingy Seniors and Bankable Ladies, while Age had the most significance for
Valuable Young Adults.
Note: The variable Customer Value was only collected for customers who were identified as having
a high probability of churning. Customer Value wasn’t collected for customers who had a low
probability of churning. As such, this leads to a distorted cluster analysis. However, since we
don’t have sufficient demographic variables, we still use Customer Value for the clustering.
Customer Status based Profiling:
To conduct Customer Status based segmentation, we opt for variables which highlight the status of
the customer with reference to the services offered by the company. Email queries sent, revenue
through GPRS, internet and fix-line, and days since last complaint are the variables selected.
Through StatExplore we found that the distributions of the latter four selected variables were
highly skewed, and thus we normalize them using the Transform Variables node.
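How a transformation reduces skewness can be sketched with a few lines of code. The revenue figures are hypothetical, and the log(1 + x) transform is an assumption for illustration: the report names a log transform explicitly only for the usage variables later on.

```python
import math
import statistics

def skewness(values):
    """Population skewness: the third standardized moment."""
    mean = statistics.mean(values)
    std = statistics.pstdev(values)
    return sum(((v - mean) / std) ** 3 for v in values) / len(values)

# Hypothetical right-skewed revenue figures: many small values, a few large.
revenue = [0, 1, 1, 2, 2, 3, 4, 5, 8, 12, 40, 150]

# log(1 + x) keeps zero revenue at zero and compresses the long right tail.
log_revenue = [math.log1p(v) for v in revenue]

print(f"skewness before: {skewness(revenue):.2f}")
print(f"skewness after:  {skewness(log_revenue):.2f}")
```

The transformed values remain right-skewed in this toy example, but far less so, which keeps a handful of extreme customers from dominating the cluster distances.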
Figure 5: Process flow for Customer Status based Profiling.
Setting up 4 clusters led to fairly equal segments. The four clusters can be interpreted as:
Cluster 1 – Super Active
This cluster is characterized by customers who tend to converse back and forth with the company
through emails quite often but haven’t really had a complaint regarding the services recently.
Additionally, these customers generate relatively higher revenue through internet, GPRS as well as
fix-line services. As such, the customers from this segment are very important from a profitability
point of view.
Cluster 2 – Curious
Customers from this cluster can be described as being quite curious about the new plans, as evident
from the high number of email queries they sent in the past 6 months. They have also lodged a
complaint very recently and produce high revenue through the internet medium for the company.
Thus, they have been aptly named ‘Curious’. This cluster will need special attention from the
organization, as it shows signs of churning.
Figure 6: Results from Segment Profile node.
Cluster 3 – Content
Customers belonging to this segment are rather satisfied with the services and have a laid-back
attitude. These customers don’t generally send in email queries and haven’t made a complaint to
the company recently. The revenue generated by these particular customers mirrors the
overall distribution of the customers across the whole dataset.
Cluster 4 – Transitionals
‘Transitionals’ represents a cluster of customers who are transitioning to the modern services
offered by the company. They have made a complaint fairly recently but don’t generally send many
emails to the organization. The revenue they generate through internet is on the lower side, but
they produce high revenue through fix-lines and GPRS.
Days since last complaint was a very significant variable for the clusters ‘Super Active’ and
‘Transitionals’, while email queries were a strong determinant for the clusters ‘Curious’ and
‘Content’ (Appendix - Figure 11).
Usage based Profiling:
To conduct usage based profiling, we select variables which highlight usage patterns: outgoing
national, international, roaming and local calls, change in bill, and revenue through internet and
fix-line. Since these variables were highly skewed, we used the Transform Variables node to
normalize their distributions.
Figure 7: Process flow for Usage based Profiling.
Since we converted all the variables to a log scale, we set ‘Standardization’ to none. We set the
number of clusters to 4. The results were interpreted as:
Cluster 1 – Cosmopolitan
This cluster is characterized by customers who have a high usage of outgoing international calls.
The rest of their usage (national calls, local calls, roaming calls and internet) follows the same
pattern as customers across the dataset. As such, the company should offer customers from this
cluster plans which are more attractive for international calling if it wants to retain them in
the long run.
Cluster 2 – Connected
Customers from this cluster tend to have a high usage of outgoing calls at the national level.
Their usage of other services is much the same as the overall usage pattern of the customers.
Churning customers from this segment can be lured back by offering them value-for-money plans for
national calling.
Figure 8: Results from Segment Profile node.
Cluster 3 – Traditionals
This cluster is described as ‘Traditionals’ since their usage pattern stays the same throughout, as
evident from their low percentage change in bills. Their utilization of national and international
calls stays on the lower side, though they use a high amount of internet.
Cluster 4 – Modern
This segment is described as ‘Modern’ as it is characterized by the usage of contemporary
customers: fluctuating bills, low usage of calls (national, local, international and roaming) and
high internet usage.
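The four-cluster segmentation described in this section can be sketched outside SAS Miner as well. The report does not state which algorithm the Cluster node ran, so the code below assumes plain k-means (Lloyd's algorithm) on made-up, already log-transformed usage rows; the function name `kmeans` and the three columns are illustrative only.

```python
import random

def kmeans(points, k, iterations=20, seed=0):
    """Plain k-means (Lloyd's algorithm) on a list of equal-length tuples."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    assignment = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: nearest centroid by squared Euclidean distance.
        assignment = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return assignment, centroids

# Hypothetical log-transformed usage rows: (international, national, internet).
rows = [
    (4.1, 1.0, 1.2), (3.9, 1.1, 1.0),
    (1.0, 4.2, 1.1), (1.2, 4.0, 0.9),
    (0.9, 1.0, 4.3), (1.1, 0.8, 4.1),
    (1.0, 1.1, 1.0), (0.9, 0.9, 1.2),
]
labels, centres = kmeans(rows, k=4)
print(labels)
```

Each row receives a cluster label from 0 to 3; naming the clusters (‘Cosmopolitan’, ‘Connected’ and so on) remains an interpretive step based on which variables are high or low in each centroid.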
Figure 9: Variables' influence on each cluster.
Cross-cluster analysis:
After creating the respective clusters based on Demographics, Customer Status and Usage, we conduct
a cross-cluster analysis to explore whether there is any association between these segments which
could potentially be harnessed into something profitable for the company.
We add the Save Data node to the Cluster nodes and export the segment data from all three
categories. Using the VLOOKUP function in Excel, we arrived at the following observations:
Usage \ Demographic   Valuable Young Adults   Distressed Damsel   Stingy Seniors   Bankable Ladies
Cosmopolitan          28.79%                  23.51%              21.85%           22.92%
Connected             22.06%                  29.47%              23.69%           25.45%
Traditionals          25.63%                  22.29%              27.66%           35.29%
Modern                24.59%                  25.78%              25.23%           21.02%
Cross-cluster analysis: Demographic vs Usage
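The lookup-and-tabulate step behind such a table can be sketched in code. The customer ids and labels below are hypothetical, and the normalization (share of each usage segment within a demographic segment) is an assumption, since the report does not state how its percentages were computed.

```python
from collections import Counter

# Hypothetical exports from two Cluster nodes: customer id -> segment label.
demographic = {101: "Valuable Young Adults", 102: "Stingy Seniors",
               103: "Valuable Young Adults", 104: "Bankable Ladies"}
usage = {101: "Cosmopolitan", 102: "Traditionals",
         103: "Connected", 104: "Traditionals"}

# Lookup step: match the two exports on the shared customer id.
pairs = [(demographic[cid], usage[cid]) for cid in demographic if cid in usage]

# Count each (demographic, usage) combination, then the column totals.
cell_counts = Counter(pairs)
column_totals = Counter(demo for demo, _ in pairs)

# Share of each usage segment within a demographic segment.
shares = {
    (demo, use): count / column_totals[demo]
    for (demo, use), count in cell_counts.items()
}
for (demo, use), share in sorted(shares.items()):
    print(f"{demo:22s} {use:14s} {share:.0%}")
```

Matching on the customer id plays the role of VLOOKUP here; any customer missing from one of the exports is simply skipped.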
It was seen that ‘Valuable Young Adults’ and ‘Cosmopolitan’ shared a good association, indicating
that young men tend to use international calling frequently. Similarly, it was observed that older
women (‘Bankable Ladies’) had a very ‘Traditional’ usage, i.e. their bills rarely fluctuated and
they used the calling features about as much as the overall average. Lastly, ‘Distressed Damsel’
was closely related to ‘Connected’, which means they are quite active in terms of outgoing national
calls. These insights can be used very effectively to gain competitive advantage and improve the
offerings to the respective customers. A lot of business value can be derived by interpreting them
correctly and acting on them.
Cross-clusters highlighted in red represent the groups which have a high chance of churning
(derived using the Churn Flag variable and VLOOKUP). As such, it is imperative that the company
soon offers such customers good discounts and plans, depending upon their usage, to retain them.
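Flagging high-churn cross-clusters from the Churn Flag can be sketched as a churn rate per cell. The rows and the 50% cut-off below are assumptions made for illustration; they do not reproduce the cells highlighted in the original table.

```python
from collections import defaultdict

# Hypothetical rows: (demographic segment, usage segment, churn flag 0/1).
rows = [
    ("Stingy Seniors", "Modern", 1), ("Stingy Seniors", "Modern", 1),
    ("Stingy Seniors", "Modern", 0), ("Bankable Ladies", "Traditionals", 0),
    ("Bankable Ladies", "Traditionals", 0), ("Bankable Ladies", "Traditionals", 1),
]

totals = defaultdict(int)
churned = defaultdict(int)
for demo, use, flag in rows:
    totals[(demo, use)] += 1
    churned[(demo, use)] += flag

churn_rate = {cell: churned[cell] / totals[cell] for cell in totals}

# Assumed cut-off: a cell counts as high risk when more than half churned.
THRESHOLD = 0.5
high_risk = sorted(cell for cell, rate in churn_rate.items() if rate > THRESHOLD)
print(high_risk)
```

In practice the cut-off would be chosen relative to the overall churn rate of the dataset rather than fixed at one half.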
Cross-cluster analysis between Customer Status and Usage helped us to discover some hidden
insights.
Usage \ Customer Status   Super Active   Curious   Content   Transitionals
Cosmopolitan              22.37%         23.67%    26.39%    24.23%
Connected                 33.42%         27.43%    22.13%    25.16%
Traditionals              23.92%         23.54%    30.03%    21.37%
Modern                    25.41%         29.83%    24.22%    27.55%
Cross-cluster analysis: Customer Status vs Usage
‘Super Active’ customers tend to be involved in a lot of international interaction
(‘Cosmopolitan’), while ‘Curious’ customers had a very modern-like usage. Similarly, customers who
have been categorized as ‘Content’ had a lot in common with ‘Traditional’ usage. As such, the
company should keep these insights in mind and prepare plans to maximize profit from such groups of
customers.
On the other hand, customers belonging to the ‘Super Active’ cluster with ‘Traditional’ usage have
a high probability of churning. Additionally, ‘Transitionals’ with high international and national
calling usage may leave the company sooner rather than later. Thus, the company needs to dish out
offers and discounts accordingly, based on the usage patterns mentioned above, to retain those
customers.
PART B – EXTENDING KNOWLEDGE OF PREDICTIVE ANALYTICS
---------------------------------------------------------------------------------------------------------------------------
Seven reasons for Predictive Analytics
Since the turn of the millennium, and especially in the last decade or so, there has been an
unprecedented generation of data, whether structured or unstructured. In fact, IBM has stated that
such large volumes of data are generated every day that the amount of data doubles every two years.
In this gigantic amount of data lie numerous hidden patterns and trends which, if harnessed in the
right manner, could result in enormous business value for the organization. Predictive analytics is
one such tool that can exploit this data to come up with meaningful and actionable insights.
Eric Siegel, an accomplished heavyweight in the field of Predictive Analytics, put forward his
thoughts on Predictive Analytics in a white paper, stating precisely why the world needs to embrace
Predictive Analytics. As per Eric Siegel, adoption, implementation and application of Predictive
Analytics can enable an organization to achieve the following seven objectives:
• Compete: Gain competitive advantage over rivals.
• Grow: Increase sales, expand the customer base and retain existing customers.
• Enforce: Detect fraud, anomalies and undesirable circumstances.
• Improve: Enhance and refine core product offerings, process automation and resource
  optimization.
• Satisfy: Provide tailored solutions and recommendations for customers.
• Learn: Learn from past data (structured as well as unstructured) to provide insights and
  foresight about the future.
• Act: Produce actionable recommendations and insights.
Case Study II – Predictive Analytics for Insurers
An insurance company’s operating success chiefly relies on its forecasting capabilities. The primary
distinguisher between the best and the rest of the insurance companies is the accuracy with which
the organization can target potential customers, set the pricing of premiums and detect fraudulent
claims. In the past, many of these tasks were carried out on the basis of guesstimates, a method
which wasn’t really efficient or cost-effective.
Soon, key determinants like age and history became the foundation on which insurance companies
forecasted their operations. Today, however, Predictive Analytics has changed the entire landscape
of how insurance companies conduct their operations.
With the help of Predictive Analytics, insurance companies have been able to improve not only their
core operations (e.g. credit scores, fraud detection) but also the marketing of the product (based
upon buying patterns, i.e. hit ratio, retention ratio) and underwriting (filtering out customers who
do not meet given criteria, thereby saving time and money).
Relating Case Study II to seven reasons for Predictive Analytics
The application of Predictive Analytics is very prevalent in the insurance landscape and is in fact
considered industry best practice. The business value that can be derived from the utilization of
Predictive Analytics in the field of Insurance is tremendous. After thoroughly analysing the given
Insurance case study, we can summarize how the usage of Predictive Analytics by Insurance firms
enabled them to achieve the outcomes described by Eric Siegel as:
Compete:
The insurance industry is very competitive, with companies always iterating to stay one step ahead
of their rivals. Predictive Analytics can enable an organization to gather knowledge about its
customers in a more holistic manner, which can create a competitive advantage for the firm.
Similarly, Predictive Analytics can create credit score rating models, adverse selection models and
so on, which will help the organization stay ahead of its rivals.
Grow:
The insurance industry has witnessed snail-paced growth over the past few years. This has led to
organizations exploring options to expand to new horizons and locations. With the help of
Predictive Analytics, insurance companies can predict which customers are likely to respond to
offers and marketing campaigns. Similarly, through Predictive Analytics, they can understand buying
patterns, which can be used to improve marketing hit ratios and customer retention ratios.
Enforce:
One of the most significant functions for any insurance company is the detection of fraudulent
claims. With the help of scoring and ranking models, Predictive Analytics can highlight which claims
are somewhat suspicious and need more investigation before settlement.
Improve:
Predictive Analytics can immensely aid the operating efficiency and product offering of an insurance
company. Through predictive models, insurers can identify at the initial stage itself which claims
are likely to be settled for a high value in the future. This will allow the company to run its
operations more efficiently and more economically. Additionally, Predictive Analytics can find out
which customers meet the stipulated obligations for the insurance and which customers do not. This
helps save the organization’s time, money and resources.
Satisfy:
To maximize customer value, insurers need to pitch the right type of insurance (life insurance,
vehicle insurance and so on) to the customer. By observing the buying patterns of customers,
predictive models can suggest the right fit of insurance individually for each customer. Similarly,
predictive models can assign a risk score to each customer depending upon various determinants
(age, location, history, etc.). These scores then enable the company to set appropriate premium
pricing for the customers.
Learn:
Predictive Analytics uses sophisticated models to find patterns and trends in the dataset. As
such, the usage of predictive models like linear regression, logit regression, decision trees and so
on can enable insurers to find whether any pattern exists between the variables. This information
can be used for various operational activities.
Act:
The insights and foresight generated by predictive models can add great business value if they
are implemented by the organization. Insurers have been proactively acting on the insights
produced by Predictive Analytics. Fraud detection, customer retention, churn analysis and adverse
selection are some of the models that have been created through predictive modelling and acted
upon by insurance companies.
Comment on seven reasons for Predictive Analytics and its relation with Churn Case Study
The seven reasons for Predictive Analytics stated by Eric Siegel add definite value to a Predictive
Analytics project. The reasons mentioned by ‘Dr. Data’ are comprehensive and describe the benefits
that could be derived from a predictive model at a very fine level.
From the above Case Study about Insurance, we could observe and relate a real-life application of
the seven reasons for Predictive Analytics and how it proved advantageous to the industry.
The seven reasons for Predictive Analytics can also be witnessed, in part, in the Churn Case Study.
The churn analysis enables the Telecom company to gain competitive advantage (‘Compete’) over its
rivals, as it can act upon the high-churn customers and retain them (‘Grow’) by giving them offers
and discounts (‘Act’), while competitors who don’t use Predictive Analytics won’t be able to
retain their high-churn customers.
The Decision Tree and Regression models were built using past data (‘Learn’). The Decision Tree
model was then used on the new dataset with the help of the Score node to detect which customers are
on the verge of churning (‘Enforce’).
Even though the model flags customers having a high probability of churn, the case study doesn’t
really follow ‘Improve’, as the model doesn’t enhance the core product offering but just indicates
which customers may be unhappy with the services. Similarly, the case study doesn’t follow
‘Satisfy’, as it cannot suggest tailored solutions to individual customers but can only suggest
which customers should be offered a discount to retain them.
SEMMA
SEMMA (Sample, Explore, Modify, Model and Assess) is a methodology formulated by SAS Institute
for conducting data mining tasks on its software, SAS Enterprise Miner. SEMMA is concerned with
the model development aspects of data mining in SAS Miner, and adherence to it ensures end-to-end
coverage of the core data mining processes, which directly leads to more informed and accurate
analysis.
However, due to the lack of concrete approaches to the data mining process flow (other than
CRISP-DM), SEMMA is followed by many analysts to conduct data mining activities. SEMMA stands for:
Sample: Every data mining activity should start with sampling of the dataset into training,
validation and test sets, ensuring there’s enough information to carry out all these tasks.
Explore: In this stage, we investigate and explore the variables to discover information and
patterns that may exist between the variables.
Modify: At this stage, we select appropriate methods to modify, transform and rectify variables that
will be used in the modelling.
Model: After exploration and modification of variables, we apply the modelling technique to the
selected variables.
Assess: In the last phase, we evaluate the accuracy and predictive capabilities of the models.
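The Sample step above can be sketched as a shuffled split into three sets. The 40/30/30 proportions, the seed and the function name `partition` are assumptions chosen for illustration; the report does not state the proportions used.

```python
import random

def partition(rows, train=0.4, validation=0.3, seed=42):
    """Shuffle rows and split them into train, validation and test lists."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_train = int(len(shuffled) * train)
    n_valid = int(len(shuffled) * validation)
    return (shuffled[:n_train],
            shuffled[n_train:n_train + n_valid],
            shuffled[n_train + n_valid:])

customer_ids = list(range(1, 101))  # hypothetical 100 customers
train_set, valid_set, test_set = partition(customer_ids)
print(len(train_set), len(valid_set), len(test_set))
```

Every row lands in exactly one of the three sets, so the validation and test sets stay unseen while the models are fitted.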
Relation to Churn Case Study
SAS proposed that SEMMA is the core process of conducting a data mining activity. As can be
observed from Figure 10, the churn case study religiously followed the SEMMA principles. The Churn
analysis commences with the Data Partition node (Sample), which enables us to create samples from
the dataset and allocate sufficient data for training, validation and test. This is then followed by
imputation of missing values and replacement of a variable to reduce its number of classes (Modify).
To reduce redundancy, we utilize the Variable Clustering node (Explore) and then run our
Decision Tree and Regression models (Model). Through Model Comparison (Assess), we compare the
two predictive models and find something peculiar. To investigate it further, we use the Multiplot
node (Explore) and detect abnormal variables which affected the predictive capabilities of the
model.
Figure 10: Process flow of Churn Case Study
Using the Metadata node (Modify), we remove these abnormal variables and re-connect the Decision
Tree and Regression models (Model) to it. Then, we again use the Model Comparison node (Assess) to
gauge which model outperforms the other. Finally, we use the Score node (Assess) to apply the best
model to the new dataset and complete the data mining process.
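The compare-then-score logic of these last two steps can be sketched as follows. The two stand-in models, the feature names and the use of validation misclassification rate as the selection criterion are all assumptions for illustration; they are not the fitted Decision Tree and Regression models from the case study.

```python
def misclassification_rate(model, rows):
    """Share of (features, label) rows where the model's prediction is wrong."""
    wrong = sum(1 for features, label in rows if model(features) != label)
    return wrong / len(rows)

# Two hypothetical stand-in models; each maps a feature dict to 0 or 1.
def tree_like(features):
    return 1 if features["complaints"] >= 2 else 0

def regression_like(features):
    return 1 if 0.4 * features["complaints"] - 0.1 * features["tenure"] > 0 else 0

validation = [
    ({"complaints": 3, "tenure": 1}, 1),
    ({"complaints": 0, "tenure": 5}, 0),
    ({"complaints": 2, "tenure": 9}, 1),
    ({"complaints": 1, "tenure": 2}, 0),
]

# Assess: compare both candidates on the same validation rows.
candidates = {"tree_like": tree_like, "regression_like": regression_like}
rates = {name: misclassification_rate(m, validation) for name, m in candidates.items()}
best_name = min(rates, key=rates.get)

# Score: apply the winning model to new, unlabeled customers.
new_customers = [{"complaints": 4, "tenure": 3}, {"complaints": 0, "tenure": 7}]
scores = [candidates[best_name](c) for c in new_customers]
print(best_name, rates, scores)
```

Selecting on validation rows rather than training rows avoids rewarding a model simply for memorizing the data it was fitted on.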
Thus, it can be concluded that all the steps of SEMMA were comprehensively covered by the
Churn Case Study.
The adherence to SEMMA in the Churn Case Study can be summarized as:
Steps     Nodes
Sample    Data Partition
Explore   Multiplot, Variable Clustering
Modify    Impute, Replacement, Metadata
Model     Decision Tree, Regression
Assess    Model Comparison, Score
Relating SEMMA with Churn Case Study.
Importance of SEMMA
Even though SAS insists SEMMA is merely a set of guidelines to be followed for SAS Miner, the
methodology’s application can be extended to data mining tasks as a whole. SEMMA is a very robust
approach that encompasses all the chief criteria required for undertaking or building a complex
predictive model. Adherence to SEMMA ensures ease of process flow, detection of faults and
creation of more accurate models.
The Churn Case Study benefitted hugely from following the SEMMA methodology. Through Multiplot and
Variable Clustering we could ‘explore’ erroneous and redundant variables, and through Impute,
Replacement and Metadata we could ‘modify’ such variables. Model Comparison enabled us to compare,
contrast and ‘assess’ the two predictive ‘models’, Decision Tree and Regression. With the help of
the Score node, we evaluated and applied the model to a new dataset.
Appendix
Figure 11: Most significant variables for each of the four clusters.

IRJET- Finding Optimal Skyline Product Combinations Under Price Promotion
 
ML_project_ppt.pdf
ML_project_ppt.pdfML_project_ppt.pdf
ML_project_ppt.pdf
 
Clustering
ClusteringClustering
Clustering
 
Bank Customer Segmentation & Insurance Claim Prediction
Bank Customer Segmentation & Insurance Claim PredictionBank Customer Segmentation & Insurance Claim Prediction
Bank Customer Segmentation & Insurance Claim Prediction
 
Data mining and analysis of customer churn dataset
Data mining and analysis of customer churn datasetData mining and analysis of customer churn dataset
Data mining and analysis of customer churn dataset
 
Churn customer analysis
Churn customer analysisChurn customer analysis
Churn customer analysis
 
Applying Call and Event Detail Records to Customer Segmentation and CLV
Applying Call and Event Detail Records to Customer Segmentation and CLVApplying Call and Event Detail Records to Customer Segmentation and CLV
Applying Call and Event Detail Records to Customer Segmentation and CLV
 
Airtel iCreate National Wildcard Winners 2019
Airtel iCreate National Wildcard Winners 2019Airtel iCreate National Wildcard Winners 2019
Airtel iCreate National Wildcard Winners 2019
 
Setanta Systems - Supply Chain Report and Analyses Module
Setanta Systems - Supply Chain Report and Analyses ModuleSetanta Systems - Supply Chain Report and Analyses Module
Setanta Systems - Supply Chain Report and Analyses Module
 
EVALUTION OF CHURN PREDICTING PROCESS USING CUSTOMER BEHAVIOUR PATTERN
EVALUTION OF CHURN PREDICTING PROCESS USING CUSTOMER BEHAVIOUR PATTERNEVALUTION OF CHURN PREDICTING PROCESS USING CUSTOMER BEHAVIOUR PATTERN
EVALUTION OF CHURN PREDICTING PROCESS USING CUSTOMER BEHAVIOUR PATTERN
 
IRJET- Credit Profile of E-Commerce Customer
IRJET- Credit Profile of E-Commerce CustomerIRJET- Credit Profile of E-Commerce Customer
IRJET- Credit Profile of E-Commerce Customer
 
DEMOGRAPHIC DIVISION OF A MART BY APPLYING CLUSTERING TECHNIQUES
DEMOGRAPHIC DIVISION OF A MART BY APPLYING CLUSTERING TECHNIQUESDEMOGRAPHIC DIVISION OF A MART BY APPLYING CLUSTERING TECHNIQUES
DEMOGRAPHIC DIVISION OF A MART BY APPLYING CLUSTERING TECHNIQUES
 

More from Siddhanth Chaurasiya

Predictive Modelling & Market-Basket Analysis.
Predictive Modelling & Market-Basket Analysis.Predictive Modelling & Market-Basket Analysis.
Predictive Modelling & Market-Basket Analysis.Siddhanth Chaurasiya
 
Building & Evaluating Predictive model: Supermarket Business Case
Building & Evaluating Predictive model: Supermarket Business CaseBuilding & Evaluating Predictive model: Supermarket Business Case
Building & Evaluating Predictive model: Supermarket Business CaseSiddhanth Chaurasiya
 
Visualization Techniques: Framework, Effective viz & Non-effective viz.
Visualization Techniques: Framework, Effective viz & Non-effective viz.Visualization Techniques: Framework, Effective viz & Non-effective viz.
Visualization Techniques: Framework, Effective viz & Non-effective viz.Siddhanth Chaurasiya
 
Innovation at International Foods Group
Innovation at International Foods GroupInnovation at International Foods Group
Innovation at International Foods GroupSiddhanth Chaurasiya
 
Sustainable reporting and its effects on financial performance.
Sustainable reporting and its effects on financial performance.Sustainable reporting and its effects on financial performance.
Sustainable reporting and its effects on financial performance.Siddhanth Chaurasiya
 

More from Siddhanth Chaurasiya (6)

Predictive Modelling & Market-Basket Analysis.
Predictive Modelling & Market-Basket Analysis.Predictive Modelling & Market-Basket Analysis.
Predictive Modelling & Market-Basket Analysis.
 
Building & Evaluating Predictive model: Supermarket Business Case
Building & Evaluating Predictive model: Supermarket Business CaseBuilding & Evaluating Predictive model: Supermarket Business Case
Building & Evaluating Predictive model: Supermarket Business Case
 
Visualization Techniques: Framework, Effective viz & Non-effective viz.
Visualization Techniques: Framework, Effective viz & Non-effective viz.Visualization Techniques: Framework, Effective viz & Non-effective viz.
Visualization Techniques: Framework, Effective viz & Non-effective viz.
 
Escape Trave: Analytical solution
Escape Trave: Analytical solutionEscape Trave: Analytical solution
Escape Trave: Analytical solution
 
Innovation at International Foods Group
Innovation at International Foods GroupInnovation at International Foods Group
Innovation at International Foods Group
 
Sustainable reporting and its effects on financial performance.
Sustainable reporting and its effects on financial performance.Sustainable reporting and its effects on financial performance.
Sustainable reporting and its effects on financial performance.
 

Recently uploaded

VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130Suhani Kapoor
 
Carero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptxCarero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptxolyaivanovalion
 
RA-11058_IRR-COMPRESS Do 198 series of 1998
RA-11058_IRR-COMPRESS Do 198 series of 1998RA-11058_IRR-COMPRESS Do 198 series of 1998
RA-11058_IRR-COMPRESS Do 198 series of 1998YohFuh
 
Introduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptxIntroduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptxfirstjob4
 
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptxBPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptxMohammedJunaid861692
 
100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptx100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptxAnupama Kate
 
BigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptxBigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptxolyaivanovalion
 
Invezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz1
 
CebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptxCebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptxolyaivanovalion
 
Midocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFxMidocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFxolyaivanovalion
 
Edukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFxEdukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFxolyaivanovalion
 
Ukraine War presentation: KNOW THE BASICS
Ukraine War presentation: KNOW THE BASICSUkraine War presentation: KNOW THE BASICS
Ukraine War presentation: KNOW THE BASICSAishani27
 
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Callshivangimorya083
 
Data-Analysis for Chicago Crime Data 2023
Data-Analysis for Chicago Crime Data  2023Data-Analysis for Chicago Crime Data  2023
Data-Analysis for Chicago Crime Data 2023ymrp368
 
定制英国白金汉大学毕业证(UCB毕业证书) 成绩单原版一比一
定制英国白金汉大学毕业证(UCB毕业证书)																			成绩单原版一比一定制英国白金汉大学毕业证(UCB毕业证书)																			成绩单原版一比一
定制英国白金汉大学毕业证(UCB毕业证书) 成绩单原版一比一ffjhghh
 
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...Suhani Kapoor
 
Unveiling Insights: The Role of a Data Analyst
Unveiling Insights: The Role of a Data AnalystUnveiling Insights: The Role of a Data Analyst
Unveiling Insights: The Role of a Data AnalystSamantha Rae Coolbeth
 
Ravak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptxRavak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptxolyaivanovalion
 
Log Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptxLog Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptxJohnnyPlasten
 

Recently uploaded (20)

VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
 
Carero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptxCarero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptx
 
RA-11058_IRR-COMPRESS Do 198 series of 1998
RA-11058_IRR-COMPRESS Do 198 series of 1998RA-11058_IRR-COMPRESS Do 198 series of 1998
RA-11058_IRR-COMPRESS Do 198 series of 1998
 
Introduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptxIntroduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptx
 
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptxBPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
 
100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptx100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptx
 
BigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptxBigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptx
 
Invezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signals
 
CebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptxCebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptx
 
Midocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFxMidocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFx
 
Edukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFxEdukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFx
 
VIP Call Girls Service Charbagh { Lucknow Call Girls Service 9548273370 } Boo...
VIP Call Girls Service Charbagh { Lucknow Call Girls Service 9548273370 } Boo...VIP Call Girls Service Charbagh { Lucknow Call Girls Service 9548273370 } Boo...
VIP Call Girls Service Charbagh { Lucknow Call Girls Service 9548273370 } Boo...
 
Ukraine War presentation: KNOW THE BASICS
Ukraine War presentation: KNOW THE BASICSUkraine War presentation: KNOW THE BASICS
Ukraine War presentation: KNOW THE BASICS
 
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
 
Data-Analysis for Chicago Crime Data 2023
Data-Analysis for Chicago Crime Data  2023Data-Analysis for Chicago Crime Data  2023
Data-Analysis for Chicago Crime Data 2023
 
定制英国白金汉大学毕业证(UCB毕业证书) 成绩单原版一比一
定制英国白金汉大学毕业证(UCB毕业证书)																			成绩单原版一比一定制英国白金汉大学毕业证(UCB毕业证书)																			成绩单原版一比一
定制英国白金汉大学毕业证(UCB毕业证书) 成绩单原版一比一
 
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...
 
Unveiling Insights: The Role of a Data Analyst
Unveiling Insights: The Role of a Data AnalystUnveiling Insights: The Role of a Data Analyst
Unveiling Insights: The Role of a Data Analyst
 
Ravak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptxRavak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptx
 
Log Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptxLog Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptx
 

Assignment 2: Cluster Analysis and Predictive Modelling

  • 1. Assignment 2: Cluster Analysis and Predictive Modelling BUS5PA - 19139507 SIDDHANTH CHAURASIYA 19139507
  • 2. PART A - SEGMENTATION BASED EXPLORATION OF CUSTOMERS
This section of the report contains the exploration and findings from the segmentation and clustering analysis conducted on the CHURN_TELECOM dataset, using SAS Miner. Three types of segmentation were carried out on the basis of distinct determinants: Demographics, Customer Status and Customer Usage.
Demographics based Profiling:
After creating the project, library and diagram, we add the data source and set the roles of all the variables to Input except for Churn Flag (Target), the customer and subscription identifiers (ID) and Subscriber name (Text). We drag the data source into the diagram and connect it with the Cluster and Segment Profile nodes. Age, Gender and Customer Value are selected as the variables for the Cluster node as well as the Segment Profile node for this profiling activity. Customer Value contains 68% missing values, so imputing those missing values with a synthetic value (mean, median, max, etc.) would create a very skewed distribution, which isn't desirable. Hence, Customer Value isn't imputed.
Figure 1: Process flow for Demographics based Profiling.
Since the measurement scales of the variables selected as the input for demographic profiling are different, we keep the method for 'Internal Standardization' as 'Standardization' in the properties panel of the Cluster node. All other properties of the Cluster and Segment Profile nodes are kept at default.
Figure 2: Cluster and Segment Profile results for Demographical segmentation.
We found a good combination of clusters with a fair number of observations in each segment (Figure 2) after setting the number of clusters to 4. The four segments could be broadly classified as:
  • 3. Cluster 1 – Valuable Young Adults.
This segment can be described as a group of males who are just about to start their professional careers and generate high customer value for the organization. Since this cluster shows a tendency towards high customer value, the company should ensure retention of this segment.
Cluster 2 – Distressed Damsel.
This cluster can be best expressed as a segment of juvenile females who account for a relatively lower Customer Value. This lower customer value may be an indicator that these customers aren't satisfied with the services offered and may churn in the future. The company should devise plans, offers and discounts to reduce the chances of churn in this cluster.
Figure 3: Results from Segment Profile node.
Cluster 3 – Stingy Seniors.
This group is characterized by senior males who generate low value for the Telecom company. As such, customers belonging to this segment may need special attention as they have a high likelihood of churning, as indicated by their low customer value generation.
Cluster 4 – Bankable ladies.
This cluster is characterized by elder women who produce high value for the company. The company should look to maximize the value derived from this segment.
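The 'Standardization' setting described for the demographic clustering can be illustrated with a minimal Python sketch. This is not the SAS Miner Cluster node itself: the k-means routine, the 0/1 coding of Gender and the eight sample customers are all hypothetical, and Customer Value is left out because of its missing values. The sketch only shows why variables on different scales (Age in years, Gender as a flag) are z-scored before distances are computed.

```python
import random
from statistics import mean, pstdev


def standardize(rows):
    """Z-score every column so that each variable has mean 0 and std 1."""
    columns = list(zip(*rows))
    means = [mean(col) for col in columns]
    spreads = [pstdev(col) or 1.0 for col in columns]  # guard constant columns
    return [[(v - m) / s for v, m, s in zip(row, means, spreads)] for row in rows]


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=50, seed=0):
    """Plain k-means: assign each point to its nearest center, then re-average."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]
    labels = []
    for _ in range(iterations):
        labels = [
            min(range(k), key=lambda j: squared_distance(p, centers[j]))
            for p in points
        ]
        updated = []
        for j in range(k):
            members = [p for p, label in zip(points, labels) if label == j]
            # Keep the old center if a cluster ends up empty.
            updated.append([mean(col) for col in zip(*members)] if members else centers[j])
        if updated == centers:
            break
        centers = updated
    return labels, centers


# Hypothetical customers: [age, gender flag]. Not taken from CHURN_TELECOM.
customers = [[22, 1], [25, 1], [24, 0], [21, 0], [63, 1], [67, 1], [65, 0], [70, 0]]
scaled = standardize(customers)
labels, centers = kmeans(scaled, k=4)
```

Without the standardization step, Age (range of roughly 50 years) would dominate the distance and the 0/1 Gender flag would barely influence the segments.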
  • 4. Figure 4: Variable significance for each cluster.
As observed from Figure 4, Gender was the most influential variable for the classification of Distressed Damsel, Stingy Seniors and Bankable ladies, while Age has the most significance for Valuable Young Adults.
Note: The variable Customer Value was only collected for customers who were identified as having a high probability of churning. Customer Value wasn't collected for customers who had a low probability of churning. As such, this leads to a distorted analysis for the clusters. However, since we don't have sufficient demographic variables, we still use Customer Value for the clustering.
Customer Status based Profiling:
To conduct Customer Status based segmentation, we opt for variables which highlight the status of the customer with reference to the services offered by the company. Email queries sent, revenue through GPRS, internet and fix-line, and days since last complaint are the variables which are selected. Through StatExplore we found that the distributions of the latter four selected variables were highly skewed, and thus we normalize them using the Transform Variables node.
Figure 5: Process flow for Customer Status based Profiling.
Setting up 4 clusters led to fairly equal segments. The four clusters could be interpreted as:
Cluster 1 – Super active
This cluster is characterized by customers who tend to converse back and forth with the company through emails quite often but haven't really had a complaint regarding the services recently. Additionally, these customers generate relatively higher revenue through internet, GPRS as well as fix-line services. As such, the customers from this segment are very important from a profitability
  • 5. point of view.
Cluster 2 – Curious
Customers from this cluster can be described as being quite curious about the new plans, as evident from their high number of email queries sent in the past 6 months. They have also lodged a complaint very recently and produce high revenue through the internet medium for the company. Thus, they have been aptly named 'Curious'. This cluster will need special attention from the organization, as it shows signs of churning.
Figure 6: Results from Segment Profile node.
Cluster 3 – Content
Customers belonging to this segment are rather satisfied with the services and have a laid-back attitude. These customers don't generally send in email queries and haven't made a complaint with the company recently. The cash inflow generated by these particular customers is identical to the overall distribution of the customers across the whole dataset.
Cluster 4 – Transitionals
'Transitionals' represents a cluster of customers who are transitioning to the modern services offered by the company. They have made a complaint fairly recently but don't generally send many emails to the organization. The revenue generated through internet by them is on the lower side but they produce high revenue through fix-lines and GPRS.
Days since last complaint was a very significant variable for the clusters 'Super Active' and 'Transitionals', while email queries were strong determinants for the clusters 'Curious' and 'Content' (Appendix - Figure 11).
  • 6. Usage based Profiling:
To conduct usage based profiling, we select variables which highlight the usage pattern: outgoing national, international, roaming and local calls, change in bill, and revenue through internet and fix-line. Since these variables were highly skewed, we used the Transform Variables node to normalize their distribution.
Figure 7: Process flow for Customer Status based Profiling.
Since we converted all the variables to log, we set the 'Standardization' to none. We set the number of clusters to 4. The results were interpreted as:
Cluster 1 – Cosmopolitan
This cluster is characterized by customers who have a high usage of outgoing international calls. The rest of their usage (national calls, local calls, roaming calls and internet) is the same as the pattern of customers across the dataset. As such, the company should offer customers from this cluster plans which are more attractive for international calling, if it wants to retain them in the long run.
Cluster 2 – Connected
Customers from this cluster tend to have a high usage of outgoing calls at national level. Their usage of other services is pretty much similar to the overall usage pattern of the customers. Churning customers from this segment can be lured back by offering them value-for-money plans for national calling.
  • 7. Figure 8: Results from Segment Profile node.
Cluster 3 – Traditionals
This cluster is described as 'Traditionals' since their usage pattern stays the same throughout, as evident from their low percentage change in bills. Their utilization of national and international calls stays on the lower side, though they use a high amount of internet.
Cluster 4 – Modern
This segment is described as 'Modern' as it is characterized by the usage of contemporary customers: fluctuating bills, low usage of calls (national, local, international and roaming) and high internet usage.
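The log transformation applied to the skewed usage variables before clustering can be sketched as follows. This is an illustration under stated assumptions, not the Transform Variables node: the revenue figures are invented, `log1p` is chosen here so that zero values stay defined, and the skewness function is the simple population moment estimate.

```python
import math
from statistics import mean, pstdev


def skewness(values):
    """Population skewness: the third standardized moment."""
    mu = mean(values)
    sigma = pstdev(values)
    return sum(((v - mu) / sigma) ** 3 for v in values) / len(values)


def log_transform(values):
    """log(1 + x) keeps zeros valid and compresses the long right tail."""
    return [math.log1p(v) for v in values]


# Hypothetical monthly internet revenue with one very heavy user.
raw = [0, 1, 2, 3, 5, 8, 13, 400]
logged = log_transform(raw)

raw_skew = skewness(raw)
logged_skew = skewness(logged)
```

On data like this the transformed values are noticeably less skewed than the raw ones, which is the reason for transforming before a distance-based method such as clustering.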
  • 8. Figure 9: Variables' influence on each cluster.
Cross-cluster analysis:
After creating the respective clusters based on Demographics, Customer Status and Usage, we conduct a cross-cluster analysis to explore whether there is any association between these segments which could potentially be harnessed into something profitable for the company. We add the Save Data node to the Cluster nodes and export the segment data from all three categories. Using the VLOOKUP function in Excel, we arrived at the following observation:
Usage \ Demographic   Valuable Young Adults   Distressed Damsel   Stingy Seniors   Bankable ladies
Cosmopolitan          28.79%                  23.51%              21.85%           22.92%
Connected             22.06%                  29.47%              23.69%           25.45%
Traditionals          25.63%                  22.29%              27.66%           35.29%
Modern                24.59%                  25.78%              25.23%           21.02%
Cross-cluster analysis: Demographic vs Usage
It was seen that 'Valuable Young Adults' and 'Cosmopolitan' shared a good association, indicating that young men tend to use international calling frequently. Similarly, it was observed that women in later stages ('Bankable ladies') had a very 'Traditional' usage, i.e. their bills rarely fluctuated and they used the calling features as much as the overall average. Lastly, 'Distressed Damsel' was closely related to 'Connected', which means they are pretty active in terms of outgoing calls nationally. These insights can be used very effectively to gain competitive advantage and improve the offerings to the respective customers. A lot of business value can be derived by correct interpretation and proper action on them.
Cross-clusters highlighted in red represent the groups which have a high chance of churning (derived using the Churn Flag variable and VLOOKUP). As such, it is imperative that the company soon offers such customers good discounts and plans depending upon their usage to retain them.
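The VLOOKUP-based join behind the cross-cluster table can be sketched in Python as a lookup by customer ID followed by a count. The customer IDs and segment assignments below are hypothetical, and the sketch assumes the percentages are computed within each demographic segment, which the report does not state explicitly.

```python
from collections import Counter, defaultdict

# Hypothetical exports from two Cluster nodes: customer ID -> segment name.
demographic = {
    "c1": "Valuable Young Adults",
    "c2": "Valuable Young Adults",
    "c3": "Bankable ladies",
    "c4": "Bankable ladies",
    "c5": "Bankable ladies",
}
usage = {
    "c1": "Cosmopolitan",
    "c2": "Connected",
    "c3": "Traditionals",
    "c4": "Traditionals",
    "c5": "Modern",
}


def cross_cluster_shares(left, right):
    """For each left segment, return the percentage split across right segments."""
    counts = defaultdict(Counter)
    for customer_id, segment in left.items():
        if customer_id in right:  # same role as a successful VLOOKUP match
            counts[segment][right[customer_id]] += 1
    return {
        segment: {other: 100.0 * n / sum(c.values()) for other, n in c.items()}
        for segment, c in counts.items()
    }


shares = cross_cluster_shares(demographic, usage)
```

Here `shares["Bankable ladies"]["Traditionals"]` is the share of that demographic segment falling into the 'Traditionals' usage segment, which corresponds to one cell of a table like the one above.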
  • 9. Cross-cluster analysis between Customer Status and Usage helped us to discover some hidden insights.
Usage \ Customer Status   Super active   Curious   Content   Transitionals
Cosmopolitan              22.37%         23.67%    26.39%    24.23%
Connected                 33.42%         27.43%    22.13%    25.16%
Traditionals              23.92%         23.54%    30.03%    21.37%
Modern                    25.41%         29.83%    24.22%    27.55%
Cross-cluster analysis: Customer Status vs Usage
'Super Active' customers tend to be involved in a lot of interaction internationally ('Cosmopolitan'), while 'Curious' customers had a very modern-like usage. Similarly, customers who have been categorized as 'Content' had a lot in common with 'Traditional' usage. As such, the company should keep these insights in mind and prepare plans to maximize profit from such groups of customers. On the other hand, customers belonging to the 'Super Active' cluster with 'Traditional' usage have a high probability of churning. Additionally, 'Transitionals' with high international and national calling usage may leave the company sooner rather than later. Thus, the company needs to dish out offers and discounts accordingly, based on the usage patterns mentioned above, to retain those customers.
PART B – EXTENDING KNOWLEDGE OF PREDICTIVE ANALYTICS
Seven reasons for Predictive Analytics
Since the turn of the millennium, and especially in the last decade or so, there has been an unprecedented generation of data, whether structured or unstructured. In fact, IBM has stated that such large volumes of data are generated every day that the amount of data doubles every two years. In this gigantic amount of data lie numerous hidden patterns and trends which, if harnessed in the right manner, could result in business value of epic proportions for the organization. Predictive Analytics is one such tool that can exploit this data to come up with meaningful and actionable insights.
Eric Siegel, an accomplished heavyweight in the field of Predictive Analytics, put forward his thoughts on Predictive Analytics in a white paper, stating precisely why the world needs to embrace Predictive Analytics. As per Eric Siegel, adoption, implementation and application of Predictive Analytics can enable an organization to achieve the following seven objectives:
- Compete: Gain competitive advantage over rivals.
- Grow: Increase sales, expand customer base and retain existing customers.
- Enforce: Detect frauds, anomalies and undesirable circumstances.
  • 10. - Improve: Enhancement and refinement of core product offerings, process automation and resource optimization.
- Satisfy: Provide tailored solutions and recommendations for customers.
- Learn: Learning from past data (structured as well as unstructured) to provide insights and foresight about the future.
- Act: Actionable recommendations and insights.
Case Study II – Predictive Analytics for Insurers
An insurance company's operating success chiefly relies on its forecasting capabilities. The primary distinguisher between the best and the rest of the insurance companies is the accuracy with which the organization can target potential customers, set the pricing of the premium and detect fraudulent claims. Many of these tasks were carried out on the basis of guesstimates in the olden days, a method which wasn't really efficient or cost-effective. Soon, key determinants like age and history became the foundation on which insurance companies forecasted their operations. Today, however, Predictive Analytics has changed the entire landscape of how insurance companies conduct their operations.
With the help of Predictive Analytics, insurance companies have not only been able to improve their core operations (e.g. credit scores, fraud detection) but also the marketing of the product (based upon buying patterns, i.e. hit ratio, retention ratio) and underwriting (filtering out customers who do not meet a given criterion, thereby saving time and money).
Relating Case Study II to seven reasons for Predictive Analytics
The application of Predictive Analytics is very prevalent in the insurance landscape, and is in fact considered industry best practice. The business value that can be derived from utilization of Predictive Analytics in the field of insurance is tremendous.
After thoroughly analysing the given Insurance case study, we could summarize how usage of Predictive Analytics by insurance firms enabled them to achieve the outcomes described by Eric Siegel as:
Compete: The insurance industry is very competitive, with companies always iterating to stay one step ahead of their rivals. Predictive Analytics can enable an organization to gather knowledge about its customers in a more holistic manner, which can create a competitive advantage for the firm. Similarly, Predictive Analytics can create credit score rating models, adverse selection models and so on, which will aid the organization in staying ahead of its rivals.
Grow: The insurance industry has witnessed snail-paced growth over the past few years. This has led to organizations exploring the options to expand to new horizons and locations. With the help of Predictive Analytics, insurance companies can predict which customers are likely to respond to offers and marketing campaigns. Similarly, through Predictive Analytics, insurers can understand the buying pattern, which can be used for marketing's hit ratio and customer retention ratios.
Enforce: One of the most significant functions for any insurance company is detection of fraudulent claims.
  • 11. With the help of scoring and ranking models, Predictive Analytics can highlight which claims are a bit suspicious and need more investigation before settlement.
Improve: Predictive Analytics can immensely aid the operating efficiency and product offering of an insurance company. Through predictive models, insurers can identify at the initial stage itself which claims are likely to be settled for high value in the future. This will allow the company to run its operations more efficiently and in a more economical manner. Additionally, Predictive Analytics can find out which customers meet the stipulated obligations for the insurance and which customers do not. This helps in saving time, money and resources of the organization.
Satisfy: To maximize the customer value, insurers need to pitch the right type of insurance (life insurance, vehicle insurance and so on) to the customer. By observing the buying patterns of the customers, predictive models can suggest the right fit of insurance individually for each customer. Similarly, predictive models can assign a risk score to each customer depending upon various determinants (age, location, history, etc.). These scores then enable the company to set appropriate premium pricing for the customers accordingly.
Learn: Predictive Analytics uses sophisticated models to find patterns and trends in the dataset. As such, usage of predictive models like linear regression, logit regression, decision trees and so on can enable insurers to find out whether any pattern exists between the variables. This information can be used for various operational activities.
Act: The insights and foresight generated by the predictive models can add great business value if they are implemented by the organization. Insurers have been proactively acting on the insights produced by Predictive Analytics. Fraud detection, customer retention, churn analysis and adverse selection are some of the models that have been created through predictive modelling and been acted upon by the insurance companies.
Comment on seven reasons for Predictive Analytics and its relation with Churn Case Study
The seven reasons for Predictive Analytics stated by Eric Siegel add definite value to a Predictive Analytics project. The steps mentioned by 'Dr. Data' are comprehensive and describe the benefits that could be derived from a predictive model at a very minute level. From the above case study about insurance, we could observe and relate a real-life application of the seven reasons for Predictive Analytics and how it proved advantageous to the industry.
The seven reasons for Predictive Analytics can also be witnessed in the Churn case study in parts. The churn analysis enables the Telecom company to gain competitive advantage ('Compete') over its rivals, as it could act upon the high-churn customers and retain them ('Grow') by offering them offers and discounts ('Act'), while competitors who don't use Predictive Analytics won't be able to retain their high-churning customers. The Decision Tree and Regression models were built using past data ('Learn'). The Decision Tree
  • 12. model was then used on the new dataset with the help of the Score node to detect which customers are on the verge of churning (‘Enforce’).
Even though the model flags customers having a high probability of churn, the case study doesn’t really follow ‘Improve’, as the model doesn’t enhance the core product offering but just indicates which customers may be unhappy with the services. Similarly, the case study doesn’t follow ‘Satisfy’, as it cannot suggest tailored solutions to individual customers but can only suggest which customers should be offered a discount to retain them.
SEMMA
SEMMA (Sample, Explore, Modify, Model and Assess) is a methodology formulated by SAS Institute to conduct data mining tasks on its software, SAS Enterprise Miner. SEMMA is concerned with the model development aspects of data mining in SAS Miner, and adhering to it ensures end-to-end coverage of the core data mining processes, which directly leads to more informed and accurate analysis.
However, because few concrete approaches to the data mining process flow exist (other than CRISP-DM), SEMMA is followed by many analysts to conduct data mining activities in general.
SEMMA stands for -
Sample: Every data mining activity should start with sampling of the dataset into training, validation and test sets, ensuring there’s enough information to carry out all these tasks.
Explore: In this stage, we investigate and explore the variables to discover information and patterns that may exist between the variables.
Modify: At this stage, we select appropriate methods to modify, transform and rectify variables that would be used in the modelling.
Model: After exploration and modification of variables, we apply the modelling technique on the selected variables.
Assess: At the last phase, we evaluate the accuracy and predictive capabilities of the models.
Relation to Churn Case Study
SAS proposed that SEMMA is the core process of conducting a data mining activity. As can be observed from Figure 10, the churn case study closely followed the SEMMA principles.
The Churn analysis commences with the Data Partition node (Sample), which enables us to create samples from the dataset and allocate sufficient data for training, validation and test. This is then followed by imputation of missing values and replacement of a variable to reduce its number of classes (Modify). To reduce redundancy, we utilize the Variable Clustering node (Explore) and then run our Decision Tree and Regression models (Model). Through Model Comparison (Assess), we compare the two predictive models and find something peculiar. To investigate it further, we use the Multiplot node (Explore) and detect abnormal variables which affected the predictive capabilities of the model.
  • 13. Figure 10: Process flow of Churn Case Study
Using Metadata (Modify), we remove these abnormal variables and re-connect the Decision Tree and Regression models (Model) to it. Then, we again use the Model Comparison node (Assess) to gauge which model outperforms the other. Finally, we use the Score node (Assess) to apply the best model to the new dataset and complete the data mining process.
Thus, it could be concluded that all the steps of SEMMA were comprehensively covered by the Churn case study.
The adherence to SEMMA in the Churn Case study can be summarized as:
Steps     Nodes
Sample    Data Partition
Explore   Multiplot, Variable Clustering
Modify    Impute, Replacement, Metadata
Model     Decision Tree, Regression
Assess    Model Comparison, Score
Relating SEMMA with Churn Case Study.
Importance of SEMMA
Even though SAS insists SEMMA is merely a set of guidelines to be followed for SAS Miner, the methodology’s application can be extended to data mining tasks as a whole. SEMMA is a very robust approach that encompasses all the chief criteria required for undertaking or building a complex predictive model. Adherence to SEMMA ensures ease of process flow, detection of faults and creation of more accurate models.
The Churn Case study benefitted hugely from following the SEMMA methodology. Through Multiplot and Variable Clustering we could ‘explore’ erroneous variables and redundant variables, and through Impute, Replacement and Metadata we could ‘modify’ such variables. Model Comparison enabled us to compare, contrast and ‘assess’ the two predictive ‘models’: Decision Tree and Regression. With the help of the Score node, we evaluated and applied the model to a new dataset.
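For readers without SAS Enterprise Miner, the SEMMA stages described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the case study's actual flow: the data is synthetic (not CHURN_TELECOM), all function names are hypothetical, a one-split decision stump stands in for the Decision Tree node and a single-variable logistic regression stands in for the Regression node.

```python
import math
import random
from statistics import median


def make_customers(n, seed=0):
    """Synthetic stand-in data: (monthly usage or None, churned 0/1)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        usage = rng.uniform(0, 100)
        churned = 1 if usage + rng.gauss(0, 15) < 40 else 0
        rows.append((None if rng.random() < 0.1 else usage, churned))
    return rows


def partition(rows, seed=0, train=0.6, valid=0.2):
    """Sample: shuffle and split into training, validation and test sets."""
    rows = rows[:]
    random.Random(seed).shuffle(rows)
    a = int(len(rows) * train)
    b = a + int(len(rows) * valid)
    return rows[:a], rows[a:b], rows[b:]


def count_missing(rows):
    """Explore: count records whose input value is missing."""
    return sum(1 for x, _ in rows if x is None)


def impute(rows, fill):
    """Modify: replace missing inputs with a fill value."""
    return [(fill if x is None else x, y) for x, y in rows]


def fit_stump(rows):
    """Model 1: one-split tree that predicts churn when usage < threshold."""
    best_t, best_err = None, float("inf")
    for t in sorted({x for x, _ in rows}):
        err = sum((1 if x < t else 0) != y for x, y in rows)
        if err < best_err:
            best_t, best_err = t, err
    return lambda x: 1 if x < best_t else 0


def fit_logistic(rows, steps=500, lr=1.0):
    """Model 2: single-variable logistic regression via gradient descent."""
    w = b = 0.0
    for _ in range(steps):
        gw = gb = 0.0
        for x, y in rows:
            z = x / 100.0  # rescale usage to roughly [0, 1]
            p = 1 / (1 + math.exp(-(w * z + b)))
            gw += (p - y) * z
            gb += p - y
        w -= lr * gw / len(rows)
        b -= lr * gb / len(rows)
    return lambda x: 1 if w * x / 100.0 + b >= 0 else 0


def error_rate(model, rows):
    """Assess: share of misclassified records."""
    return sum(model(x) != y for x, y in rows) / len(rows)


train, valid, test = partition(make_customers(500))               # Sample
missing_in_train = count_missing(train)                           # Explore
fill = median(x for x, _ in train if x is not None)               # Modify
train, valid, test = (impute(part, fill) for part in (train, valid, test))
models = {"stump": fit_stump(train), "logistic": fit_logistic(train)}  # Model
valid_errors = {name: error_rate(m, valid) for name, m in models.items()}
best_name = min(valid_errors, key=valid_errors.get)               # Assess
scores = [models[best_name](x) for x in (12.0, 85.0)]             # Score new data
print(missing_in_train, valid_errors, best_name, scores)
```

The fill value is computed on the training set only and then reused for the validation and test sets, so that nothing from the held-out data leaks into the models being compared.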
  • 14. Appendix
Figure 11: Most significant variables for each of the four clusters.