2. Executive Summary Data mining projects, particularly on HR issues, tend to be seen as expensive and high risk due to uncertainty of benefits or return, and only viable if supported by a high-level HRMS or “data warehouse” and applied to very large organizations. The following case showed that a lot can be done with none of those requirements, with immediate benefits, although it requires a good knowledge of the subject (organizational behaviour & HR management) and some knowledge of the technology behind data mining and analytics in general. The Case: A minimal data collection, on an organization of approx. 1500 employees, was enough to identify some of the key factors behind employee turnover, exposing specific areas for immediate intervention and allowing an estimated reduction of approx. 50% of the occurring undesired turnover. Increased investment and time can obviously improve results. Data mining on Employee turnover data - Paulo Xavier - August 2009
3. Data collected and analysis procedure Only exits during one specific year (with a particularly high number of exits) were considered. This was a year with high labour market activity. Apart from information of general interest (age, seniority, gender, etc.), it was mostly data concerning the employees' compensation and performance that was included. The final data to be treated included 1580 records and 34 variables, of which only half proved to be relevant. Several algorithms were tried, but the simplest approach proved the most reliable and produced the most interesting results. Training of the model was done on a sample of 70% of the data and testing on the remaining 30%. Over several initial tests, those variables that proved irrelevant were gradually excluded. The final version, whose results are presented, is not the one that proved most accurate in predicting turnover; however, it is the one that includes variables of relevance for future actions/decisions, and for that reason it was considered the final one. In this case, more than accuracy in prediction, the capacity to turn conclusions into actionable decisions was considered relevant. All final conclusions are translatable into specific actions/decisions that influence turnover. The impact is estimated as being able to reduce turnover by approx. 50% versus the previous situation.
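The 70%/30% training and testing split described on this slide can be sketched as follows. This is only an illustration under stated assumptions: the slides do not say which tool or sampling method was used, so the function name `train_test_split`, the fixed random seed, and the integer stand-ins for employee records are all hypothetical.

```python
import random


def train_test_split(records, train_fraction=0.7, seed=0):
    """Shuffle a copy of `records` and split it into (train, test) lists."""
    shuffled = list(records)
    # A fixed seed makes the split reproducible (assumption, not from the slides).
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Integer stand-ins for the 1580 employee records mentioned on the slide.
records = list(range(1580))
train, test = train_test_split(records)
print(len(train), len(test))
```

With 1580 records this gives 1106 records for training and 474 for testing. The rule likelihoods reported on the following slides are measured on the test portion only.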
4. Decision Tree and rules The following decision tree and the next slide with rules are the final result of this exercise. They identify areas where compensation adjustments are necessary to ensure the reduction of turnover. Non-reward issues are obviously also relevant to explaining turnover, but in this case a very limited set of variables was enough to obtain tangible and useful results. Minimum exit probability = 0; maximum = 1. Under each rule number is the likelihood of exit. Ex.: Rule 127, likelihood of exit = 0.8148 (for 27 observations in the test sample).
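The likelihood shown under each rule can be read as the share of test-sample observations covered by that rule that ended in an exit. A minimal sketch, assuming that reading: the slides give only the likelihood (0.8148) and the number of observations (27), so the count of 22 exits below is inferred from the arithmetic (22/27 ≈ 0.8148), and the names `exit_likelihood` and `rule_127` are hypothetical.

```python
def exit_likelihood(outcomes):
    """Fraction of observations under a rule that exited (1 = exit, 0 = stayed)."""
    if not outcomes:
        raise ValueError("rule covers no observations")
    return sum(outcomes) / len(outcomes)


# Assumed breakdown for the example rule: 22 exits among 27 test observations.
rule_127 = [1] * 22 + [0] * 5
print(round(exit_likelihood(rule_127), 4))  # 0.8148
```

A value near 1 marks a group where almost everyone covered by the rule left, which is where a compensation adjustment would be expected to matter most.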