Rui Cao
Objective:
Use SAS Enterprise Miner to identify potential relationships between mortgage defaults and other
mortgage transaction information. By adding principal components and cluster functions, we can
reach a better model that can predict future mortgage risk.
Dataset:
loandataHW_cleaned.csv
variable definition loandataHW.docx
4. Screenshot of the diagram
5. Explanation of selection of variables:
For V16 to V20, after running an analysis in Tableau, I saw no relation between them and the target.
After running the remaining variables, the results showed that many of the coefficients were not significant, so
I kept v1, v12, v14 and v9 in the end, and they are all significant.
6. Base Model
Before running the base model, I added data partition and filter nodes to split the data and take out the
extreme values, to help reach a better model result.
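The two preparation steps can be sketched outside SAS Enterprise Miner as follows. This is only an illustration under assumptions: the report does not state the filter rule or the split proportions, so the three-standard-deviation cutoff, the 60% training fraction, the function names and the sample values below are all hypothetical.

```python
import random
import statistics


def filter_extremes(rows, key, n_std=3.0):
    """Drop rows whose `key` value lies more than n_std standard deviations from the mean.

    The cutoff of 3 standard deviations is an assumption, not taken from the report.
    """
    values = [row[key] for row in rows]
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0:
        return list(rows)
    return [row for row in rows if abs(row[key] - mean) <= n_std * std]


def partition(rows, train_fraction=0.6, seed=42):
    """Shuffle the rows reproducibly and split them into training and validation sets."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Hypothetical loan-to-value readings: 20 ordinary values and one extreme value.
loans = [{"v12": float(60 + 2 * i)} for i in range(20)] + [{"v12": 900.0}]

kept = filter_extremes(loans, "v12")
train, valid = partition(kept)
print(len(loans), len(kept), len(train), len(valid))
```

Filtering before partitioning, as here, means the extreme values affect neither the training nor the validation set.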
(1) The result shows an AIC of 212959.99 and a misclassification rate of 0.04 with 6 parameter estimates. In
fact, the base model has the lowest AIC among all the models shown on the following pages.
(2) V14 B and V14 V represent Broker and Correspondent. The coefficient of Broker is 0.0434, meaning that if
the mortgage is sold by a broker, the possibility of being risky increases by 0.0434. The coefficient of
Correspondent is -0.0346, meaning that if the mortgage is sold by a correspondent, the possibility of being
risky decreases by 0.0346.
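If the base model is a logistic regression (an assumption; the report does not name the link function), each coefficient is a shift in log-odds rather than a direct change in probability. The sketch below converts the two reported coefficients to odds ratios and shows the effect on a hypothetical baseline probability of 0.04, which is chosen only for illustration.

```python
import math


def odds_ratio(coefficient):
    """Multiplicative change in the odds for a one-unit change in the predictor."""
    return math.exp(coefficient)


def probability_after_shift(base_probability, coefficient):
    """Apply a log-odds shift to a baseline probability and convert back."""
    log_odds = math.log(base_probability / (1 - base_probability)) + coefficient
    return 1 / (1 + math.exp(-log_odds))


broker = 0.0434          # coefficient reported for Broker
correspondent = -0.0346  # coefficient reported for Correspondent
baseline = 0.04          # hypothetical baseline probability of a risky mortgage

print(round(odds_ratio(broker), 4), round(odds_ratio(correspondent), 4))
print(probability_after_shift(baseline, broker))
print(probability_after_shift(baseline, correspondent))
```

Under this reading, a broker-sold mortgage has odds of being risky about 4% higher than the reference channel, and a correspondent-sold mortgage about 3% lower.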
7. Principal Component model
(1) I tried different percentages of variation for the principal components, from 0.8 to 0.9. The
higher the percentage, the more accurate the model could be; however, the more principal components
the model may have, making it harder to explain each principal component, and vice versa. As a result, I
decided to use 0.85 since it led to the lowest AIC, the same misclassification rate and the fewest parameter
estimates.
(2) All of the parameters are significant. The AIC of the model is 213075, slightly higher than the base
model's. The misclassification rate is 0.04. The number of parameter estimates is 5.
(3) The first PC explains the largest share of the variation in the variables. After examining the table of each
variable's influence below, PC_1 could be labelled as V12, PC_2 as V14_3, PC_3 as V14_1, PC_4 as V1,
PC_5 as V9 and PC_6 as V14_2. The coefficient of PC_1 equals the sum of each variable's values
multiplied by their influence. After labeling, we can say that with a one-unit increase in v12 (original loan-to-
value), the possibility of being a risky mortgage increases by 7.7462.
(4) Yes, since we can explain each PC and use fewer parameter estimates, it could be a better model.
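The three ideas in this section can be sketched in a few lines: choosing how many components reach a cumulative variance threshold, labelling a component by its most influential variable, and computing a component's score as a weighted sum. The eigenvalues, loadings and observation below are hypothetical, not the values from the eigenvector table, and the function names are made up for this sketch.

```python
def components_for_threshold(eigenvalues, threshold):
    """Smallest number of components whose cumulative variance share reaches the threshold."""
    total = sum(eigenvalues)
    cumulative = 0.0
    for count, value in enumerate(sorted(eigenvalues, reverse=True), start=1):
        cumulative += value / total
        if cumulative >= threshold - 1e-12:  # small slack for rounding error
            return count
    return len(eigenvalues)


def dominant_variable(loadings):
    """Name of the variable with the largest absolute loading on a component."""
    return max(loadings, key=lambda name: abs(loadings[name]))


def component_score(loadings, standardized_row):
    """Score of one observation: sum of each standardized value times its loading."""
    return sum(loadings[name] * standardized_row[name] for name in loadings)


# Hypothetical eigenvalues for six components.
eigenvalues = [2.0, 1.5, 1.0, 0.8, 0.4, 0.3]
for threshold in (0.80, 0.85, 0.90):
    print(threshold, components_for_threshold(eigenvalues, threshold))

# Hypothetical loadings of the first component and one standardized observation.
pc1_loadings = {"v1": 0.10, "v9": -0.05, "v12": 0.95, "v14_1": 0.20}
observation = {"v1": 1.0, "v9": 2.0, "v12": 0.5, "v14_1": 0.0}
print(dominant_variable(pc1_loadings), component_score(pc1_loadings, observation))
```

A higher threshold can only keep the same number of components or add more, which is the trade-off between accuracy and interpretability described in (1).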
8.
(1) For clustering, I left the minimum number of clusters at the default of 2, since the software can
recommend a number itself. As a result, 8 clusters showed up with the centroid method.
(2) The model result showed that not all the parameters are significant. The AIC is 473548 and the
misclassification rate is 0.07. The number of parameter estimates is 9.
(3) The meaning of a cluster shows how that cluster differs from the others. In this case, cluster 1 has
the highest average value of v14_R, which is 1. So we can name the first cluster v14_R, even though
the next highest is 0.989. In terms of the meaning of the coefficient, if the mortgage is sold by retail, the
possibility of being a risky mortgage decreases by 8.7379.
(4) Yes, even though I found the highest average value for cluster 1, it is not significantly higher than
in the other clusters, so this may not be very useful.
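The naming rule used in (3) and the caveat in (4) can be checked with a small calculation: take the mean of an indicator variable within each cluster, then compare the highest mean with the runner-up. The records below are hypothetical; only the idea of comparing cluster means of v14_R comes from the report.

```python
from collections import defaultdict


def cluster_means(records, variable):
    """Average of `variable` within each cluster."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for record in records:
        sums[record["cluster"]] += record[variable]
        counts[record["cluster"]] += 1
    return {cluster: sums[cluster] / counts[cluster] for cluster in sums}


def top_and_gap(means):
    """Cluster with the highest mean, and its margin over the runner-up."""
    ranked = sorted(means.items(), key=lambda item: item[1], reverse=True)
    (best, best_mean), (_, runner_up_mean) = ranked[0], ranked[1]
    return best, best_mean - runner_up_mean


# Hypothetical records: cluster id and a 0/1 indicator for the retail channel.
records = (
    [{"cluster": 1, "v14_R": v} for v in (1, 1, 1, 1)]
    + [{"cluster": 2, "v14_R": v} for v in (1, 1, 1, 0)]
    + [{"cluster": 3, "v14_R": v} for v in (0, 0, 1, 0)]
)

means = cluster_means(records, "v14_R")
best_cluster, gap = top_and_gap(means)
print(means, best_cluster, gap)
```

A small gap, such as the 1 versus 0.989 reported above, suggests the variable does not really distinguish the top cluster from the next one.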
9. In the end, even though the software selected the base model as the best, I decided to use the
principal component model with 85% of the variation. The reason is that, with the same misclassification
rate and a similar AIC, the number of parameter estimates for PC 85% is the lowest, at 5. Also, from the
eigenvector table, we can label each PC with a variable, making it easier to understand.
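The selection reasoning can be written as an explicit rule using the figures reported for the three models. The rule itself (keep the models with the lowest misclassification rate, treat AIC values within a small relative tolerance as equivalent, then prefer the fewest parameters) and the 0.1% tolerance are assumptions made for this sketch; the software's own criterion evidently differs, since it chose the base model.

```python
models = [
    {"name": "base", "aic": 212959.99, "misclassification": 0.04, "parameters": 6},
    {"name": "pc_85", "aic": 213075.0, "misclassification": 0.04, "parameters": 5},
    {"name": "cluster", "aic": 473548.0, "misclassification": 0.07, "parameters": 9},
]


def pick_model(candidates, aic_tolerance=0.001):
    """Pick the simplest model among those with the best error rate and a near-best AIC.

    aic_tolerance is a relative margin (0.001 = 0.1%) and is a hypothetical choice.
    """
    best_rate = min(m["misclassification"] for m in candidates)
    accurate = [m for m in candidates if m["misclassification"] == best_rate]
    best_aic = min(m["aic"] for m in accurate)
    close = [m for m in accurate if (m["aic"] - best_aic) / best_aic <= aic_tolerance]
    return min(close, key=lambda m: (m["parameters"], m["aic"]))


print(pick_model(models)["name"])                     # tolerant rule
print(pick_model(models, aic_tolerance=0.0)["name"])  # strict lowest-AIC rule
```

With a zero tolerance the rule reduces to picking the lowest AIC, which reproduces the software's choice of the base model.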
10.
1 The SAS System Thursday, February 14, 2019 05:37:00 PM
WARNING: Your system is scheduled to expire on March 31, 2019, which is 45 days from now. The SAS
System will no longer function on
or after that date. Please contact your SAS Installation Representative to obtain your updated SAS
Installation Data (SID)
file, which includes SETINIT information.
To locate the name of your SAS Installation Representative go to http://support.sas.com/repfinder and
provide your site number
70094220 and company name as SAS ONDEMAND FOR ACADEMICS. On the SAS REP list provided, locate
the REP for operating system LIN X64.
You are running SAS 9. Some SAS 8 files will be automatically converted
by the V9 engine; others are incompatible. Please see
http://support.sas.com/rnd/migration/planning/platform/64bit.html
PROC MIGRATE will preserve current SAS file attributes and is
recommended for converting all your SAS libraries from any
SAS 8 release to SAS 9. For details and examples, please see
http://support.sas.com/rnd/migration/index.html
This message is contained in the SAS news file, and is presented upon
initialization. Edit the file "news" in the "misc/base" directory to
display site-specific news and information in the program log.
The command line option "-nonews" will prevent this display.
1 filename _emenv catalog 'sashelp.emwip.em_loadmacros.source';
2 %inc _emenv;
1088 filename _emenv;
1089 %let WIP_PROJPATH=%nrstr(~);
1090 %let WIP_PROJNAME=%nrstr(ACC637);
1091 proc display c=sashelp.emwip.em_init.scl; run;
7618 %let SYSCC=0;
7619 options VBUFSIZE=64M;
7621 %let SYSCC=0;
7622 %let SYSRC=0;
7623 %let EMEXCEPTIONSTRING=;
7624 %let SYSMSG=;
7625 %em_diagram(action=open,projpath=%nrstr(~),projname=%nrstr(ACC637),dgmId=EMWS2,
userId=caoruidpu,sessionid=3aedbb88-3b33-4e67-87a9-34bba363d59a,
outfile=DiagramOpenResponse.xml);
WIP_ACTION:
DGMID: EMWS2
LOCKFILE: ~/ACC637/Workspaces/EMWS2/System/wsopen.lck
7637 %let _EM_TREECONVERSION=0;
7638 data _null_;
7639 set EMWS2.EM_NODEID end=eof;
7640 where upcase(Component)='DECISIONTREE' and CLASS=
'SASHELP.EMMODL.DECISIONTREE.CLASS';
7641 if eof then call symput('_EM_TREECONVERSION','1');
7642 run;
treeconversion=0
7643 %let syscc=0;
7644 filename _wipchk catalog "EMWS2.Assoc.test.source";
7645 data _null_;
7646 file _wipchk;
7647 put '/* Test */';
7648 run;
7649 data _null_;
7650 rc = fdelete('_wipchk');
7651 run;
7652 filename _wipchk;
7653 filename _wipxml '/saswork/SAS_work687C000007DF_odaws01-prod-us/SAS_workDCD4000007DF_odaws01-prod-us/DiagramOpenResponse.xml' encoding="UTF-8"
NOBOM;
WARNING: End of file.
WARNING: End of file.
7654 %let SYSCC=0;
7655 %let SYSRC=0;
7656 %let EMEXCEPTIONSTRING=;
7657 %let SYSMSG=;
7658 %em_diagram(action=SETORIENTATION,projpath=%nrstr(~),projname=%nrstr(ACC637),
dgmId=EMWS2, sessionid=3aedbb88-3b33-4e67-87a9-34bba363d59a, orientation=HORIZONTAL);
WIP_ACTION: OPEN
DGMID: EMWS2
LOCKFILE: ~/ACC637/Workspaces/EMWS2/System/wsopen.lck
WARNING: End of file.
WARNING: End of file.

Bank of pecunia mortgage risk model
