5. Explanation of selection of variables:
For V16 to V20, after running the analysis in Tableau, I saw no relation between them and the target.
After running the remaining variables, the results showed that many of the coefficients were not significant, so
I kept V1, V12, V14 and V9 in the end, and they are all significant.
6. Base Model
Before running the base model, I added data partition and filter steps to split the data and take out the
extreme values, to help reach a better model result.
(1) The result shows an AIC of 212959.99 and a misclassification rate of 0.04 with 6 parameter estimates. In
fact, the base model has the lowest AIC among the models shown on the following pages.
(2) V14 B and V14 V represent Broker and Correspondent. The coefficient of Broker is 0.0434, meaning that if
the mortgage is sold by a broker, the possibility of it being risky increases by 0.0434. The coefficient of
Correspondent is -0.0346, meaning that if the mortgage is sold by a correspondent, the possibility of it being
risky decreases by 0.0346.
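The report does not state the model type. If the base model is a logistic regression with a logit link (an assumption), each coefficient is a change in log-odds rather than in probability, and exponentiating it gives an odds ratio. A minimal sketch of that conversion, using the two coefficients reported above:

```python
import math

# Coefficients reported for the base model in the text above.
coefficients = {"Broker": 0.0434, "Correspondent": -0.0346}


def odds_ratio(beta: float) -> float:
    """Odds ratio implied by a coefficient, ASSUMING a logit-link model."""
    return math.exp(beta)


odds_ratios = {name: odds_ratio(beta) for name, beta in coefficients.items()}

for name, ratio in odds_ratios.items():
    direction = "higher" if ratio > 1 else "lower"
    print(f"{name}: odds ratio {ratio:.4f} ({abs(ratio - 1):.1%} {direction} odds)")
```

Under that assumption, a ratio above 1 means the channel raises the odds of a risky mortgage relative to the reference level, and a ratio below 1 means it lowers them.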
7. Principal Component model
(1) I tried different percentages of variation for the principal components, from 0.8 to 0.9. The
higher the percentage, the more accurate the model could be; however, the model may then have more principal
components, making it harder to explain each one, and vice versa. As a result, I
decided to use 0.85, since it led to the lowest AIC, the same misclassification rate and the fewest parameter
estimates.
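The trade-off between the variation threshold and the number of components can be sketched as follows. This is only an illustration: the data are synthetic, the function name `components_needed` is hypothetical, and the components are taken from the covariance matrix, which may differ from the settings used in the software.

```python
import numpy as np


def components_needed(X: np.ndarray, threshold: float) -> int:
    """Smallest number of principal components whose cumulative share
    of the total variance reaches `threshold`."""
    # Eigenvalues of the covariance matrix are the component variances;
    # eigvalsh returns them in ascending order, so reverse them.
    eigenvalues = np.linalg.eigvalsh(np.cov(X, rowvar=False))[::-1]
    cumulative = np.cumsum(eigenvalues) / eigenvalues.sum()
    # Index of the first cumulative share >= threshold, plus one.
    return int(np.searchsorted(cumulative, threshold) + 1)


# Synthetic, correlated data standing in for the real inputs (assumption).
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 8)) @ rng.normal(size=(8, 8))

counts = {t: components_needed(X, t) for t in (0.80, 0.85, 0.90)}
print(counts)
```

A higher threshold can never need fewer components, which is why raising it tends to add parameters to the downstream model.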
(2) All of the parameters are significant. The AIC of the model is 213075, slightly higher than the base
model’s. The misclassification rate is 0.04. The number of parameter estimates is 5.
(3) The first PC explains the largest share of the variation of the variables. After examining the table of each
variable’s influence below, PC_1 could be labelled as V12, PC_2 as V14_3, PC_3 as V14_1, PC_4 as V1,
PC_5 as V9 and PC_6 as V14_2. The value of PC_1 equals the sum of each variable’s reading
multiplied by its influence. After labeling, we can say that for a one-unit increase in V12 (original loan-to-
value), the possibility of being a risky mortgage increases by 7.7462.
(4) Yes, since we can explain each PC and use fewer parameter estimates, it could be a better model.
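The two ideas in (3), computing a PC as a weighted sum and labelling it by its most influential variable, can be sketched like this. The loadings and the observation below are made-up numbers for illustration, not the values from the eigenvector table.

```python
import numpy as np

variables = ["V1", "V9", "V12"]  # hypothetical subset of the inputs

# Hypothetical eigenvector (influence) table: one row per PC,
# one column per variable.
loadings = np.array([
    [0.10, 0.20, 0.97],   # PC_1
    [0.95, -0.30, 0.05],  # PC_2
])

# One hypothetical standardized observation (V1, V9, V12).
x = np.array([0.5, -1.0, 2.0])

# Each PC value is the sum of the variable readings times their influence.
scores = loadings @ x

# Label each PC with the variable that has the largest absolute influence.
labels = [variables[i] for i in np.abs(loadings).argmax(axis=1)]

for k, (score, label) in enumerate(zip(scores, labels), start=1):
    print(f"PC_{k}: score {score:.3f}, labelled as {label}")
```

Labelling a PC by a single variable is a simplification: it works well when one influence dominates the row and less well when several variables carry similar weight.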
8.
(1) For clustering, I left the minimum number of clusters at the default of 2, since the software can
recommend the number itself. As a result, 8 clusters showed up with the centroid method.
(2) The model result showed that not all of the parameters are significant. The AIC is 473548 and the
misclassification rate is 0.07. The number of parameter estimates is 9.
(3) The meaning of a cluster shows how that cluster differs from the others. In this case, cluster 1 has
the highest average value of v14_R, which is 1. So, we can name the first cluster v14_R, even though
the next highest is 0.989. In terms of the meaning of the coefficient, if the mortgage is sold by retail, the
possibility of being a risky mortgage decreases by 8.7379.
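One simple way to name clusters from their averages is sketched below. The column names, data and cluster assignments are all hypothetical, and the rule (take the variable with the highest mean inside each cluster) is an assumption about how the naming could be automated, not the software's own procedure.

```python
import numpy as np

features = ["v14_B", "v14_other", "v14_R"]  # hypothetical dummy columns

# Hypothetical observations (rows) and their cluster assignments.
X = np.array([
    [0, 0, 1],
    [0, 0, 1],
    [1, 0, 0],
    [1, 0, 0],
    [0, 1, 0],
], dtype=float)
cluster_ids = np.array([1, 1, 2, 2, 2])


def name_clusters(X, cluster_ids, features):
    """Name each cluster after the variable with the highest mean in it."""
    names = {}
    for cid in np.unique(cluster_ids):
        means = X[cluster_ids == cid].mean(axis=0)
        names[int(cid)] = features[int(means.argmax())]
    return names


cluster_names = name_clusters(X, cluster_ids, features)
print(cluster_names)
```

When two clusters have nearly the same average for a variable, as with 1 and 0.989 above, the name alone does not distinguish them, so it helps to look at the second-highest variable as well.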
9. In the end, even though the software selected the base model as the best, I decided to use the model
with the principal components at 85% of variation. The reason is that, with the same misclassification
rate and a similar AIC, the number of parameter estimates for PC 85% is the lowest, at 5. Also, from the
eigenvector table, we can label each PC with a variable, making it easier to understand.
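The selection reasoning can be written down as an explicit rule using the figures reported in the sections above. The 0.1% tolerance for what counts as a "similar" AIC is a hypothetical choice made only for this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelResult:
    name: str
    aic: float
    misclassification: float
    n_params: int


# Figures reported in the sections above.
results = [
    ModelResult("Base", 212959.99, 0.04, 6),
    ModelResult("PC 85%", 213075.0, 0.04, 5),
    ModelResult("Cluster", 473548.0, 0.07, 9),
]

AIC_TOLERANCE = 0.001  # hypothetical: within 0.1% of the best AIC is "similar"


def choose(results, tol=AIC_TOLERANCE):
    """Lowest misclassification rate, then similar AIC, then fewest parameters."""
    best_rate = min(r.misclassification for r in results)
    candidates = [r for r in results if r.misclassification == best_rate]
    best_aic = min(r.aic for r in candidates)
    similar = [r for r in candidates if (r.aic - best_aic) / best_aic <= tol]
    return min(similar, key=lambda r: r.n_params)


chosen = choose(results)
lowest_aic = min(results, key=lambda r: r.aic)
print(f"Lowest AIC: {lowest_aic.name}; chosen by the rule: {chosen.name}")
```

With a tolerance of zero the rule falls back to the lowest-AIC model, so the outcome depends on how much AIC difference is treated as negligible.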
10.
1 The SAS System Thursday, February 14, 2019 05:37:00 PM
WARNING: Your system is scheduled to expire on March 31, 2019, which is 45 days from now. The SAS
System will no longer function on
or after that date. Please contact your SAS Installation Representative to obtain your updated SAS
Installation Data (SID)
file, which includes SETINIT information.
To locate the name of your SAS Installation Representative go to http://support.sas.com/repfinder and
provide your site number
70094220 and company name as SAS ONDEMAND FOR ACADEMICS. On the SAS REP list provided, locate
the REP for operating system LIN X64.
You are running SAS 9. Some SAS 8 files will be automatically converted
by the V9 engine; others are incompatible. Please see
http://support.sas.com/rnd/migration/planning/platform/64bit.html
PROC MIGRATE will preserve current SAS file attributes and is
recommended for converting all your SAS libraries from any
SAS 8 release to SAS 9. For details and examples, please see
http://support.sas.com/rnd/migration/index.html
This message is contained in the SAS news file, and is presented upon
initialization. Edit the file "news" in the "misc/base" directory to
display site-specific news and information in the program log.
The command line option "-nonews" will prevent this display.
1 filename _emenv catalog 'sashelp.emwip.em_loadmacros.source';
2 %inc _emenv;
1088 filename _emenv;
1089 %let WIP_PROJPATH=%nrstr(~);
1090 %let WIP_PROJNAME=%nrstr(ACC637);
1091 proc display c=sashelp.emwip.em_init.scl; run;
7618 %let SYSCC=0;
7619 options VBUFSIZE=64M;
7621 %let SYSCC=0;
7622 %let SYSRC=0;
7623 %let EMEXCEPTIONSTRING=;
7624 %let SYSMSG=;
7625 %em_diagram(action=open,projpath=%nrstr(~),projname=%nrstr(ACC637),dgmId=EMWS2,
userId=caoruidpu,sessionid=3aedbb88-3b33-4e67-87a9-34bba363d59a,
outfile=DiagramOpenResponse.xml);
WIP_ACTION:
DGMID: EMWS2
LOCKFILE: ~/ACC637/Workspaces/EMWS2/System/wsopen.lck
7637 %let _EM_TREECONVERSION=0;
7638 data _null_;
7639 set EMWS2.EM_NODEID end=eof;
7640 where upcase(Component)='DECISIONTREE' and CLASS='SASHELP.EMMODL.DECISIONTREE.CLASS';
7641 if eof then call symput('_EM_TREECONVERSION','1');
7642 run;
treeconversion=0
7643 %let syscc=0;
7644 filename _wipchk catalog "EMWS2.Assoc.test.source";
7645 data _null_;
7646 file _wipchk;
7647 put '/* Test */';
7648 run;
7649 data _null_;
7650 rc = fdelete('_wipchk');
7651 run;
7652 filename _wipchk;
7653 filename _wipxml '/saswork/SAS_work687C000007DF_odaws01-prod-us/SAS_workDCD4000007DF_odaws01-prod-us/DiagramOpenResponse.xml' encoding="UTF-8" NOBOM;
WARNING: End of file.
WARNING: End of file.
7654 %let SYSCC=0;
7655 %let SYSRC=0;
7656 %let EMEXCEPTIONSTRING=;
7657 %let SYSMSG=;
7658 %em_diagram(action=SETORIENTATION,projpath=%nrstr(~),projname=%nrstr(ACC637),
dgmId=EMWS2, sessionid=3aedbb88-3b33-4e67-87a9-34bba363d59a, orientation=HORIZONTAL);
WIP_ACTION: OPEN
DGMID: EMWS2