
2011-07-09 Data Exploration


  1. 1. POPONG meeting (2011-07-09)
     Authors: Eunjeong Park (박은정), Juseong Park (박주성)

     1. Raw data
        A. National Assembly Bill Information System [1]
           - Processed bills among the bills of the 18th National Assembly

     2. Data preprocessing (Juseong Park)
        A. Build a binary matrix of the bills each member took part in
           (bills with the same name and the same set of sponsors are treated as a single record)

                      Bill 1   Bill 2   Bill 3   …   Bill m
           Member 1     1        0        0      …     1
           Member 2     1        0        1      …     0
           …            …        …        …      …     …
           Member n     0        1        0      …     1

        B. Case classification and deduplication: six cases were analyzed

           Case                              Members (n)   Bills (m)
           All (D1)                              329          3262
           Accepted (D3)                         329           505
           Rejected (D5)                         329          2757
           All + noise removal [2] (D2)          329          2856
           Accepted + noise removal (D4)         329           472
           Rejected + noise removal (D6)         329          2384

           (See the Spotfire file 'BillMemberAnalysis_0709.dxp'.)

     [1] http://likms.assembly.go.kr/bill/jsp/main.jsp
     [2] Bills with 150 or more co-sponsors are excluded.
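  2. 2. 3. Data Analysis (Eunjeong Park)
        A. Similarity calculation: three set similarity measures are used
        B. Similarity Based Hierarchical Clustering
           - Compute the correlation coefficient between members and derive their dissimilarity from it:
             Dissimilarity = 1 - Abs(Similarity)
           - Apply a hierarchical clustering algorithm to the dissimilarities computed above [3]

           ※ Hierarchical clustering algorithm
             • Based on the pairwise dissimilarities, merge the two closest items.
             • Treat the merged pair as a single item and repeat the previous step.
             • Repeat the two steps above until everything is merged into one cluster.

     [3] Single linkage (minimum distance) is used.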
  3. 3. [Figures: Jaccard Similarity based Hierarchical Clustering Dendrogram (D1), axis 0 to 1; Dendrogram (D2), axis 0 to 0.9]
  4. 4. [Figures: Jaccard Similarity based Hierarchical Clustering Dendrogram (D3), axis 0 to 0.8; Dendrogram (D4), axis 0 to 0.9]
  5. 5. [Figures: Jaccard Similarity based Hierarchical Clustering Dendrogram (D5), axis 0 to 0.9; Dendrogram (D6), axis 0 to 0.9]
  6. 6. [Figures: Jaccard Similarity based Hierarchical Clustering Dendrogram (D7), axis 0 to 0.9; Dendrogram (D8), axis 0 to 0.9]
  7. 7. [Figures: Jaccard Similarity based Hierarchical Clustering Dendrogram (D9), axis 0 to 0.8; Dendrogram (D10), axis 0 to 0.8]
  8. 8. [Figures: Jaccard Similarity based Hierarchical Clustering Dendrogram (D11), axis 0 to 0.5; Dendrogram (D12), axis 0 to 0.16]
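
The preprocessing described on slide 1 (deduplicating bills, dropping bills with very many co-sponsors, and building the member-by-bill binary matrix) could be sketched roughly as follows. This is only an illustration under assumptions: the input format (a list of bill-name and sponsor-set pairs), the function names, and the toy data are hypothetical and do not come from the slides; only the 150 co-sponsor threshold is taken from the footnote.

```python
from typing import FrozenSet, Iterable, List, Tuple

# Hypothetical record format: (bill name, set of sponsoring member ids).
Bill = Tuple[str, FrozenSet[str]]


def deduplicate(bills: Iterable[Tuple[str, Iterable[str]]]) -> List[Bill]:
    """Treat bills with the same name and the same sponsor set as one record."""
    seen = set()
    unique: List[Bill] = []
    for name, sponsors in bills:
        key = (name, frozenset(sponsors))
        if key not in seen:
            seen.add(key)
            unique.append(key)
    return unique


def remove_noise(bills: List[Bill], max_sponsors: int = 150) -> List[Bill]:
    """Drop bills with max_sponsors or more co-sponsors (150 per the footnote)."""
    return [bill for bill in bills if len(bill[1]) < max_sponsors]


def build_matrix(bills: List[Bill], members: List[str]) -> List[List[int]]:
    """Rows are members, columns are bills; 1 means the member took part."""
    return [[1 if m in sponsors else 0 for _, sponsors in bills] for m in members]


# Toy data (made up for illustration only).
raw = [
    ("Bill A", {"m1", "m2"}),
    ("Bill A", {"m1", "m2"}),        # same name, same sponsors: duplicate
    ("Bill A", {"m1", "m3"}),        # same name, different sponsors: kept
    ("Bill B", {"m1", "m2", "m3"}),
]
records = deduplicate(raw)
# A threshold of 3 is used here only so the toy example shows the filter.
filtered = remove_noise(records, max_sponsors=3)
matrix = build_matrix(filtered, ["m1", "m2", "m3"])
```

Keeping the sponsor sets as frozensets makes the "same name and same sponsors" check a plain set-membership test, independent of the order in which sponsors are listed.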

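The analysis on slide 2 (set similarity, Dissimilarity = 1 - Abs(Similarity), then single-linkage hierarchical clustering) could look roughly like the sketch below. It assumes Jaccard similarity, since that is the measure named in the dendrogram titles; the slides do not list the other two measures. The function names and the toy matrix are hypothetical, and the naive merge loop is meant to mirror the three bullet steps, not to be efficient for 329 members.

```python
from typing import List, Set, Tuple


def jaccard(a: Set[int], b: Set[int]) -> float:
    """Jaccard similarity |A ∩ B| / |A ∪ B| of two sets of bill indices."""
    union = a | b
    if not union:
        return 0.0  # assumption: two empty sets count as dissimilar
    return len(a & b) / len(union)


def dissimilarity_matrix(rows: List[List[int]]) -> List[List[float]]:
    """Dissimilarity = 1 - Abs(Similarity) between all pairs of members."""
    sets = [{j for j, v in enumerate(row) if v} for row in rows]
    n = len(sets)
    return [
        [0.0 if i == j else 1.0 - abs(jaccard(sets[i], sets[j])) for j in range(n)]
        for i in range(n)
    ]


def single_linkage(dist: List[List[float]]) -> List[Tuple[tuple, tuple, float]]:
    """Repeatedly merge the two closest clusters until one remains.

    The distance between clusters is the minimum pairwise distance
    (single linkage). Returns the merges in order with their distances.
    """
    clusters = [(i,) for i in range(len(dist))]
    merges = []
    while len(clusters) > 1:
        best = None
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                d = min(dist[i][j] for i in clusters[x] for j in clusters[y])
                if best is None or d < best[0]:
                    best = (d, x, y)
        d, x, y = best
        merges.append((clusters[x], clusters[y], d))
        merged = clusters[x] + clusters[y]
        clusters = [c for k, c in enumerate(clusters) if k not in (x, y)] + [merged]
    return merges


# Toy member-by-bill matrix (made up for illustration only).
toy = [
    [1, 0, 0, 1],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [1, 0, 0, 1],
]
dist = dissimilarity_matrix(toy)
merges = single_linkage(dist)
```

Because Jaccard similarity is never negative, the Abs() has no effect here; it is kept only to match the formula on the slide, where it would matter for a signed measure such as a correlation coefficient.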