Dr. Giovanni Tapang : Short Technical Note

Transcript

  • 1. Short technical note [Draft 3, 12 Sep 2013]
Detected communities in the relationship networks of PDAF releases from 2007-2009
Gabriel Sison, Pamela Anne Pasion and Giovanni Tapang*
National Institute of Physics, University of the Philippines Diliman
gtapang@nip.upd.edu.ph

Introduction

The release of the Commission on Audit (COA) Special Audits Office Report No. 2012-03 [1] opened a wealth of data that gives us a glimpse of how the Priority Development Assistance Fund (PDAF) and the so-called Various Infrastructure, Including Local Projects (VILP) funds were used in government agencies. The report gives a government-wide Performance Audit of the PDAF and the VILP of various implementing agencies from 2007 to 2009. The data are spread over several tables and annexes that list the non-governmental organizations (NGOs) to which each legislator gave funding. One cannot immediately see the relationships between the NGOs and between the legislators by simply looking at the tables. Visualizing and quantifying such relationships is a problem of network analysis and visualization. The visualization of the PDAF releases is what we address in this technical note. We present a visualization at http://visser.ph/pdaf (using sigma.js).

Network Analysis

Network tools have been used in many applications to date. We have analyzed different systems such as prose and poetry [2], SMS messages [3], translations [4], poetic styles [5] and bill co-authorships in the Philippine Congress [6], among others. A network is a simple way to represent a set of objects that have a defined relation with one another. We call each object a node or vertex, and each relationship between two objects an edge [7]. Depending on the data set, edges can represent different kinds of relationships. In a social network, these could be friendship relations [8], or co-authorships in a Congressional setting [9].
Edges in networks can have values attached to them (weighted networks), or one can set a uniform weight for the edges (unweighted networks) [7]. Depending on whether the relationship has a direction, we can have directed or undirected networks. Networks have been used to characterize political systems such as the United States Congress [10] and the Philippine House of Representatives [6]. In such networks, the nodes (called egos or actors) are the legislators themselves, and the links between them can be voting patterns, co-sponsorships of bills and resolutions [6] or, as in this paper, common allocations to organizations. These co-sponsorship networks can be used as proxies for the effective political party affiliation of the legislators, which can be derived by calculating the partitions, or communities, that arise as a result of their level of partisanship [6,10]. In general, nodes or actors can be persons, groups or organizations. In the current work, our nodes are the individual senators, congressmen, the implementing agencies (IA) and the NGOs that received their PDAF allocations.

Methodology

We took data from Annex A of the COA special report [1] and converted it to a table using Python. We then further processed the resulting data using Gephi [11], an open source network analysis program. We made two visualizations of the data: a network built from legislators who have released funds to a common NGO, and
  • 2. another network in which NGOs that received funds from the same legislator are linked together. The two networks taken together define a bipartite network, but we only present the visualization and analysis of the two networks taken separately.

The first network that we created is one where a legislator, represented by a node, is connected to another legislator if the two of them provided funds to the same NGO. We applied community detection algorithms in Gephi [11] to determine whether there were groups of legislators that were likely to fund NGOs together. The legislators were colored based on the "community" they fell into.

The same was done to make a network of the NGOs to which the PDAF was transferred. In this network, a connection is made between two NGOs if they received funds from the same legislator. As these nodes tend to connect more with certain sets of nodes, we use the same community detection methods to find which NGOs are more connected with each other and group them together by color. As with the legislator network, the colors are based on the communities that the NGOs fall into.

Results and Analysis

The legislator network that we obtained has 186 nodes with 3976 edges. We find that the network has an average degree of 21.38. Since the degree is the number of connections that a node has, this implies that, on average, a legislator gave PDAF funds to the same NGO as 21.38 other legislators. In this network, the degree measures how many other legislators shared a beneficiary NGO with him or her. Certain legislators, based on the COA report, have high degrees. For example, Rep. Adam Relson Jala tops the list at 73, followed by Arrel Olano and Mariano Piamonte at 66 and 64 respectively. On average, each legislator has given his PDAF to 1.4 NGOs. The degree of a node does not reflect the amount that a legislator gave from his PDAF: Rep. Jala, for instance, allocated only P32.086 million for the period in review.
This is an average of just P0.43 million per NGO, while the top releases belong to Senators Ramon Revilla (P503.89 million for eight NGOs), Jinggoy Estrada (P491.495 million for four NGOs) and Juan Ponce Enrile (P469.49 million for 11 NGOs). In the legislator network, they have degrees of 30, 24 and 49 respectively. To round out the top five in PDAF releases, Rep. Philip Pichay gave P180 million to two NGOs and Senator Angara gave P151 million to seven NGOs. Their degrees are 32 and 9 respectively.

We show in Figure 1 the legislator network and its detected communities. We found six (6) communities. These communities are groups of legislators that tend to give to the same set of NGOs. We can rank the legislators based on the number of "partners" they have, a list that Adam Relson Jala tops. This is reflected in his high degree, or number of connections. We can also add weight to these links by counting the number of times that two linked legislators funded a common NGO. In such a weighted network, Nerissa Corazon-Ruiz becomes the most connected legislator. A more advanced measure is the betweenness centrality, which quantifies how "central" a given node is and can be seen as a proxy for influence within the network [10]. Sen. Juan Ponce Enrile tops that list. See Table 1 for more details.
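The construction described above, where two legislators are linked whenever they funded a common NGO and the link can be weighted by how many NGOs they share, can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the Gephi workflow used in this note: the `releases` records, the placeholder names and the `project` and `degrees` helpers are all hypothetical and are not taken from the COA report.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical (legislator, NGO) release records; NOT data from the COA report.
releases = [
    ("Legislator A", "NGO X"),
    ("Legislator B", "NGO X"),
    ("Legislator C", "NGO X"),
    ("Legislator A", "NGO Y"),
    ("Legislator B", "NGO Y"),
    ("Legislator D", "NGO Z"),
]


def project(records, side=0):
    """Link items on `side` (0 = legislators, 1 = NGOs) that share a partner.

    Returns {(u, v): weight} with u < v, where the weight is the number of
    partners on the other side that u and v have in common.
    """
    groups = defaultdict(set)
    for record in records:
        # Group the items on `side` by their partner on the other side.
        groups[record[1 - side]].add(record[side])
    weights = defaultdict(int)
    for members in groups.values():
        for u, v in combinations(sorted(members), 2):
            weights[(u, v)] += 1
    return dict(weights)


def degrees(weights):
    """Return (degree, weighted degree) dictionaries for a weighted edge list."""
    degree = defaultdict(int)
    weighted = defaultdict(int)
    for (u, v), w in weights.items():
        for node in (u, v):
            degree[node] += 1      # number of distinct partners
            weighted[node] += w    # partners counted with edge weights
    return dict(degree), dict(weighted)


legislator_edges = project(releases, side=0)
degree, weighted_degree = degrees(legislator_edges)
print(legislator_edges)
print(degree, weighted_degree)
```

Calling `project(releases, side=1)` on the same records gives the corresponding NGO network. In this sketch a legislator who shares no NGO with anyone (here the placeholder "Legislator D") simply does not appear in the edge list, so isolated nodes would have to be tracked separately.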
  • 3. We repeated the same analysis for the NGO network. The NGO network has 69 nodes with 530 edges. It shows an average degree of 7.68, which implies that an NGO typically receives allocations from the same legislator as 7.68 other NGOs. We found at least five (5) groups of nodes, or communities. Each group is represented by a separate node color. As such, nodes with the same color are NGOs that are more likely to have received funds from the same legislator.

Figure 1: Legislator network and detected communities. Sizes of the nodes are proportional to the betweenness centrality.
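As a quick consistency check on the node and edge counts quoted for the two networks, the short sketch below compares the reported average degrees with two common conventions. The helper names are hypothetical. The reported values coincide with the number of edges divided by the number of nodes; if every undirected edge is instead counted at both of its endpoints, the mean degree is twice that. The note does not state which convention the software applied, so this is only an arithmetic check and not a correction.

```python
def edges_per_node(nodes, edges):
    """Edges divided by nodes (each edge counted once)."""
    return edges / nodes


def mean_undirected_degree(nodes, edges):
    """Mean degree when every undirected edge is counted at both endpoints."""
    return 2 * edges / nodes


# (nodes, edges, average degree) as quoted in the text for each network.
reported = {"legislator": (186, 3976, 21.38), "NGO": (69, 530, 7.68)}

for name, (n, e, avg) in reported.items():
    print(
        name,
        "reported:", avg,
        "E/N:", round(edges_per_node(n, e), 2),
        "2E/N:", round(mean_undirected_degree(n, e), 2),
    )
```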
  • 4. In Figure 2, the thickness of the arrows represents the weight of the connection between NGOs. The thicker the arrow, the stronger the connection: NGOs joined by thicker arrows are funded together more often, by more than one legislator. It can be seen that the nodes in the blue and green groups have more heavily weighted edges than the other groups, especially the SDPFFI NGO in the blue group and the KKAMFI NGO in the green group.

We could also apply other network techniques, such as looking at which nodes are the important nodes in the NGO network. This is measured by the eigenvector centrality [11]. The NGO with the highest eigenvector centrality, or measure of importance of the node, is MAMFI. This is followed by CARED and SDPFF respectively, all of which have an eigenvector centrality greater than 0.90. See Table 2 for more numbers.

No.  Node label                   Betweenness  Weighted degree  Clustering coeff.  Eigenvector  Closeness
1    Juan Ponce Enrile            0.1034       140.0            0.2661             0.26504      0.5378
2    Arrel R. Olano               0.1024       87.0             0.3529             0.70891      0.5640
3    Ignacio T. Arroyo, Jr.       0.0750       95.0             0.5340             0.74083      0.5441
4    Adam Relson L. Jala          0.0650       110.0            0.3916             1.0          0.5763
5    Francisco T. Matugas         0.0649       78.0             0.4713             0.77308      0.5425
6    Edgardo J. Angara            0.0532       12.0             0.2778             0.03684      0.4048
7    Mariano U. Piamonte          0.0403       106.0            0.4638             0.95405      0.5378
8    Emmanuel Joel J. Villanueva  0.0402       50.0             0.4366             0.43666      0.5082
9    Samuel M. Dangwa             0.0382       92.0             0.3286             0.26147      0.4973
10   Marc Douglas C. Cagas IV     0.0356       66.0             0.4473             0.46821      0.5211

Table 1. Various network parameters in the legislator network for the top 10 legislators based on their betweenness centrality. Betweenness centrality is defined as the number of shortest paths between all pairs of vertices that pass through a node. The weighted degree is proportional to the number and strength of connections of a node. The clustering coefficient measures the degree to which nodes tend to cluster with one another. The eigenvector centrality is a measure of the importance of a node in the network and is related to how well connected a particular node is. The closeness centrality is a measure of the average distance from a given starting node to all other nodes in the system. The centrality measures are normalized in this table.

These are just a sample of the things we can do with the network representation of the PDAF releases. Deeper knowledge about Congress and the interlocking directorships of the NGOs would also be extremely helpful in further analysis. This will be done in future work. Nevertheless, these tools allow the average Filipino to glean information more readily than from tables and documents, helping them better participate in the process of democracy.
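The betweenness and closeness measures defined in the caption of Table 1 can be sketched on an unweighted, undirected network as follows. This is a minimal sketch under stated assumptions and not the implementation used to produce the tables: betweenness follows the standard shortest-path counting approach (Brandes' algorithm), closeness is taken here as the inverse of the average shortest-path distance, and the normalization chosen may differ from the one applied by Gephi. The function names and the three-node toy graph are hypothetical.

```python
from collections import deque


def bfs_distances(adj, source):
    """Shortest-path lengths (in hops) from `source` to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return dist


def closeness(adj):
    """Inverse of the average distance from each node to the nodes it reaches."""
    result = {}
    for v in adj:
        dist = bfs_distances(adj, v)
        total = sum(dist.values())
        result[v] = (len(dist) - 1) / total if total > 0 else 0.0
    return result


def betweenness(adj):
    """Normalized betweenness for an undirected, unweighted graph."""
    bc = dict.fromkeys(adj, 0.0)
    for s in adj:
        sigma = dict.fromkeys(adj, 0)   # number of shortest paths from s
        sigma[s] = 1
        dist = {s: 0}
        preds = {v: [] for v in adj}    # predecessors on shortest paths
        order = []                      # nodes in order of discovery
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate path dependencies from the farthest nodes back to s.
        delta = dict.fromkeys(adj, 0.0)
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    n = len(adj)
    # Every unordered pair was counted from both ends, hence (n-1)(n-2).
    scale = 1.0 / ((n - 1) * (n - 2)) if n > 2 else 1.0
    return {v: value * scale for v, value in bc.items()}


# Hypothetical toy graph: a path a - b - c, stored as an adjacency mapping.
toy = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
print(betweenness(toy))
print(closeness(toy))
```

In the toy path, only the middle node lies on a shortest path between two other nodes, so it is the only one with non-zero betweenness, and it is also the closest on average to the rest of the graph.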
  • 5.
No.  Node label                                                     Betweenness  Weighted degree  Clustering coeff.  Eigenvector  Closeness
1    Kagandahan ng Kapaligiran Foundation, Inc. (KKFI)              0.1982       21.0             0.3429             0.8683       0.5397
2    Kabuhayan at Kalusugan Alay sa Masa Foundation, Inc. (KKAMFI)  0.1823       23.0             0.2332             0.6561       0.5574
3    Dr. Rodolfo A. Ignacio, Sr. Foundation Inc (DRAISFI)           0.1275       23.0             0.3360             1.0          0.5667
4    Farmerbusiness Development Corp (FDC)                          0.0984       18.0             0.3399             0.7372       0.5231
5    Aaron Foundation Philippines Inc (AFPI)                        0.0915       15.0             0.2380             0.5067       0.4892
6    Masaganang Ani Para sa Magsasaka Foundation Inc (MAMFI)        0.0903       21.0             0.4048             0.8651       0.5397
7    Pangkabuhayan Foundation (Pang-FI)                             0.0838       21.0             0.3524             0.8509       0.4963
8    Kaagapay Magpakailanman Foundation Inc (KMFI)                  0.0681       17.0             0.3088             0.6091       0.5312
9    ITO NA Movement Foundation Inc (ITO NA MI)                     0.0571       11.0             0.3818             0.4734       0.4755
10   Hand-Made Living Foundation Inc (HMLFI)                        0.0451       10.0             0.3556             0.4486       0.4626

Figure 2: NGO networks and detected communities.
  • 6. Table 2. Various network parameters in the NGO network for the top 10 NGOs based on their betweenness centrality. Betweenness centrality is defined as the number of shortest paths between all pairs of vertices that pass through a node. The weighted degree is proportional to the number and strength of connections of a node. The clustering coefficient measures the degree to which nodes tend to cluster with one another. The eigenvector centrality is a measure of the importance of a node in the network and is related to how well connected a particular node is. The closeness centrality is a measure of the average distance from a given starting node to all other nodes in the system. The centrality measures are normalized in this table.

References

1. Commission on Audit Special Audits Office, Report No. 2012-03, Government-wide Performance Audit, "Priority Development Assistance Fund (PDAF) and Various Infrastructures including Local Projects (VILP)", 2012.
2. RM Roxas, G Tapang, "Prose and Poetry Classification and Boundary Detection Using Word Adjacency Network Analysis", International Journal of Modern Physics C 21 (04), 503-512.
3. JJT Cabatbat, GA Tapang, "Texting Styles and Information Change of SMS Text Messages in Filipino", International Journal of Modern Physics C 24 (02).
4. JJT Cabatbat, JP Monsanto, GA Tapang, "Preserved Network Metrics Across Translated Texts", International Journal of Modern Physics C (accepted paper 2013).
5. RM Roxas-Villanueva, MK Nambatac, G Tapang, "Characterizing English poetic style using complex networks", International Journal of Modern Physics C 23 (02).
6. Gabriel Dominik Sison, "Edge-weight distributions in dense small node co-authorship networks", BS Thesis, BS Physics, UP Diliman, April 2013.
7. Alain Barrat, Marc Barthélemy, and Alessandro Vespignani. Dynamical Processes on Complex Networks. Cambridge University Press, New York, USA, 2008.
8. Hua Wang and Barry Wellman. Social connectivity in America: changes in adult friendship network size from 2002 to 2007. American Behavioral Scientist, 53(8):1148-1169, 2010.
9. James H Fowler. Connecting the congress: A study of cosponsorship networks. Political Analysis, 14(4):456-487, 2006.
10. Yan Zhang, AJ Friend, Amanda L Traud, Mason A Porter, James H Fowler, and Peter J Mucha. Community structure in congressional cosponsorship networks. Physica A: Statistical Mechanics and its Applications, 387(7):1705-1712, 2008.
11. Bastian M., Heymann S., Jacomy M. (2009). Gephi: an open source software for exploring and manipulating networks. International AAAI Conference on Weblogs and Social Media.

About the authors

Pamela Anne Pasion is a fourth-year BS Applied Physics student at the National Institute of Physics working on translations and network analysis. Gabriel Dominik Sison finished his BS Physics degree in 2013 with an award for Best BS Thesis for his work "Edge-weight distributions in dense small node co-authorship networks". Dr. Giovanni Tapang is an Associate Professor at the National Institute of Physics and is also the chairperson of the scientist group AGHAM - Advocates of Science and Technology for the People. He can be reached at gtapang@nip.upd.edu.ph