Mapping the web

Second-day lesson of the course "Epistemologias Reticulares" at the University of São Paulo, Escola de Comunicações e Artes (ECA/USP).
Cartography of the web for beginners

Published in: Education, Technology, Design

  1. 1. MAPPING THE WEB Marta Severo – Université de Lille 3, Laboratoire Gériico, marta.severo@univ-lille3.fr. 13 August 2013, University of São Paulo, Escola de Comunicações e Artes (ECA/USP)
  2. 2. MAPPING THE WEB: THE PRINCIPLE Web mapping is based on the idea that hyperlinks created on the web can be used as a proxy for social ties
  3. 3. WEB MAPPING THE PRACTICE We generate a graph that traces the network created by hyperlinks on a set of web pages
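As a rough illustration of this practice, the sketch below builds a small directed graph from the hyperlinks found in a set of pages. It is a minimal example using only the Python standard library, not the tool used in the course; the page URLs and HTML snippets are invented for illustration, and each host name is assumed to stand for one node.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkExtractor(HTMLParser):
    """Collect the href of every <a> tag in one HTML page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def hyperlink_graph(pages):
    """Return {source_host: set(target_hosts)} for a {url: html} mapping."""
    graph = {}
    for url, html in pages.items():
        source = urlparse(url).netloc
        graph.setdefault(source, set())
        parser = LinkExtractor()
        parser.feed(html)
        for href in parser.hrefs:
            # Resolve relative links against the page URL before reading the host.
            target = urlparse(urljoin(url, href)).netloc
            if target and target != source:  # ignore links inside the same site
                graph[source].add(target)
    return graph


# Hypothetical pages, invented for illustration only.
pages = {
    "http://blog-a.example/post": '<a href="http://blog-b.example/">B</a> <a href="/about">me</a>',
    "http://blog-b.example/": '<a href="http://blog-a.example/post">A</a> <a href="http://news.example/x">N</a>',
}
graph = hyperlink_graph(pages)
```

Each key of `graph` is a site and each value is the set of sites it links to, which is the edge list a graph tool needs.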
  4. 4. MAPPING THE U.S. BLOGOSPHERE (2004) Digital methods. "Divided They Blog", Adamic & Glance, 2005
  5. 5. Govcom.org, 2008
  6. 6. FRENCH POLITICAL BLOGOSPHERE http://politicosphere.blog.lemonde.fr/ (Linkfluence)
  7. 7. CAN WE MAP THE WEB?
  8. 8. WHAT IS THE WORLD WIDE WEB? The World Wide Web (abbreviated as WWW or W3, commonly known as the web) is a system of interlinked hypertext documents accessed via the Internet. With a web browser, one can view web pages that may contain text, images, videos, and other multimedia, and navigate between them via hyperlinks. (Wikipedia)
  9. 9. THE RISK OF WEB MAPPING http://internet-map.net
  10. 10. HOW TO GET AN EFFECTIVE AND READABLE MAP OF THE WEB?
  11. 11. TO BECOME A WEB MAPPER YOU HAVE TO: understand the morphology of the web; know how to build a corpus of websites; know how to represent a corpus of websites
  12. 12. THE MORPHOLOGY OF THE WEB: THE POWER LAW Barabási, Albert-László (2002) Linked: The New Science of Networks
  13. 13. NETWORKS OF SCIENTIFIC PAPERS D. de Solla Price, 1965, Science, 149(3683): 510-515
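To make the power-law idea concrete, here is a minimal simulation sketch of "rich get richer" link growth, a mechanism commonly used to explain such distributions. It is an illustration under stated assumptions, not a model taken from the slides: each new site creates exactly one link, and it picks its target with probability proportional to the target's current in-degree plus one. The function name and the size of 2000 sites are arbitrary.

```python
import random


def preferential_attachment(n, seed=0):
    """Grow a link network of n sites; return the in-degree of each site."""
    rng = random.Random(seed)
    in_degree = [0] * n
    # Each site appears in the pool (in_degree + 1) times, so a uniform draw
    # from the pool favours sites that are already well linked.
    pool = [0]
    for new_site in range(1, n):
        target = rng.choice(pool)
        in_degree[target] += 1
        pool.append(target)    # the target gained one incoming link
        pool.append(new_site)  # the new site enters with in-degree 0
    return in_degree


in_degree = preferential_attachment(2000)
never_cited = sum(1 for d in in_degree if d == 0)
print("most linked site:", max(in_degree), "links;",
      never_cited, "sites receive no link at all")
```

The typical outcome is a handful of heavily linked sites and a large majority with no incoming link, which is the uneven shape the two references above describe.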
  14. 14. WHAT DOES THE WEB LOOK LIKE?
  15. 15. WHAT DOES THE WEB LOOK LIKE?
  16. 16. THE LAYERS OF THE WEB The top layer (Wikipedia.org), the higher layer (Fishusa.com), the middle layer (Icefishingtoday.com), the lower layer (Thediaryofalakenerd.blogspot.com)
  17. 17. THE LAYERS OF THE WEB Top layer: first 2 or 3 results of Google (everyone sees them). Higher layer: first page of Google (interested users find them). Middle layer: first 10 pages of Google (experts find them). Lower layer: not shown or not indexed (nowhere to be found).
  18. 18. THE LAYERS OF THE WEB Top layer, higher layer, middle layer, lower layer
  19. 19. THE LAYERS OF THE WEB Top layer, higher layer, middle layer, lower layer
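The layer definitions given in terms of Google results can be read as a simple rule on a site's search rank. The sketch below encodes that reading; it assumes 10 results per page, takes "first 2/3 results" to mean the first three, and treats anything beyond the tenth page like an unindexed site. These thresholds are assumptions made for illustration, and `web_layer` is a hypothetical helper name.

```python
def web_layer(rank, results_per_page=10):
    """Map a search-result rank (1 = first result, None = not found) to a layer."""
    if rank is None:
        return "lower"    # not shown or not indexed
    if rank <= 3:
        return "top"      # first 2 or 3 results: everyone sees them
    if rank <= results_per_page:
        return "higher"   # rest of the first page: interested users find them
    if rank <= 10 * results_per_page:
        return "middle"   # first 10 pages: experts find them
    return "lower"        # assumed: deeper than 10 pages counts as not found


for rank in (1, 7, 42, None):
    print(rank, "->", web_layer(rank))
```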
  20. 20. TO BECOME A WEB MAPPER YOU HAVE TO: understand the morphology of the web; know how to build a corpus of websites; know how to represent a corpus of websites
  21. 21. SOFTWARE: WEB CRAWLER Automatic crawler (Issuecrawler); manual crawler (Navicrawler)
  22. 22. Automatic crawler https://www.issuecrawler.net/
  23. 23. RISKS OF AUTOMATIC CRAWLER
  24. 24. RISKS OF AUTOMATIC CRAWLER
  25. 25. RISKS OF AUTOMATIC CRAWLER
  26. 26. RISKS OF AUTOMATIC CRAWLER
  27. 27. MANUAL CRAWLER: NAVICRAWLER http://webatlas.fr/wp/navicrawler/
  28. 28. DEFINE THE CORPUS
  29. 29. Tear Difficult choice Cut DEFINE THE CORPUS
  30. 30. THE CORPUS ON THE MAP
  31. 31. WEB MAPPING FROM THE PRACTICAL VIEWPOINT 1. entering the domain 2. excluding the top layer 3. including the nebula 4. including the core 5. exploring the filaments
  32. 32. NAVICRAWLER http://webatlas.fr/wp/navicrawler/
  33. 33. TURN ON NAVICRAWLER http://webatlas.fr/wp/navicrawler/
  34. 34. WINDOW « NAV »: information on the page; information on the site; information on the corpus; lists; sort and find
  35. 35. WHAT IS A WEBSITE? What is a URL? What is a domain name?
  36. 36. IF WEBSITES ARE SMALLER THAN A DOMAIN NAME…
  37. 37. IF WEBSITES ARE BIGGER THAN A DOMAIN NAME…
  38. 38. SOCIAL NETWORKS Facebook is just one node: https://www.facebook.com/pages/Atopos/150996404962854. Twitter accounts are separate nodes, but links in tweets cannot be identified: https://twitter.com/atopos_usp
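The question of what counts as one website comes down to a rule that turns each URL into a node name. The sketch below is one possible rule, written with the Python standard library: by default the host name is the node, so a blog on its own subdomain is a separate site and every Facebook page collapses into a single node, while hosts listed in `PER_ACCOUNT_HOSTS` are split by account. That list and the function name `node_for` are assumptions made for this example, not settings taken from Navicrawler.

```python
from urllib.parse import urlparse

# Hosts where each account, not the whole domain, is treated as a node.
# This list is an assumption made for the example.
PER_ACCOUNT_HOSTS = {"twitter.com"}


def node_for(url):
    """Return the name of the graph node that a URL belongs to."""
    parts = urlparse(url)
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    if host in PER_ACCOUNT_HOSTS:
        # Use the first path segment (the account name) to split the domain.
        segments = [s for s in parts.path.split("/") if s]
        if segments:
            return host + "/" + segments[0]
    return host


for url in (
    "https://www.facebook.com/pages/Atopos/",
    "https://twitter.com/atopos_usp",
    "http://thediaryofalakenerd.blogspot.com/2013/08/a-post.html",  # path invented
):
    print(url, "->", node_for(url))
```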
  39. 39. CONSTITUTION OF THE CORPUS www.webatlas.fr
  40. 40. PAY ATTENTION TO THE DEPTH
  41. 41. PAY ATTENTION TO THE DISTANCE
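The slides do not define distance, so the sketch below assumes it means the number of hyperlinks followed from the starting sites. Under that assumption, limiting a crawl by distance is a breadth-first search that stops expanding once the limit is reached; the same idea would apply to depth inside a single site. The link data and the function name `sites_within` are invented for illustration.

```python
from collections import deque


def sites_within(links, seeds, max_distance):
    """Return {site: distance} for sites at most max_distance clicks from the seeds."""
    distance = {seed: 0 for seed in seeds}
    queue = deque(seeds)
    while queue:
        site = queue.popleft()
        if distance[site] == max_distance:
            continue  # do not follow links beyond the limit
        for target in links.get(site, ()):
            if target not in distance:
                distance[target] = distance[site] + 1
                queue.append(target)
    return distance


# Hypothetical link structure: a -> b, a -> c, b -> d, d -> e.
links = {"a": ["b", "c"], "b": ["d"], "d": ["e"]}
reached = sites_within(links, ["a"], max_distance=2)
print(reached)
```

With a limit of 2, site `e` (three clicks away) stays out of the corpus, which keeps the crawl from drifting away from the topic.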
  42. 42. TO BECOME A WEB MAPPER YOU HAVE TO: understand the morphology of the web; know how to build a corpus of websites; know how to represent a corpus of websites
  43. 43. GEPHI https://gephi.org/
  44. 44. THE ANALYSIS OF THE GRAPH GIVES US THREE TYPES OF INFORMATION 1. LAYOUT: applying a force-vector algorithm 2. RANKING: applying a ranking by degree 3. PARTITION: applying a partition by color
  45. 45. 1. LAYOUT > PROXIMITY Two nodes are close if the sites they represent are directly or indirectly linked. Questions: 1.1. Which are the debates or communities? (identification of clusters of nodes) 1.2. Which sites connect debates or communities? (identification of bridges between clusters)
  46. 46. FORCE-VECTOR ALGORITHM
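The principle behind a force-vector layout can be sketched in a few lines: every pair of nodes repels, linked nodes attract, and positions are moved in small steps until they settle. The code below is a generic spring-and-repulsion simulation in the spirit of Fruchterman and Reingold, written for illustration; it is not the algorithm implemented in Gephi, and the constants (200 iterations, step limit 0.1, cooling factor 0.98) are arbitrary choices.

```python
import math
import random


def force_layout(nodes, edges, iterations=200, seed=0):
    """Return {node: (x, y)} from a basic spring-and-repulsion simulation."""
    rng = random.Random(seed)
    pos = {n: [rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)] for n in nodes}
    k = 1.0 / math.sqrt(len(nodes))  # preferred distance between linked nodes
    step_limit = 0.1                 # largest move per iteration, reduced over time
    for _ in range(iterations):
        move = {n: [0.0, 0.0] for n in nodes}
        # Every pair of nodes repels, more strongly when close.
        for i, a in enumerate(nodes):
            for b in nodes[i + 1:]:
                dx = pos[a][0] - pos[b][0]
                dy = pos[a][1] - pos[b][1]
                d = math.hypot(dx, dy) or 1e-9
                f = k * k / d
                move[a][0] += dx / d * f
                move[a][1] += dy / d * f
                move[b][0] -= dx / d * f
                move[b][1] -= dy / d * f
        # Linked nodes attract, more strongly when far apart.
        for a, b in edges:
            dx = pos[a][0] - pos[b][0]
            dy = pos[a][1] - pos[b][1]
            d = math.hypot(dx, dy) or 1e-9
            f = d * d / k
            move[a][0] -= dx / d * f
            move[a][1] -= dy / d * f
            move[b][0] += dx / d * f
            move[b][1] += dy / d * f
        # Move each node along its net force, capped by the current step limit.
        for n in nodes:
            mx, my = move[n]
            length = math.hypot(mx, my) or 1e-9
            step = min(length, step_limit)
            pos[n][0] += mx / length * step
            pos[n][1] += my / length * step
        step_limit *= 0.98
    return {n: tuple(p) for n, p in pos.items()}


# Two hypothetical communities (triangles) joined by a single link a1-b1.
nodes = ["a1", "a2", "a3", "b1", "b2", "b3"]
edges = [("a1", "a2"), ("a2", "a3"), ("a1", "a3"),
         ("b1", "b2"), ("b2", "b3"), ("b1", "b3"), ("a1", "b1")]
layout = force_layout(nodes, edges)
```

Because only linked nodes pull on each other, densely interlinked sites end up near one another, which is why clusters on the map can be read as debates or communities.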
  47. 47. 1.1. WHICH ARE THE DEBATES OR COMMUNITIES? (IDENTIFICATION OF CLUSTERS)
  48. 48. 1.1. WHICH ARE THE DEBATES OR COMMUNITIES? (IDENTIFICATION OF CLUSTERS)
  49. 49. 1.2. WHICH SITES CONNECT DEBATES OR COMMUNITIES? (IDENTIFICATION OF BRIDGES BETWEEN CLUSTERS)
  50. 50. 2. RANKING > AUTHORITIES AND HUBS The size of the nodes may be proportional to the authority of the site (in-degree) or to its role as an information relay (out-degree). Questions: 2.1. Which sites are the opinion leaders of the online debate? (identification of graph authorities) 2.2. Which sites bring the online debate together? (identification of graph hubs)
  51. 51. 2.1. WHICH SITES ARE THE OPINION LEADERS OF THE ONLINE DEBATE? (IDENTIFICATION OF GRAPH AUTHORITIES)
  52. 52. 2.2. WHAT ARE THE SITES THAT BRING TOGETHER THE ONLINE DEBATE? (IDENTIFICATION OF GRAPH HUBS)
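Ranking by degree is a matter of counting links. The sketch below counts incoming links (in-degree, the authority measure) and outgoing links (out-degree, the hub measure) from a list of directed edges; the site names are invented for illustration.

```python
from collections import Counter


def degree_ranking(edges):
    """Return (in_degree, out_degree) counters for (source, target) links."""
    in_degree = Counter(target for _, target in edges)
    out_degree = Counter(source for source, _ in edges)
    return in_degree, out_degree


# Hypothetical links between four sites.
edges = [
    ("blog1", "news"), ("blog2", "news"), ("portal", "news"),
    ("portal", "blog1"), ("portal", "blog2"),
]
in_degree, out_degree = degree_ranking(edges)
authority = in_degree.most_common(1)[0][0]  # the site most often linked to
hub = out_degree.most_common(1)[0][0]       # the site linking to the most others
print("authority:", authority, "| hub:", hub)
```

In a mapping tool the same two counts are what drive node size when ranking by in-degree or out-degree.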
  53. 53. 3. PARTITION > CATEGORIZATION The color of the nodes can be changed to show different categories. Question: 3.1. How are the different types of sites distributed? (evaluation of the topology)
  54. 54. 3.1. HOW ARE THE DIFFERENT TYPES OF SITES DISTRIBUTED? (EVALUATION OF THE TOPOLOGY)
  55. 55. 3.1. HOW ARE THE DIFFERENT TYPES OF SITES DISTRIBUTED? (EVALUATION OF THE TOPOLOGY)
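Besides coloring the nodes, a partition can be summarized numerically by counting how many links go from each category to each other category. The sketch below does this for a small example; the categories and sites are invented, and `links_between_categories` is a hypothetical helper name.

```python
from collections import Counter


def links_between_categories(edges, category):
    """Count links for every (source category, target category) pair."""
    counts = Counter()
    for source, target in edges:
        counts[(category[source], category[target])] += 1
    return counts


# Hypothetical sites and categories.
category = {"ngo1": "NGO", "ngo2": "NGO", "gov": "institution", "paper": "media"}
edges = [("ngo1", "ngo2"), ("ngo2", "ngo1"), ("ngo1", "gov"), ("paper", "gov")]
counts = links_between_categories(edges, category)
for (src, dst), n in sorted(counts.items()):
    print(f"{src} -> {dst}: {n}")
```

Many links inside a category and few across categories would suggest that the types of sites occupy separate regions of the map.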
  56. 56. EXAMPLES
  57. 57. CONTROVERSY MAPPING http://controverses.sciences-po.fr/archive/decroissance/ Author: students of Sciences Po, Paris (controversy mapping course)
  58. 58. CONTROVERSY MAPPING http://controverses.sciences-po.fr/archive/decroissance/ Author: students of Sciences Po, Paris (controversy mapping course)
  59. 59. PRACTICES IN BUSINESS Monitoring a sector or a product; studying a community and identifying its leaders; studying e-reputation on the social web; studying spontaneous conversations around a brand; studying the viral spread of content…
  60. 60. MAPPING A COMMUNITY: GITHUB, SOCIAL NETWORKS, OPEN SOURCE DEVELOPERS Author: linkfluence.net
  61. 61. MAPPING A SECTOR: ACTORS OF URANIUM MINING IN AFRICA Author: Susana Nunes, http://www.hellodata.eu/uranium/
  62. 62. Author: Susana Nunes
  63. 63. MAPPING A SECTOR: INVESTORS IN NEW TECHNOLOGIES (2010) Author: linkfluence.net
  64. 64. TEACHING WEB MAPPING Exercise: « How to promote a web portal for illustrators to develop this activity? »
  65. 65. TEACHING WEB MAPPING Exercise: « Discursive communities related to hipsters in France »
  66. 66. EXAMPLES Severo M. (2012), « Le patrimoine culturel immatériel sur la Toile. Comparaison entre réseaux nationaux », in Culture et recherche, n. 127, p. 58-57 http://www.culturecommunication.gouv.fr/content/download/53634/415776/file/Culture%20et%20recherche%20127_automne%202012.pdf
  67. 67. BAGUALA PROJECT http://baguala.hypotheses.org Project coordinated by Pierre Gautreau, Université de Paris 1. See article: P. Gautreau, H. Hasenack, L. Lerch, G. Merlinsky, M. Noucher, M. Severo, « Comparison of the open environmental data diffusion in Argentina, Bolivia and Brazil », http://hal.archives-ouvertes.fr/hal-00744805
  68. 68. THE GOALS Studying the uses of open environmental data in Latin America and France. Understanding how the Internet changes the ways society represents and manages its environment, through the supply of information and data online. Building an inventory of websites that provide information or data about the environment in Argentina, Bolivia and Brazil.
  69. 69. 1. SELECTION OF WEBSITES 275 requests were submitted to the Google search engine for each of the three countries. For each country, the list of requests was established by combining the name of an administrative unit with an environmental keyword (environment, nature, environmental education, biodiversity, pollution, water, climate change, environmental risk, waste, soil, forest). For each request, the first 50 results were examined, and the pertinent websites were included in our corpus.
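The list of requests described here is a cross product of administrative units and keywords. The sketch below generates such a list with the eleven keywords named on the slide. The unit names are placeholders, and the count of 25 units is an assumption chosen only because 25 × 11 = 275 matches the number of requests reported; the slides do not state how many units were used, and the actual requests were presumably written in the languages of the three countries.

```python
from itertools import product

# The eleven environmental keywords listed on the slide (in English).
KEYWORDS = [
    "environment", "nature", "environmental education", "biodiversity",
    "pollution", "water", "climate change", "environmental risk",
    "waste", "soil", "forest",
]


def build_queries(units, keywords=KEYWORDS):
    """Combine every administrative unit with every keyword."""
    return [f"{unit} {keyword}" for unit, keyword in product(units, keywords)]


# Placeholder unit names; 25 is an assumption, not a figure from the slides.
units = [f"unit {i}" for i in range(1, 26)]
queries = build_queries(units)
print(len(queries), "requests, for example:", queries[0])
```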
  70. 70. 2. CATEGORIZATION Each website has been visited and described through 30 categories, aiming to characterize its author, the author of its content (frequently not the same as the site's author), the objectives of the site, its main theme, and the kind of data it supplies
  71. 71. 3. MAPPING THE WEB We searched for the hyperlinks between these sites through web crawling and represented the graph they formed. ATTENTION: to evaluate the results correctly, it is important to remember that this inventory is only a snapshot of the web at the time it was made (mid-2012), and that it will quickly become outdated because environmental sites change rapidly.
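Once a crawl has returned raw links, mapping the corpus means keeping only the links whose source and target both belong to the inventory. The sketch below shows that filtering step and writes the result as a two-column Source,Target edge list, a plain CSV layout that graph tools such as Gephi can import. It is an illustration, not the project's actual pipeline; the site names are invented.

```python
import csv
import io


def corpus_edges(links, corpus):
    """Keep unique links whose source and target are both in the corpus."""
    corpus = set(corpus)
    return sorted(
        {(s, t) for s, t in links if s in corpus and t in corpus and s != t}
    )


def to_edge_csv(edges):
    """Serialize edges as CSV text with a Source,Target header."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["Source", "Target"])
    writer.writerows(edges)
    return buffer.getvalue()


# Hypothetical crawl output: one link leaves the corpus, one is a duplicate.
links = [("a.org", "b.org"), ("a.org", "google.com"),
         ("b.org", "a.org"), ("a.org", "b.org")]
edges = corpus_edges(links, ["a.org", "b.org"])
csv_text = to_edge_csv(edges)
print(csv_text)
```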
  72. 72. ACTIVIST WEBSITES
  73. 73. 2 TYPES OF ACTIVIST WEBSITES (1) Information hubs, whose protest action is aimed at spreading knowledge. This category consists primarily of alternative media (indymedia.argentina.org, rebellion.org, www.adital.com.br) and social movements. (2) Sites of movements that focus on organizing activities in physical space. Their protest is geographically and thematically focused.
