Mapping French Open Data actors on the web with Common Crawl
guillaume.lebourgeois@data-publica.com
@glebourg
Mining the Web at Data Publica
Different needs, different techniques
   ● Scraping
   ● Focused crawling
   ● Prospective crawling
Mining the Web at Data Publica
Scraping
  ● Identified resources
  ● Configured extractors
  ● Structured content
  ● Not scalable
Mining the Web at Data Publica
Focused crawling
  ● Identified entities
  ● Fuzzy extraction
  ● Structured content using text-mining
  ● Scalable
  ● Useful to get meta information on known
    entities
Mining the Web at Data Publica
Prospective crawling
  ● No starting point
  ● Fuzzy extraction
  ● Structured content using text-mining
  ● Very hard to scale
  ● Heavy resources needed: CPU, RAM, HDD

Using a third-party crawl makes your life easier!
From a crawl to a map
Goal: build a map of the French open data actors on the web
  ● As a graph
  ● Showing websites
From a crawl to a map
Using Common Crawl
  ● Large web crawl archives fully accessible
  ● Good coverage of the French web
  ● Easy access via AWS / MapReduce jobs
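As an illustration of the MapReduce style mentioned above, here is a minimal sketch in plain Python that counts crawled pages per host. It does not touch Common Crawl itself: the input list of URLs and the names `map_page`, `reduce_counts` and `pages_per_host` are hypothetical stand-ins for whatever a real job would read and run.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def map_page(url):
    """Map step: emit (host, 1) for one crawled page URL."""
    host = urlsplit(url).hostname
    if host:  # skip records without a usable host
        yield host, 1


def reduce_counts(pairs):
    """Reduce step: sum the counts emitted for each host."""
    totals = defaultdict(int)
    for host, count in pairs:
        totals[host] += count
    return dict(totals)


def pages_per_host(urls):
    """Run the map step over every URL, then reduce the emitted pairs."""
    pairs = (pair for url in urls for pair in map_page(url))
    return reduce_counts(pairs)
```

The same two-step shape (emit a key per page, aggregate per key) is what the per-website scores later in the deck rely on.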
From a crawl to a map
Working on the French web
 ● The .fr TLD is not a relevant criterion for detection
 ● Detecting page language
 ● Giving websites a "frenchness" score
     ○ Sw = number of French pages / total number of pages
     ○ Cutoff chosen manually by testing on French
       websites
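The score above can be sketched as follows. This assumes each page already carries a language code from some upstream detector (not named on the slide); the function names and the 0.5 cutoff are placeholders, since the cutoff actually used was tuned by hand.

```python
from collections import defaultdict


def site_scores(pages, target="fr"):
    """Return {site: share of its pages whose language is `target`}.

    `pages` is an iterable of (site, language_code) pairs; the language
    code is assumed to come from an upstream language detector.
    """
    total = defaultdict(int)
    matching = defaultdict(int)
    for site, lang in pages:
        total[site] += 1
        if lang == target:
            matching[site] += 1
    return {site: matching[site] / total[site] for site in total}


def french_sites(pages, cutoff=0.5):
    """Keep the sites whose score reaches the cutoff (0.5 is a placeholder)."""
    return {site for site, score in site_scores(pages).items() if score >= cutoff}
```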
From a crawl to a map
Working on Open Data websites
 ● Building an Open Data "vocabulary"
 ● Detecting whether a page is about Open
    Data
 ● Giving websites an "opendataness" score
     ○ Sw = number of Open Data pages / total number of pages
     ○ Cutoff chosen manually by testing on Open Data
       websites
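A minimal sketch of vocabulary-based detection and the resulting score. The vocabulary below is invented for illustration (the slide does not list the real terms), and counting a page as soon as one term matches is an assumption, as are the function names.

```python
import re

# Hypothetical vocabulary; the terms used in the talk are not given.
VOCABULARY = {"open data", "opendata", "données ouvertes", "données publiques"}


def is_open_data_page(text, vocabulary=VOCABULARY, min_hits=1):
    """Flag a page when it contains at least `min_hits` vocabulary terms."""
    normalized = re.sub(r"\s+", " ", text.lower())
    hits = sum(1 for term in vocabulary if term in normalized)
    return hits >= min_hits


def opendataness(page_texts):
    """Sw = number of Open Data pages / total number of pages for one site."""
    if not page_texts:
        return 0.0
    flagged = sum(1 for text in page_texts if is_open_data_page(text))
    return flagged / len(page_texts)
```

Raising `min_hits` makes page detection stricter, which lowers site scores and interacts with the manually chosen cutoff.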
From a crawl to a map
Building graph
  ● Inside our subset
     ○ Inlinks
     ○ Outlinks
  ● Generating two files
     ○ nodes.csv (list of websites with an id)
     ○ edges.csv (directed links between websites)


  [Diagram: Node A with two inlinks and one outlink]
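The two files could be produced along these lines. This is a sketch under assumptions: `links` is a list of (source site, target site) pairs already extracted from the crawl, duplicate links and self-links are dropped, and the column names passed to `write_csv` are up to the caller.

```python
import csv
import io


def build_graph(links, subset):
    """Keep links whose two ends are in `subset`; return (nodes, edges).

    `nodes` maps each website to an integer id; `edges` is a sorted list
    of distinct (source_id, target_id) pairs, self-links excluded.
    """
    nodes = {site: i for i, site in enumerate(sorted(subset))}
    edges = sorted({
        (nodes[src], nodes[dst])
        for src, dst in links
        if src in nodes and dst in nodes and src != dst
    })
    return nodes, edges


def write_csv(header, rows, out):
    """Write a header line followed by the rows to a file-like object."""
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
```

For example, `write_csv(["source", "target"], edges, open("edges.csv", "w", newline=""))` would write the edge list; the header names here are illustrative.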
From a crawl to a map
Building graph
  ● Links tell a lot about websites
     ○ Authorities
     ○ Hubs
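The terms "authorities" and "hubs" come from Kleinberg's HITS algorithm. The deck does not say HITS was actually run (a later slide sizes nodes by inlink count), so the following is only a sketch of how such scores can be computed from a directed edge list; the function names are illustrative.

```python
import math


def normalize(scores):
    """Scale scores to unit Euclidean length (left unchanged if all zero)."""
    norm = math.sqrt(sum(v * v for v in scores.values())) or 1.0
    return {node: v / norm for node, v in scores.items()}


def hits(edges, iterations=50):
    """Return (authority, hub) score dicts for a list of (source, target)."""
    nodes = sorted({n for edge in edges for n in edge})
    auth = {n: 1.0 for n in nodes}
    hub = {n: 1.0 for n in nodes}
    for _ in range(iterations):
        # A site is a good authority if good hubs link to it.
        new_auth = {n: 0.0 for n in nodes}
        for src, dst in edges:
            new_auth[dst] += hub[src]
        # A site is a good hub if it links to good authorities.
        new_hub = {n: 0.0 for n in nodes}
        for src, dst in edges:
            new_hub[src] += new_auth[dst]
        auth = normalize(new_auth)
        hub = normalize(new_hub)
    return auth, hub
```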
From a crawl to a map
Visualizing graph using Gephi
  ● Load graph
  ● Spatialize graph
     ○ links between websites create "attraction", pulling
       linked sites close to each other
     ○ the more inlinks, the bigger the node (= authority)
     ○ websites are categorized for better understanding
       (one color per category)
        ■ Companies, Non-profits/blogs, Government
           agencies
     ○ communities can now appear!
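To make the "attraction" idea concrete, here is a small force-directed layout sketch in the style of Fruchterman-Reingold. The slide does not name the Gephi layout that was used, so this is an assumption, and the parameters (`k`, `step`, `max_move`) are arbitrary illustrative values.

```python
import math


def force_layout(nodes, edges, iterations=200, k=1.0, step=0.1, max_move=0.1):
    """Return {node: [x, y]} after a simple force-directed layout.

    `nodes` is a list of node names, `edges` a list of (a, b) pairs.
    """
    n = len(nodes)
    # Deterministic start: nodes evenly spaced on the unit circle.
    pos = {v: [math.cos(2 * math.pi * i / n), math.sin(2 * math.pi * i / n)]
           for i, v in enumerate(nodes)}
    for _ in range(iterations):
        force = {v: [0.0, 0.0] for v in nodes}
        # Every pair of nodes repels, so unrelated sites drift apart.
        for i, a in enumerate(nodes):
            for b in nodes[i + 1:]:
                dx, dy = pos[a][0] - pos[b][0], pos[a][1] - pos[b][1]
                dist = max(math.hypot(dx, dy), 1e-9)
                f = k * k / dist
                force[a][0] += dx / dist * f
                force[a][1] += dy / dist * f
                force[b][0] -= dx / dist * f
                force[b][1] -= dy / dist * f
        # Each link attracts its two ends, whatever its direction.
        for a, b in edges:
            dx, dy = pos[a][0] - pos[b][0], pos[a][1] - pos[b][1]
            dist = max(math.hypot(dx, dy), 1e-9)
            f = dist * dist / k
            force[a][0] -= dx / dist * f
            force[a][1] -= dy / dist * f
            force[b][0] += dx / dist * f
            force[b][1] += dy / dist * f
        # Move each node along its net force, capping the step length.
        for v in nodes:
            fx, fy = force[v]
            length = math.hypot(fx, fy)
            if length > 0:
                scale = min(step * length, max_move) / length
                pos[v][0] += fx * scale
                pos[v][1] += fy * scale
    return pos
```

Linked sites settle near each other while unlinked groups are pushed apart, which is what lets communities become visible.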
From a crawl to a map
Visualizing graph on the web
  ● Sigma.js
  ● Uses Gephi files
  ● Gives better interactivity
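Gephi can export graphs as GEXF, an XML format. Assuming that is the kind of Gephi file meant here (the slide does not say), the sketch below shows the minimal structure of such a file by writing one by hand. In practice Gephi produces the file itself, and the function name `to_gexf` is illustrative.

```python
import xml.etree.ElementTree as ET

GEXF_NS = "http://www.gexf.net/1.2draft"


def to_gexf(nodes, edges):
    """Serialize {node_id: label} and [(source_id, target_id)] as GEXF text."""
    root = ET.Element("gexf", {"xmlns": GEXF_NS, "version": "1.2"})
    graph = ET.SubElement(root, "graph", {"defaultedgetype": "directed"})
    nodes_el = ET.SubElement(graph, "nodes")
    for node_id, label in nodes.items():
        ET.SubElement(nodes_el, "node", {"id": str(node_id), "label": label})
    edges_el = ET.SubElement(graph, "edges")
    for i, (src, dst) in enumerate(edges):
        ET.SubElement(edges_el, "edge",
                      {"id": str(i), "source": str(src), "target": str(dst)})
    return ET.tostring(root, encoding="unicode")
```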
Analysis
● The final graph is a good way to understand
  interactions between actors
  ○ Open Data was clearly initiated by a non-profit
    movement
  ○ Companies are beginning to work on the subject
  ○ The French state has only had sporadic initiatives
    so far
● This graph will be generated again in the near
  future, to see changes in this ecosystem
Results
● Large scale crawl made easy
  ○ Easy to focus on mining the results instead of
    finding/storing the data
● Nice workflow from raw data to an
  understandable visualisation
● The final graph is a good way to understand
  interactions between actors
Feedback
● Common Crawl
  ○ Common Crawl does not yet have an exhaustive crawl
    of the French web
  ○ Data is not as fresh as it could be
  ○ It lacks an index giving O(1) access to domains at
    least, and ideally to pages
● Methodology
  ○ Opendataness scoring can exclude websites that are
    relevant but not focused enough on open data
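The kind of index wished for under the Common Crawl feedback could look like the hash map sketched here, where a domain lookup is O(1) on average. The record layout (URL, archive file name, byte offset) is a hypothetical choice for this example, not a description of any Common Crawl format.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def build_domain_index(records):
    """Map each host to the (archive_file, offset) locations of its pages.

    `records` is an iterable of (url, archive_file, offset) tuples, a
    hypothetical layout chosen for this example.
    """
    index = defaultdict(list)
    for url, archive_file, offset in records:
        host = urlsplit(url).hostname
        if host:
            index[host].append((archive_file, offset))
    return dict(index)
```

With such an index, a job interested in a few hundred domains could read only the listed locations instead of scanning the whole archive.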
Resources
● http://webatlas.fr/tempshare/OpenDataActeursTypes.pdf
   ○ poster by Franck Ghitalla
● http://french-opendata.data-publica.com/index.html
   ○ dynamic visualisation of the results, by Data Publica
● http://fr.slideshare.net/willounet/a-sneak-peek-into-the-web-presentation
   ○ A sneak peek into the web, by GL
● http://french-opendata.data-publica.com/
   ○ Project host page
