This document discusses Istat's efforts to promote open data through its Linked Open Data portal. It provides an overview of the open data movement and benefits of open data ecosystems. The document then describes Istat's Linked Open Data portal, including its features for navigating, searching, querying, and visualizing open data. Examples of use cases that leverage the portal for spatial querying across datasets and federated querying across multiple open data portals are also presented. The conclusion reinforces that a dissemination strategy based on open data can better serve users by making data more accessible and comparable.
Open Data Analytics for Parliamentary Monitoring in FinlandLouhos
The document discusses developing open analytics tools for parliamentary data in Finland. It notes that a lack of tools is hindering access to and monitoring of parliamentary data. Developing flexible research and analysis tools will help realize the full potential of new open government information resources. The Louhos repository aims to develop code for accessing hundreds of Finnish data sources and apply new research tools to monitor decision making. General purpose software like the R library SoRvi will integrate open data, algorithms, and applications to enable analyses like topic modeling of parliamentary debates. The goal is to build sustainable infrastructure for parliamentary monitoring through collaborations between individuals, organizations, and media.
Barbieri De Francisci_Dealing with Open Data at ISTATGiovanni Barbieri
Dealing with Open Data at ISTAT. First Steps Towards a Perfect Data Portal: a Cutting-edge Approach for Dealing with Open Data
IAOS, Abu Dhabi, December 6, 2016
Open Data and the transparency of the lists of beneficiaries of EU Regional P...OpenCoesione
This document discusses open data and transparency regarding lists of beneficiaries of EU regional policy funding. It summarizes the results of surveys conducted in 2010-2012 that evaluated how openly and usefully different EU countries and regions published this funding data. The surveys found three main strategies for publishing the data and that over time more regions/countries shifted towards a balanced approach considering both data openness and usability. Open questions remain regarding factors influencing transparency and how to best promote civic engagement with the data.
What is linked data
What is open data
What is the difference between linked and open data
How to publish linked data (5-star schema)
The economic and social aspects of linked data.
This document summarizes the work of Slim Turki and Prune Gautier on open data and data ecosystems since 2012. It discusses their projects on open data quality, how open data is turned into services, and data ecosystem governance. It also outlines recommendations for establishing sustainable data ecosystems, including collaborative governance, stakeholder engagement, technical standards, and economic sustainability. Finally, it notes trends moving from open data provision to ecosystem thinking with high value datasets, the European Data Strategy, and opportunities around health, environment, and urban digital twins data.
Census microdata from different countries and time periods is currently difficult to access, combine, and analyze due to differences in format and granularity. The authors propose applying Linked Open Data principles and semantic web technologies to publish census microdata in order to address these issues. They present a process for converting census microdata to Linked Open Data and apply it to two case studies: the 2001 Spanish census and the Integrated Public Use Microdata Series International framework. The results show census microdata can be published as Linked Open Data while preserving original structures, and this approach enables harmonization and integration across data sources.
Open Data: Barriers, Risks, and OpportunitiesSlim Turki, Dr.
Despite the development of Open Data platforms, the wider deployment of Open Data still faces significant barriers. It requires identifying the obstacles that have prevented e-government bodies either from implementing an Open Data strategy or from ensuring its sustainability.
This paper presents the results of a study carried out between June and November 2012, in which we analyzed three cases of Open Data development through their platforms, in a medium size city (Rennes, France), a large city (Berlin, Germany), and at national level (UK). It aims to draw a clear typology of challenges, risks, limitations and barriers related to Open Data. Indeed the issues and constraints faced by re-users of public data differ from the ones encountered by the public data providers. Through the analysis of the experiences in opening data, we attempt to identify how barriers were overcome and how risks were managed. Beyond passionate debates in favor or against Open Data, we propose to consider the development of an Open Data initiative in terms of risks, contingency actions, and expected opportunities. We therefore present in this paper the risks to Open Data organized in 7 categories: (1) governance, (2) economic issues, (3) licenses and legal frameworks, (4) data characteristics, (5) metadata, (6) access, and (7) skills.
Sébastien Martin 1, Muriel Foulonneau 2, Slim Turki 2, Madjid Ihadjadene 1
1 Université Paris 8, Vincennes-Saint-Denis, France
2 PRC Henri Tudor, Luxembourg, Luxembourg
This document summarizes a decade of research into online election campaigns in Scotland from 2003 to 2013. The research analyzed party and candidate websites and social media presence during election periods. It found that while the use of online technologies has grown, parties and candidates primarily use them for one-way broadcasting of information rather than meaningful engagement with voters. Surveys of voters found they were unimpressed with the superficial social media efforts and that online information had little influence on their voting decisions. The research concludes that while technologies have advanced, the basic nature of online campaigning in Scotland remains focused on broadcasting rather than two-way dialogue.
Open Data Analytics for Parliamentary Monitoring in FinlandLouhos
The document discusses developing open analytics tools for parliamentary data in Finland. It notes that a lack of tools is hindering access to and monitoring of parliamentary data. Developing flexible research and analysis tools will help realize the full potential of new open government information resources. The Louhos repository aims to develop code for accessing hundreds of Finnish data sources and apply new research tools to monitor decision making. General purpose software like the R library SoRvi will integrate open data, algorithms, and applications to enable analyses like topic modeling of parliamentary debates. The goal is to build sustainable infrastructure for parliamentary monitoring through collaborations between individuals, organizations, and media.
Barbieri De Francisci_Dealing with Open Data at ISTATGiovanni Barbieri
Dealing with Open Data at ISTAT. First Steps Towards a Perfect Data Portal: a Cutting-edge Approach for Dealing with Open Data
IAOS, Abu Dhabi, December 6, 2016
Open Data and the transparency of the lists of beneficiaries of EU Regional P...OpenCoesione
This document discusses open data and transparency regarding lists of beneficiaries of EU regional policy funding. It summarizes the results of surveys conducted in 2010-2012 that evaluated how openly and usefully different EU countries and regions published this funding data. The surveys found three main strategies for publishing the data and that over time more regions/countries shifted towards a balanced approach considering both data openness and usability. Open questions remain regarding factors influencing transparency and how to best promote civic engagement with the data.
What is linked data
What is open data
What is the difference between linked and open data
How to publish linked data (5-star schema)
The economic and social aspects of linked data.
This document summarizes the work of Slim Turki and Prune Gautier on open data and data ecosystems since 2012. It discusses their projects on open data quality, how open data is turned into services, and data ecosystem governance. It also outlines recommendations for establishing sustainable data ecosystems, including collaborative governance, stakeholder engagement, technical standards, and economic sustainability. Finally, it notes trends moving from open data provision to ecosystem thinking with high value datasets, the European Data Strategy, and opportunities around health, environment, and urban digital twins data.
Census microdata from different countries and time periods is currently difficult to access, combine, and analyze due to differences in format and granularity. The authors propose applying Linked Open Data principles and semantic web technologies to publish census microdata in order to address these issues. They present a process for converting census microdata to Linked Open Data and apply it to two case studies: the 2001 Spanish census and the Integrated Public Use Microdata Series International framework. The results show census microdata can be published as Linked Open Data while preserving original structures, and this approach enables harmonization and integration across data sources.
Open Data: Barriers, Risks, and OpportunitiesSlim Turki, Dr.
Despite the development of Open Data platforms, the wider deployment of Open Data still faces significant barriers. It requires identifying the obstacles that have prevented e-government bodies either from implementing an Open Data strategy or from ensuring its sustainability.
This paper presents the results of a study carried out between June and November 2012, in which we analyzed three cases of Open Data development through their platforms, in a medium size city (Rennes, France), a large city (Berlin, Germany), and at national level (UK). It aims to draw a clear typology of challenges, risks, limitations and barriers related to Open Data. Indeed the issues and constraints faced by re-users of public data differ from the ones encountered by the public data providers. Through the analysis of the experiences in opening data, we attempt to identify how barriers were overcome and how risks were managed. Beyond passionate debates in favor or against Open Data, we propose to consider the development of an Open Data initiative in terms of risks, contingency actions, and expected opportunities. We therefore present in this paper the risks to Open Data organized in 7 categories: (1) governance, (2) economic issues, (3) licenses and legal frameworks, (4) data characteristics, (5) metadata, (6) access, and (7) skills.
Sébastien Martin 1, Muriel Foulonneau 2, Slim Turki 2, Madjid Ihadjadene 1
1 Université Paris 8, Vincennes-Saint-Denis, France
2 PRC Henri Tudor, Luxembourg, Luxembourg
This document summarizes a decade of research into online election campaigns in Scotland from 2003 to 2013. The research analyzed party and candidate websites and social media presence during election periods. It found that while the use of online technologies has grown, parties and candidates primarily use them for one-way broadcasting of information rather than meaningful engagement with voters. Surveys of voters found they were unimpressed with the superficial social media efforts and that online information had little influence on their voting decisions. The research concludes that while technologies have advanced, the basic nature of online campaigning in Scotland remains focused on broadcasting rather than two-way dialogue.
This document discusses opportunities for Minnesota state government to leverage big data and information technology. It notes that while the state collects large amounts of data across multiple systems, the data is not well integrated. The next step is to begin aggregating data from different systems to create value. This presents both technical challenges in data integration and governance challenges in responsibly handling citizens' data. Examples are given of using big data for human resources, public services, and citizen outreach. Minnesota has also issued an RFP and received funding to develop data analytics and a statewide longitudinal data system.
Proposal for Designing a Linked Data Migrational Framework for Singapore Gove...Aravind Sesagiri Raamkumar
This proposal suggests a framework for migrating Singapore government data sets to a linked data format. It involves studying Singapore's current open government data ecosystem and selected data sets from the Department of Statistics and Singapore Land Authority. The objectives are to design a multi-step methodology for publishing linked open government data, build an ontology network to unify agency vocabularies, and validate the framework. Key challenges include agencies not providing raw data and a lack of standardization across master data dimensions. The proposed report outline will cover linked data fundamentals, case studies, and recommendations for implementing the framework.
Open Government Data on the Web - A Semantic ApproachPeter Krantz
(upload with permission from Armand Brahaj)
Initiatives of making governmental data open are continuously gaining interest recently. While this presents immense benefits for increasing transparency, the problem is that the data are frequently offered in heterogeneous formats, missing clear semantics that clarify what the data describes. The data are displayed in ways that are not always clearly understandable to a broad range of user communities that need to make informed decisions.
The document discusses four levels of interoperability - technical, semantic, organizational, and legal - that are important for open data systems. Technical interoperability involves combining data from different sources and formats, while semantic interoperability requires clarifying meanings and metadata. Organizational interoperability concerns aligning data collection between sources. Legal interoperability necessitates using established open licenses to allow data reuse without legal barriers. Achieving interoperability across these levels maximizes the value of open data.
This document discusses data mining techniques for social media. It begins by reviewing the growth of popular social media sites like Facebook, YouTube, and Twitter. It then discusses how social media generates huge amounts of user data through interactions and content sharing. The document outlines opportunities to use data mining on social networks to gain insights into human behavior, marketing analytics, and more. It reviews common problems studied, like community detection, node classification, and modeling information flow. The conclusion emphasizes that social media provides a massive, open dataset for developing recommender systems and targeting marketing through predictive analysis of user interests and trends.
This proposal outlines a framework for migrating Singapore government data sets to a linked data format. It aims to standardize data representation and link datasets across agencies to improve access, reuse and knowledge discovery. The objectives are to explore linked open government data case studies, assess linked data tools, and provide recommendations for implementation. The framework will validate the utility of linking geographical and statistical data. It seeks to unify agency vocabularies and simplify access through a common publication interface. The framework could guide Singapore's adoption of linked open data and help other entities publish their data on the semantic web.
1) Open data is adding a new dimension to big data analytics and data-driven innovations. Official statistics can more easily reach a wide range of users, like citizens, journalists, and educators, if conveyed through open data.
2) Istat has developed a Linked Open Data portal to make its statistical data openly available in accordance with semantic web standards. This allows for spatial querying of data and federated querying across different data sources.
3) The portal serves as an open data provider, dynamically integrating social platforms to allow discussion around visualizations of census data. An open data dissemination strategy places users at the center by reaching them through different channels and making data easier to access and enrich.
This document discusses data mining in social networks. It covers topics like social network analysis, graph mining, and text mining on social media platforms. Graph mining is used to understand relationships and extract communities from social networks. Text mining techniques like clustering and anomaly detection are applied to textual data from blogs, messages, etc. on social platforms. The document also discusses accessing Facebook data through its API and SDK, and applications and limitations of social network analysis.
2010 06-08 chania stochastic web modelling - copyvafopoulos
The document summarizes research on modeling the evolution of the World Wide Web as a complex system. It discusses the Web's structure as a directed graph and statistical properties like power law degree distributions and small world properties. It describes models of Web traffic and evolution that use concepts from statistical physics and complex networks. Game theoretic and query-based models are also summarized. The document focuses on a query-Web model that explains the Web's scale-free structure through the interaction of users, documents, and search engines.
Evolving social data mining and affective analysis Athena Vakali
Evolving social data mining and affective analysis methodologies, framework and applications - Web 2.0 facts and social data
Social associations and all kinds of graphs
Evolving social data mining
Emotion-aware social data analysis
Frameworks and Applications
The paper describes the work being conducted in the Cross-institutional Authority Collaboration (Institutionenübergreifende Integration von Normdaten, IN2N) project. This pilot project, executed in cooperation with the German National Library and the German Film Institute, aims to establish new collaboration models to improve cross-domain authority maintenance. The paper outlines applied strategies for providing a shared infrastructure as well as workflows for exchanging data about persons; interface enhancements permitting the exploitation of innovative web approaches; and cross-institutional data search and representation solutions. Furthermore, we discuss specific boundary conditions, such as disparities in the level of data granularity, for an interoperable cataloguing environment.
Open Linked Data as Part of a Government Enterprise ArchitectureJohann Höchtl
This document discusses open linked data and its role in public administration information management. It presents open linked data as a key element of transparency, participation, and collaboration in government. It outlines the benefits of open data for citizens, such as increased self-determination and better public services. However, it also notes potential drawbacks, such as loss of control and power for administrations. The document proposes an open government data architecture model based on a five-level saturation model, with data encoded in RDF and assigned persistent URIs to ensure reliability. Overall, the document argues that treating information outflow as a core part of information management can strengthen trust in government.
The Politics of Open Data: Past, Present and FutureJonathan Gray
Slides for presentation on “The Politics of Open Data: Past, Present and Future” at the Data Power conference at the University of Sheffield, 22nd June 2015.
EDF2014: Christian Lindemann, Wolters Kluwer Germany & Christian Dirschl, Wol...European Data Forum
Invited Talk by Christian Lindemann, Wolters Kluwer Germany & Christian Dirschl, Wolters Kluwer Germany at the European Data Forum 2014, 20 March 2014 in Athens, Greece: Linked Data and Open Government Data as part of the business strategy of Wolters Kluwer Germany.
Ligado nos Políticos at ESWC'2011 WorkshopPablo Mendes
Publishing Linked Data from Brazilian Politicians on the Web
Lucas de Ramos Araújo
Pablo N. Mendes
Jairo Francisco de Souza
At the Workshop on Semantics in Governance and Policy Modelling, Extended Semantic Web Conference 2011 ESWC2010.
May 30, 2011 - Crete, Greece
Farirhair.ai: AI platform to mine competitive intelligence from billions of u...Aditya Jami
My presentation at ODSC 2017. Video link: https://www.youtube.com/watch?v=Mwv6dSTYvN4&t=
AI engine for unsupervised full-site web extraction from millions of websites, supervised machine learning methods to fuse these distinct information sources and link them into our domain-specific probabilistic knowledge graph with over 275 million facts mined so far. Also shared practical learnings on how we combine traditional information extraction techniques with recent advancements in deep learning for a variety of NLP tasks such as entity-level sentiment and relation extraction on 10’s of millions of new documents a day across 15 different languages.
towards an expanded and integrated ogd agenda for india // icegov 2013 // seoulSumandro C
Slides from the paper titled 'Towards an Expanded and Integrated Open Government Data Agenda for India' presented at ICEGOV, in Seoul on October 22, 2013. Paper can be accessed here: https://github.com/ajantriks/writings/blob/master/sumandro_expanded_and_integrated_ogd_agenda_for_India.md
This document summarizes a presentation on open data in Italy given by Lorenzo Benussi. The presentation provided background on open data and big data trends. It defined key open data concepts like open knowledge definitions and open data licenses. Examples of open data portals from around the world and in Italy were presented. Challenges around open data quality, explanation and engagement were also discussed. The presentation concluded that open data has the potential to transform how information is managed, markets function, and the relationship between government and citizens.
This document discusses opportunities for Minnesota state government to leverage big data and information technology. It notes that while the state collects large amounts of data across multiple systems, the data is not well integrated. The next step is to begin aggregating data from different systems to create value. This presents both technical challenges in data integration and governance challenges in responsibly handling citizens' data. Examples are given of using big data for human resources, public services, and citizen outreach. Minnesota has also issued an RFP and received funding to develop data analytics and a statewide longitudinal data system.
Proposal for Designing a Linked Data Migrational Framework for Singapore Gove...Aravind Sesagiri Raamkumar
This proposal suggests a framework for migrating Singapore government data sets to a linked data format. It involves studying Singapore's current open government data ecosystem and selected data sets from the Department of Statistics and Singapore Land Authority. The objectives are to design a multi-step methodology for publishing linked open government data, build an ontology network to unify agency vocabularies, and validate the framework. Key challenges include agencies not providing raw data and a lack of standardization across master data dimensions. The proposed report outline will cover linked data fundamentals, case studies, and recommendations for implementing the framework.
Open Government Data on the Web - A Semantic ApproachPeter Krantz
(upload with permission from Armand Brahaj)
Initiatives of making governmental data open are continuously gaining interest recently. While this presents immense benefits for increasing transparency, the problem is that the data are frequently offered in heterogeneous formats, missing clear semantics that clarify what the data describes. The data are displayed in ways that are not always clearly understandable to a broad range of user communities that need to make informed decisions.
The document discusses four levels of interoperability - technical, semantic, organizational, and legal - that are important for open data systems. Technical interoperability involves combining data from different sources and formats, while semantic interoperability requires clarifying meanings and metadata. Organizational interoperability concerns aligning data collection between sources. Legal interoperability necessitates using established open licenses to allow data reuse without legal barriers. Achieving interoperability across these levels maximizes the value of open data.
This document discusses data mining techniques for social media. It begins by reviewing the growth of popular social media sites like Facebook, YouTube, and Twitter. It then discusses how social media generates huge amounts of user data through interactions and content sharing. The document outlines opportunities to use data mining on social networks to gain insights into human behavior, marketing analytics, and more. It reviews common problems studied, like community detection, node classification, and modeling information flow. The conclusion emphasizes that social media provides a massive, open dataset for developing recommender systems and targeting marketing through predictive analysis of user interests and trends.
This proposal outlines a framework for migrating Singapore government data sets to a linked data format. It aims to standardize data representation and link datasets across agencies to improve access, reuse and knowledge discovery. The objectives are to explore linked open government data case studies, assess linked data tools, and provide recommendations for implementation. The framework will validate the utility of linking geographical and statistical data. It seeks to unify agency vocabularies and simplify access through a common publication interface. The framework could guide Singapore's adoption of linked open data and help other entities publish their data on the semantic web.
1) Open data is adding a new dimension to big data analytics and data-driven innovations. Official statistics can more easily reach a wide range of users, like citizens, journalists, and educators, if conveyed through open data.
2) Istat has developed a Linked Open Data portal to make its statistical data openly available in accordance with semantic web standards. This allows for spatial querying of data and federated querying across different data sources.
3) The portal serves as an open data provider, dynamically integrating social platforms to allow discussion around visualizations of census data. An open data dissemination strategy places users at the center by reaching them through different channels and making data easier to access and enrich.
This document discusses data mining in social networks. It covers topics like social network analysis, graph mining, and text mining on social media platforms. Graph mining is used to understand relationships and extract communities from social networks. Text mining techniques like clustering and anomaly detection are applied to textual data from blogs, messages, etc. on social platforms. The document also discusses accessing Facebook data through its API and SDK, and applications and limitations of social network analysis.
2010 06-08 chania stochastic web modelling - copyvafopoulos
The document summarizes research on modeling the evolution of the World Wide Web as a complex system. It discusses the Web's structure as a directed graph and statistical properties like power law degree distributions and small world properties. It describes models of Web traffic and evolution that use concepts from statistical physics and complex networks. Game theoretic and query-based models are also summarized. The document focuses on a query-Web model that explains the Web's scale-free structure through the interaction of users, documents, and search engines.
Evolving social data mining and affective analysis Athena Vakali
Evolving social data mining and affective analysis methodologies, framework and applications - Web 2.0 facts and social data
Social associations and all kinds of graphs
Evolving social data mining
Emotion-aware social data analysis
Frameworks and Applications
The paper describes the work being conducted in the Cross-institutional Authority Collaboration (Institutionenübergreifende Integration von Normdaten, IN2N) project. This pilot project, executed in cooperation with the German National Library and the German Film Institute, aims to establish new collaboration models to improve cross-domain authority maintenance. The paper outlines applied strategies for providing a shared infrastructure as well as workflows for exchanging data about persons; interface enhancements permitting the exploitation of innovative web approaches; and cross-institutional data search and representation solutions. Furthermore, we discuss specific boundary conditions, such as disparities in the level of data granularity, for an interoperable cataloguing environment.
Open Linked Data as Part of a Government Enterprise ArchitectureJohann Höchtl
This document discusses open linked data and its role in public administration information management. It presents open linked data as a key element of transparency, participation, and collaboration in government. It outlines the benefits of open data for citizens, such as increased self-determination and better public services. However, it also notes potential drawbacks, such as loss of control and power for administrations. The document proposes an open government data architecture model based on a five-level saturation model, with data encoded in RDF and assigned persistent URIs to ensure reliability. Overall, the document argues that treating information outflow as a core part of information management can strengthen trust in government.
The Politics of Open Data: Past, Present and FutureJonathan Gray
Slides for presentation on “The Politics of Open Data: Past, Present and Future” at the Data Power conference at the University of Sheffield, 22nd June 2015.
EDF2014: Christian Lindemann, Wolters Kluwer Germany & Christian Dirschl, Wol...European Data Forum
Invited Talk by Christian Lindemann, Wolters Kluwer Germany & Christian Dirschl, Wolters Kluwer Germany at the European Data Forum 2014, 20 March 2014 in Athens, Greece: Linked Data and Open Government Data as part of the business strategy of Wolters Kluwer Germany.
Ligado nos Políticos at ESWC'2011 WorkshopPablo Mendes
Publishing Linked Data from Brazilian Politicians on the Web
Lucas de Ramos Araújo
Pablo N. Mendes
Jairo Francisco de Souza
At the Workshop on Semantics in Governance and Policy Modelling, Extended Semantic Web Conference 2011 ESWC2010.
May 30, 2011 - Crete, Greece
Farirhair.ai: AI platform to mine competitive intelligence from billions of u...Aditya Jami
My presentation at ODSC 2017. Video link: https://www.youtube.com/watch?v=Mwv6dSTYvN4&t=
AI engine for unsupervised full-site web extraction from millions of websites, supervised machine learning methods to fuse these distinct information sources and link them into our domain-specific probabilistic knowledge graph with over 275 million facts mined so far. Also shared practical learnings on how we combine traditional information extraction techniques with recent advancements in deep learning for a variety of NLP tasks such as entity-level sentiment and relation extraction on 10’s of millions of new documents a day across 15 different languages.
towards an expanded and integrated ogd agenda for india // icegov 2013 // seoulSumandro C
Slides from the paper titled 'Towards an Expanded and Integrated Open Government Data Agenda for India' presented at ICEGOV, in Seoul on October 22, 2013. Paper can be accessed here: https://github.com/ajantriks/writings/blob/master/sumandro_expanded_and_integrated_ogd_agenda_for_India.md
This document summarizes a presentation on open data in Italy given by Lorenzo Benussi. The presentation provided background on open data and big data trends. It defined key open data concepts like open knowledge definitions and open data licenses. Examples of open data portals from around the world and in Italy were presented. Challenges around open data quality, explanation and engagement were also discussed. The presentation concluded that open data has the potential to transform how information is managed, markets function, and the relationship between government and citizens.
Where does EU money go? Availability and quality of Open Data on the recipien...Luigi Reggi
The document discusses a study analyzing the availability and quality of open data on recipients of EU structural funds. It examines literature on open government data and its potential benefits. The study collected data from the websites of 434 EU operational programs over three time periods, analyzing 33 variables across dimensions of stewardship and usefulness. Nonlinear principal component and cluster analyses were used to identify evolving open data strategies and determine the factors influencing strategic choices.
The document discusses how the flow of information is changing as more data becomes openly available through new technologies. It describes four trends: 1) Open data is being used privately as governments and organizations make raw data freely accessible in open formats; 2) Interactive tools are making data simple to understand in real-time; 3) Data is being generated and used to support urban development and infrastructure planning; 4) Examples show how open financial and map data is empowering individuals and communities.
FIWARE Global Summit - The Digital Single Market - Benefits and Solutions for...FIWARE
Presentation by Daniele Rizzi
Principal Administrator and Policy Officer, Connecting Europe Facility Program, European Commission
FIWARE Global Summit
27-28 November 2018
Malaga, Spain
What does “BIG DATA” mean for official statistics?Vincenzo Patruno
In our modern world more and more data are generated on the web and produced by sensors in the ever growing number of electronic devices surrounding us. The amount of data and the frequency at which they are produced have led to the concept of 'Big data'. Big data is characterized as data sets of increasing volume, velocity and variety; the 3 V's. Big data is often largely unstructured, meaning that it has no pre-defined data model and/or does not fit well into conventional relational databases.
OpenTransportNet: Stimulating Innovation with Open Geographic Information21cConsultancy_2012
1) The document discusses OpenTransportNet (OTN), a European project that aims to stimulate business innovation and enhance public services by improving access to open geographic information.
2) In its first year, OTN worked to create an INSPIRE-compliant data model for transport networks and expose aggregated and harmonized transport data through virtual service hubs.
3) OTN addresses challenges of disharmonized and scattered data by bringing together spatial, dynamic, and non-spatial data sources and using techniques like metadata catalogues, data visualization tools, and privacy controls.
Web samia mehlem open data and wb main presentationGlobalForum
The document discusses open data and its benefits. Open data refers to data that is publicly available, machine-readable, and can be used, reused and redistributed without restrictions. Open data benefits governments and citizens by increasing transparency, accountability and engagement. It also enables innovation and economic growth. The document provides examples of how open data has been used to create business opportunities and jobs, improve public services, and develop apps for citizens. It emphasizes that successful open data initiatives require connecting data suppliers to users and engaging stakeholders across sectors through ongoing collaboration.
OPEN KNOWLEDGE PLATFORM USE-CASES - TugaIT 2018Pedro Sousa
Many of Open Knowledge International’s projects are technical in nature. Its most prominent project, CKAN, is used by many of the world’s governments to host open catalogues of data that their countries possess.
CKAN is a tool for making open data websites. (Think of a content management system like WordPress – but for data, instead of pages and blog posts.) It helps you manage and publish collections of data. It is used by national and local governments, research institutions, and other organizations who collect a lot of data.
In this talk I’ll go over some use-cases of Open Knowledge Platform implementations by the Portuguese Government, the architectural features, the difficulties and different approaches to solve them.
Vassilios Peristeras: From Open to Linked Government Data: (European Commissi...FIA2010
The document discusses the transition from documents to datasets to linked data in how governments make public sector information available. It describes how information has shifted from being locked in documents, to being available as datasets in various formats through data catalogs, to the emerging approach of linked open government data which will be interlinked and put in context. The document outlines European Commission actions to support this transition and increase access to public sector information through initiatives like the LOD2 and LATC projects and the SEMIC registry. It argues that linked data seems to cover business requirements and there is momentum growing around the approach.
This document summarizes a presentation on linked open government data. It discusses how government data is being opened through initiatives like Data.gov and how linked data approaches can help address challenges in making open government data more interoperable, scalable, and able to maintain provenance. Key points discussed include the growth of open government data, challenges in working with raw open data, benefits of converting data to linked open formats, and open questions around improving interoperability, addressing scalability issues, and maintaining provenance as open government data continues to expand.
Putting the L in front: from Open Data to Linked Open DataMartin Kaltenböck
Keynote presentation of Martin Kaltenböck (LOD2 project, Semantic Web Company) at the Government Linked Data Workshop in the course of the OGD Camp 2011 in Warsaw, Poland: Putting the L in front: from Open Data to Linked Open Data
Use of Open Data in Hong Kong (LegCo 2014)Sammy Fung
Presentation on use of open data in HK given to Legislative Council Secretariat. Content is mixed from my presentations at startmeup 2013 and opendatahk meetup.
Lorena Pocatilu - strategies for smart city knowledge platform and open datatu1204
The document discusses strategies for implementing smart city knowledge platforms and open data. It describes how knowledge platforms can provide access to new information, open data, connect users, and enable collaboration and innovation. As more people live in cities, knowledge platforms and open data can help manage information more efficiently to improve quality of life. Successful implementation requires addressing barriers like cultures opposed to openness and data quality problems. Open data offers opportunities to analyze and visualize data from different sources which is important for addressing societal challenges in smart cities. Several initiatives for open data are also described.
Invited talk "Open Data as a driver of Society 5.0: how you and your scientif...Anastasija Nikiforova
This presentation is prepared as a part of my talk on the openness (open data and open science) in the context of Society 5.0 during the International Conference and Expo on Nanotechnology and Nanomaterials. It was very pleasant to receive an invitation to deliver the talk on my recently published article Smarter Open Government Data for Society 5.0: Are Your Open Data Smart Enough? (Sensors 2021, 21(15), 5204), which I have entitled as “Open Data as a driver of Society 5.0: how you and your scientific outputs can contribute to the development of the Super Smart Society and transformation into Smart Living?“. The paper has been briefly discussed in my previous post, thus, just a few words on this talk and overall experience.
This document summarizes open data definitions and licensing models. It discusses what open data is, including that it is a model to extract value from public sector information by using data to build new tools and services. Open data refers to data that can be freely used, modified, and shared by anyone for any purpose. The document outlines several open data definitions and principles, including from the Open Knowledge Foundation and their Open Definition. It also discusses open data licensing models and provides examples of open government data programs from countries like the US, UK, and Australia.
The document discusses open public data and its benefits, including improving accountability, enabling economic growth, and giving users more control. It outlines principles for open data like being freely reusable and machine-readable. Examples are given of open data applications, and considerations around ensuring data is open, readable, granular, timely, and easy to find are discussed. Limitations including privacy, affordability, and consistency are also covered.
Dati, statistiche e fake news. Che fare nell'istruzione secondaria?Giovanni Barbieri
Giovanni A. Barbieri
Intervento al seminario Finanza, economia, impresa: insegnare nel contesto europeo
Università degli studi di Milano-Bicocca
14 ottobre 2019
Le parole per capire i numeri, i numeri per capire il mondoGiovanni Barbieri
Le parole per capire i numeri, i numeri per capire il mondo
Giovanni A. Barbieri
StatCities Taranto 2019
Tanto quanto. Per i livelli essenziali di statistica nel territorio
Taranto, 4-5 luglio 2019
This document discusses challenges to statistics in the era of "fake news" and "post-truth." It examines the relationship between statistics and representing reality, noting the difference between representing and measuring. While relativism would undermine statistics, model building provides a way for statistics to approximate reality through useful models rather than claiming absolute truth. The document concludes by discussing ways statistical organizations can communicate their scientific principles, including through numeracy, access to experts, trustworthiness, and fostering critical appraisal.
Giovanni A. Barbieri. Istat's Annual Report: Experimental Classification and ...Giovanni Barbieri
Istat experiments with new statistical sources and methods to meet evolving user needs and remain timely. This includes non-standard classifications that aggregate official classifications differently, and new classifications based on Istat's research. Examples discussed are groups of labor market areas, generations defined by periods of Italian history, and social groups identified through a multidimensional analysis of economic, cultural and social factors. The report also describes experimental work on social networks and their informal, family, elective, and volunteer components.
Giovanni A. Barbieri – La digitalizzazione, strategia per addomesticare l'inf...Giovanni Barbieri
Digitisation, a strategy for domesticating information
Seminar
Le infrastrutture della conoscenza nel mondo digitale / Knowledge infrastructures in the digital world
Istat, Rome, May 5, 2018
Giovanni A. Barbieri – Il fatto è la cosa più ostinata del mondoGiovanni Barbieri
Giardini Naxos, 21 luglio 2017
Costruiamo il futuro: Summer school 2017
Realismo vs. apparenza
«Nella cultura dominante l’apparenza prende il posto della realtà»
LMAs as Urban Areas: an Alternative to Italy's città metropolitaneGiovanni Barbieri
This document proposes using labor market areas (LMAs) as an alternative to define metropolitan areas in Italy, instead of using administrative provinces. It discusses some of the issues with how metropolitan areas are currently defined based on provinces. It then outlines three strategies for defining metropolitan areas based on 1) features of the urban fabric using census data, 2) public transportation networks, and 3) daily mobility flows. It provides two examples of how LMAs could better define the metropolitan areas of Cagliari and Turin compared to administrative boundaries. The document concludes by discussing how geography, like history, is constantly evolving.
Verso un sistema integrato di indicatori per le politiche localiGiovanni Barbieri
Giovanni A. Barbieri, Istat
Intervento al convegno:
Matera, 9 giugno 2017
I Comuni verso l’uso statistico degli archivi amministrativi
e dei sistemi di integrazione delle fonti
Dati e indicatori per le politiche del territorio
La domanda d’informazione statistica integrata: nuovi usi e nuovi utentiGiovanni Barbieri
La modernizzazione dei processi di produzione incrementa notevolmente l'offerta di informazione statistica integrata. Ma a quale domanda si rivolge questa offerta? A quali utenti? A quali usi? E a quali principi deve attenersi la statistica pubblica?
Scuola e mercato del lavoro: una prospettiva storica, di Giovanni A. Barbieri...Giovanni Barbieri
Scuola e mercato del lavoro: una prospettiva storica
Intervento al convegno scientifico "La scuola in Italia: i protagonisti, i fatti, i dati". Salerno, 18 ottobre 2016
Intervento a StatCities 2016.
Tre spunti a partire dal lavoro svolto dal Servizio Statistica e toponomastica del Comune di Firenze:
Diffondere la cultura dell’analisi e dell’integrazione
Le nuove geografie e le nuove prospettive
Di quali informazioni hanno bisogno i luoghi?
Disparità territoriali e politiche di coesione: il ruolo dell'Istat 1926-2016Giovanni Barbieri
L’attenzione dell’Istat alle disparità territoriali è indissolubilmente legato alle vicende del Paese: dal centralismo dell'epoca fascista alle diverse concezioni che i partiti di ispirazione liberale, cattolica, socialista e comunista hanno della natura dei divari e dunque delle politiche per la loro riduzione. La definizione geografica del Mezzogiorno, diversa da quella amministrativa, rende arduo il compito delle statistiche. Dalla seconda metà degli anni Ottanta, con l’Atto unico europeo e la riforma dei fondi strutturali, l’Istat è chiamato a dare sostegno di documentazione alle politiche di coesione e si apre un nuovo straordinario periodo di sviluppo delle statistiche territoriali, tuttora in crescita.
La stima del valore aggiunto a livello territoriale: nuovi sviluppi (G. A. Ba...Giovanni Barbieri
Il lavoro si basa sui risultati degli studi e sperimentazioni in corso in Istat per ricercare una soluzione compiuta alla stima del valore aggiunto su base territoriale fine.
Relazioni tra imprese in Italia: un modello interpretativo (A. Righi, E. Pavo...Giovanni Barbieri
Il lavoro affronta il tema della diffusione delle collaborazioni delle imprese tra loro e con altre importanti istituzioni/attori, illustrando le tipologie esistenti (verticale, orizzontale, o altro) e la geografia del fenomeno.
STATATHON: Unleashing the Power of Statistics in a 48-Hour Knowledge Extravag...sameer shah
"Join us for STATATHON, a dynamic 2-day event dedicated to exploring statistical knowledge and its real-world applications. From theory to practice, participants engage in intensive learning sessions, workshops, and challenges, fostering a deeper understanding of statistical methodologies and their significance in various fields."
Codeless Generative AI Pipelines
(GenAI with Milvus)
https://ml.dssconf.pl/user.html#!/lecture/DSSML24-041a/rate
Discover the potential of real-time streaming in the context of GenAI as we delve into the intricacies of Apache NiFi and its capabilities. Learn how this tool can significantly simplify the data engineering workflow for GenAI applications, allowing you to focus on the creative aspects rather than the technical complexities. I will guide you through practical examples and use cases, showing the impact of automation on prompt building. From data ingestion to transformation and delivery, witness how Apache NiFi streamlines the entire pipeline, ensuring a smooth and hassle-free experience.
Timothy Spann
https://www.youtube.com/@FLaNK-Stack
https://medium.com/@tspann
https://www.datainmotion.dev/
milvus, unstructured data, vector database, zilliz, cloud, vectors, python, deep learning, generative ai, genai, nifi, kafka, flink, streaming, iot, edge
State of Artificial intelligence Report 2023kuntobimo2016
Artificial intelligence (AI) is a multidisciplinary field of science and engineering whose goal is to create intelligent machines.
We believe that AI will be a force multiplier on technological progress in our increasingly digital, data-driven world. This is because everything around us today, ranging from culture to consumer products, is a product of intelligence.
The State of AI Report is now in its sixth year. Consider this report as a compilation of the most interesting things we’ve seen with a goal of triggering an informed conversation about the state of AI and its implication for the future.
We consider the following key dimensions in our report:
Research: Technology breakthroughs and their capabilities.
Industry: Areas of commercial application for AI and its business impact.
Politics: Regulation of AI, its economic implications and the evolving geopolitics of AI.
Safety: Identifying and mitigating catastrophic risks that highly-capable future AI systems could pose to us.
Predictions: What we believe will happen in the next 12 months and a 2022 performance review to keep us honest.
Analysis insight about a Flyball dog competition team's performanceroli9797
Insight of my analysis about a Flyball dog competition team's last year performance. Find more: https://github.com/rolandnagy-ds/flyball_race_analysis/tree/main
Global Situational Awareness of A.I. and where its headedvikram sood
You can see the future first in San Francisco.
Over the past year, the talk of the town has shifted from $10 billion compute clusters to $100 billion clusters to trillion-dollar clusters. Every six months another zero is added to the boardroom plans. Behind the scenes, there’s a fierce scramble to secure every power contract still available for the rest of the decade, every voltage transformer that can possibly be procured. American big business is gearing up to pour trillions of dollars into a long-unseen mobilization of American industrial might. By the end of the decade, American electricity production will have grown tens of percent; from the shale fields of Pennsylvania to the solar farms of Nevada, hundreds of millions of GPUs will hum.
The AGI race has begun. We are building machines that can think and reason. By 2025/26, these machines will outpace college graduates. By the end of the decade, they will be smarter than you or I; we will have superintelligence, in the true sense of the word. Along the way, national security forces not seen in half a century will be un-leashed, and before long, The Project will be on. If we’re lucky, we’ll be in an all-out race with the CCP; if we’re unlucky, an all-out war.
Everyone is now talking about AI, but few have the faintest glimmer of what is about to hit them. Nvidia analysts still think 2024 might be close to the peak. Mainstream pundits are stuck on the wilful blindness of “it’s just predicting the next word”. They see only hype and business-as-usual; at most they entertain another internet-scale technological change.
Before long, the world will wake up. But right now, there are perhaps a few hundred people, most of them in San Francisco and the AI labs, that have situational awareness. Through whatever peculiar forces of fate, I have found myself amongst them. A few years ago, these people were derided as crazy—but they trusted the trendlines, which allowed them to correctly predict the AI advances of the past few years. Whether these people are also right about the next few years remains to be seen. But these are very smart people—the smartest people I have ever met—and they are the ones building this technology. Perhaps they will be an odd footnote in history, or perhaps they will go down in history like Szilard and Oppenheimer and Teller. If they are seeing the future even close to correctly, we are in for a wild ride.
Let me tell you what we see.
Learn SQL from basic queries to Advance queriesmanishkhaire30
Dive into the world of data analysis with our comprehensive guide on mastering SQL! This presentation offers a practical approach to learning SQL, focusing on real-world applications and hands-on practice. Whether you're a beginner or looking to sharpen your skills, this guide provides the tools you need to extract, analyze, and interpret data effectively.
Key Highlights:
Foundations of SQL: Understand the basics of SQL, including data retrieval, filtering, and aggregation.
Advanced Queries: Learn to craft complex queries to uncover deep insights from your data.
Data Trends and Patterns: Discover how to identify and interpret trends and patterns in your datasets.
Practical Examples: Follow step-by-step examples to apply SQL techniques in real-world scenarios.
Actionable Insights: Gain the skills to derive actionable insights that drive informed decision-making.
Join us on this journey to enhance your data analysis capabilities and unlock the full potential of SQL. Perfect for data enthusiasts, analysts, and anyone eager to harness the power of data!
#DataAnalysis #SQL #LearningSQL #DataInsights #DataScience #Analytics
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Data and AI
Discussion on Vector Databases, Unstructured Data and AI
https://www.meetup.com/unstructured-data-meetup-new-york/
This meetup is for people working in unstructured data. Speakers will come present about related topics such as vector databases, LLMs, and managing data at scale. The intended audience of this group includes roles like machine learning engineers, data scientists, data engineers, software engineers, and PMs.This meetup was formerly Milvus Meetup, and is sponsored by Zilliz maintainers of Milvus.
1. Dealing with Open Data at ISTAT
Giovanni A. Barbieri
Statistics Italy (Istat)
2. The Open Data Movement
Our proposal is simple: […] the federal
government […] is to provide data that is easy
for others to reuse, rather than to help citizens
use the data in one particular way or another
Open infrastructures that enable citizens to
make their own uses of the data
Reverse the current policy, which is to regard
government websites themselves as the primary
vehicle for the distribution of public data, and
open infrastructures for sharing the data as a
laudable but secondary objective [Robinson,Yu,
Zeller and Felten 2008]
2
3. Crowdsourcing Government Transparency
Government information that is nominally
publicly available is in fact difficult to
access either because it is not online or, if
it is online, because it is not available in
useful and flexible formats [Brito 2008]
“Structured data”
Associated structured XML file would allow a
user to sort the data by ascending or
descending date, alphabetically by headline
or author, by number of words, and in many
other ways
3
4. Open Data Ecosystem
“Open data are adding a new dimension to big data analytics and
giving rise to novel, data-driven innovations.”
(McKinsey Global Institute Report, Oct. 2013)
From Citizens IBM BLOG
Wide Range of Open Data
Consumers
Citizens that would like to learn
characteristics of the places they are, e.g.
with mobile apps showing location-
specific features
Journalists that need to access data
for updated and aware information
communication
Educators that are helped in their
teaching task by access to data on
different application domains
4
5. Official Statistics meets Open Data
Official Statistics can more easily reach such wide
range of users if conveyed through open data
Recent technologies advances in the open data
community enable new advanced dissemination
channels for Official Statistics
5
Reinforcing
trust
Getting closer
to users
Reaching new
users
Giving
information
back
Improving
metadata
6. Linked Open Data
Semantic Web
Technological
Standards
OWL
Knowledge
Representation
Linked Open Data - LOD
6
7. Why is Linked Data an Opportunity?
Linked Data as a semantically rich
paradigm for data representation
Rich enough for the strict
requirements of Official Statistics
Formal and well-defined data
structures, i.e. ontologies
Linked Data as an international standard (W3C)
Tools availability and independence
Beyond statistical users
RDF: Resource Description Framework (W3C)
(subject-predicate-object)
7
8. Istat’s Linked Open Data Portal - 1
Istat LOD Portal: http://datiopen.istat.it
English Version: http://datiopen.istat.it/index.php?language=eng
8
9. Platform for
• Selecting
• Navigating
• Searching
• Querying
• Visualizing
Open Data
The platform allows
• Direct access to data via Web Services
• M2M solutions (e.g. GIS-LOD)
• Data conversion
• Export to productivity tools
• Visualization by means of external tools
Istat’s Linked Open Data Portal - 2
9
10. STEP 1 STEP 3STEP 2
Istat’s Linked Open Data Portal - 3
Steps to a «perfect» data portal
10
11. Istat’s Linked Open Data Portal - 4
Guided Access
Freedom of access
Typeofinteraction
Free Access
Human
basic
MachineTo
Machine
Navigation
Guided queries
Query REST on
SPARQL EndPoint
Query via SPARQL
EndPoint
Web Service
Download
Human
Advanced
11
12. Predefined Queries
(Set of simple and
customizable queries)
Free Queries
(SPARQL Queries)
Navigation
Guided Queries
Download
Typeofinteraction
Human
Guided Access Free Access
Interaction Modes
Freedom of access
Basic Advanced
(Human) User
technical skills Intermediate
Free Query via
SPARQL EndPoint
12
13. Use Case 1: Spatial Querying
App that displays on a map some population indicators of the
nearest census sections to specific GPS coordinates
LOD when accompanied by spatial information allow to access data
using spatial queries
13
14. Use Case 2: Federated Querying - 1
Federated query on Istat and ISPRA, i.e. the query accesses Istat and ISPRA portals
With LOD, it is very easy to compare data coming from different sources
(linked for example at territorial level)
Query on one
Portal
Results dynamically
retrieved
from both portals
14
ISPRA - The Italian National
Institute for Environmental
Protection and Research
Istat
15. Use Case 2: Federated Querying - 2
ISTAT data:
Census Buildings
ISPRA data:
Data on land use / soil consumption
Example Query:
Municipality-level analysis of land use / soil consumption and number of
buildings by period of construction
Dynamically generated!
15
16. Use Case 3: Istat as Open Data Provider in SPOD
Social discussion (on the left)
about a graphic representation
of Census Data (on the right)
Dynamically
generated!
16
SPOD: Social Platform for Open Data
17. Conclusions
A dissemination strategy based on open data does
put the Official Statistics users at the centre:
Reaching them through different channels
e.g. apps and social media
Making easier for them to retrieve data
e.g. federated queries that make transparent
the distribution of data on different portals
Providing richer services to them
e.g. spatial querying and dynamical
visualizations
17
Open data and in particular Linked Open
Data have a leading role in data innovation
for Official Statistics
18. • Macroscale vs microscale modeling
– Pseudo-Einstein (as simple as possible but not simpler)
– Von Neumann (agent-based modeling)
• Technological constraint enabling technology
• It widens the space of what is feasible:
– In production: our experience with SBS.Frame
– In analysis and research…
• A paradigm shift?
– Statistical mechanics vs agent-based modeling
– Just because you can doesn’t mean you should
• Back to open data
– From dissemination to release (“data liberation” at StatCan) to the
development of information
– The regulators need to introduce new rules in line with the new
scenarios (Don't think of an elephant!: know your values and frame the
debate)
One More Thing
18
19. Thanks to all my colleagues in Istat contributing to
the LOD Portal
Special thanks to Monica Scannapieco and
Stefano De Francisci
Questions and clarifications: contact me at
barbieri@istat.it
Acknowledgments and Thanks
18
Editor's Notes
Nomi
Tim Berners-Lee inventore del WWW (ipertesto) nel 1989
Our proposal is simple: The new administration should specify that the federal government’s primary objective as an online publisher is to provide data that is easy for others to reuse, rather than to help citizens use the data in one particular way or another [Robinson,Yu, Zeller and Felten 2008]
We argue that when providing data on the Internet, the federal government’s core objective should be to build open infrastructures that enable citizens to make their own uses of the data. If, having achieved that objective, government takes the further step of developing finished sites that rely on the data, so much the better. Our proposal would reverse the current policy, which is to regard government websites themselves as the primary vehicle for the distribution of public data, and open infrastructures for sharing the data as a laudable but secondary objective
Tim Berners-Lee inventore del WWW (ipertesto) nel 1989
Our proposal is simple: The new administration should specify that the federal government’s primary objective as an online publisher is to provide data that is easy for others to reuse, rather than to help citizens use the data in one particular way or another [Robinson,Yu, Zeller and Felten 2008]
We argue that when providing data on the Internet, the federal government’s core objective should be to build open infrastructures that enable citizens to make their own uses of the data. If, having achieved that objective, government takes the further step of developing finished sites that rely on the data, so much the better. Our proposal would reverse the current policy, which is to regard government websites themselves as the primary vehicle for the distribution of public data, and open infrastructures for sharing the data as a laudable but secondary objective
Open data are a key enabler of data-driven innovation. The McKinsey report "Open data: Unlocking innovation and performance with liquid information“ remarks the significant economic value can be generate from open data benefits, “including increased efficiency, new products and services, and a consumer surplus (cost savings, convenience, better products)”. The report also remarks how open data enhances big data’s impact “by creating transparency, exposing variability, and enabling experimentation; helping companies to segment populations and thus to customize actions directed at them; replacing or supporting human decision making; and spurring innovative business models, products, and services”.
There is a wide range of open data consumers. Open data customers range from Citizens, who would like to access data in any time and any location through mobile apps, Journalists, who need timely and trustable data, Educators who can rely on data from different application domains for their teaching materials. From the providers perspective, being them public or private providers, there is the need to offer advanced services flexible enough to meet the requirements of such a wide range of users.
NSOs are producers of data, hence it is a major priority for them to reach data consumers and provide them information of excellent quality.
Open data are an instrument to reach data consumers more easily. There are new advanced technologies and standards available in the open data community that permit to build new advanced dissemination channels for official statistics, as we will see in the rest of the presentation.
When Official Statistics meets Open Data, there are several benefits: (i) Reinforce trust in NSO role, by a data provision that is aligned with state-of-the art standards and technologies and which is «certified» by the quality guarantees of Official Statistics. (ii) Getting closer to users, by an enhanced accessibility that, for instance, permits users to access data by mobile devices. (iii) Reaching new users, by developing services flexible enough to address the requirements of different classes of users (basic and advanced).
(iv) Give back information to users, in a «virtuous» loop where respondents of OS can get back information elaborated on the basis of data they provided (Virtuous loop with Official statistics respondents). (v) Information enrichment, by providing information products with a metadata equipment that permits proper interpretation and usage of such products.
Linked Open Data is a new paradigm for integrating and publishing data. It relies on Semantic Web technologies and standards on one side and on principles of Knowledge Representation on the other side.
Technological standards, mostly proposed by the World Wide Web Consortium (W3C), have the important role to make IT vendors developing tools on the «same» concepts, so ensuring organizations adopting such standards to be vendor-independent.
From the scientific and academic world, knowledge representation pushes the usage of rich metadata models (e.g. ontologies) that can be expressed by semantic Web languages and formats.
The combination of these two worlds (technology standards and knowledge representation) is the basis of the Linked Data paradigm, proposed by Tim-Berners Lee (the inventor of the World Wide Web) in 2006. Linked Open Data (LOD) are Linked Data whose content is open and are the most advanced type of open data.
Tim Berners-Lee inventore del WWW (ipertesto) nel 1989
Istat published its own LOD portal in May 2015.
The portal is the first dissemination system launched by Istat that fully complies with the Italian guidelines to enhance public information as well as the European ones on open data, mainly Revision of the Directive 2003/98/EC on the re-use of public sector information published on 26 june 2013 (Full compliance with Italian and European open data guidelines).
In its first year of publication, the portal had over 23,000 unique visitors, about 400,000 page views and nearly one million individual accesses.
The portal´s main goals are:
Providing a single access point to Istat’s open data as part of the Italian national data cloud, in connection with dati.gov.it, the Italian national open-data portal;
Providing machine-to-machine data access at the finest granularity level (i.e. each single value);
Flexible querying via LOD paradigm;
Advanced navigation mechanisms (faceted browsing, graph browsing, etc.). This means that the user “click-by-click” can discover data, instead of posing a known query: this is the case when the user wants to see what it is offered and does not have in mind a specific dataset to look for. He\She can navigate on graphs, clicking the nodes or by facets (i.e. by data features selected to organize the navigation task).
The Linked Open Data (LOD) web site allows to access and browse data in open format based on technology and Semantic Web standards. The LOD can be inquired directly from any application and responds to the need, expressed by the community of users, to have standardized and interoperable data.
Need to disseminate data from the Italian 15th Census of Population and Housing
Territorial level: Census Section (submunicipality)
402.903 Census Sections
Hard to manage the huge number of Census Sections through our dissemination warehouse I.stat
By means of progressive steps, the LOD portal allows to implement effective solutions for sharing data among Statistical Organizations and making them available to the users communities
This plan shows the degrees of accessibility (Access Freedom axis) vs the user interactions types (Interaction axis).
As represented, there are two main degrees of accessibility, namely: guided access and free access.
The guided access means that the user can rely on some predefined choices in accessing data, while the free one means that the user can pose free queries to the system or can freely navigate it.
The user interaction types distinguishes between machine to machine users (i.e. software applications that access the portal) and human users that can be basic or advanced.
In particular, looking at each quadrant:
Human Basic/Guided: they can perform data download of predefined datasets.
Human Basic/Free: they are able to discover information by using available navigation systems, i.e. systems that click-by-click permit to find out the needed information.
Human Advanced/Guided: they can perform queries but in a guided way, i.e. they do not write the queries from scratch, but such queries are automatically generated through visual systems.
Human Advanced/Free: they can write queries in a free way, without any specific limitation.
MachineTo Machine/Guided: Data available on the portal can be accessed via pre-defined Web services. There are currently two available web services: one for accessing Census Indicators and another one for Labour Market areas.
MachineToMachine/Free: Data available on the portal can be accessed by a software application in a free way. This means that another software application can pose a free query and retrieve data from the portal. Our GIS system, for instance, is able to retrieve data from our LOD portal exactly in this way (internal systems interoperability).
Starting from this slide, we will see three possible use cases concerning the usage of the LOD portal. The three use cases are: spatial querying, federated querying and LOD portal as a data provider for a social platform.
This first use case shows how it is possible to perform spatial queries of LOD data, and, in addition, how to interact with it through a mobile device.
In particular, given some GPS coordinates, the user can retrieve population indicators related to the nearest census sections (nearest to the closest point of the polygon corresponding to the census section) and visualize them on a mobile phone. Thinking of a user query interested to discover how many children are in a certain area, in the figure, it is shown the number of residents with age lower than 5 years for the 6 census sections closer to the centre of the Italian municipality Anguillara Sabazia.
One of the major advantage of the LOD paradigm is the possibility of interconnecting different dissemination systems (data interoperability across different organizations). Indeed, it is possible to integrate data available at different sites so that data can be accessed as if they were available at a single site (location transparency).
The mechanism according to which this is possible is known as «federated queries», that is querying one portal and access to all the other federated portals.
We linked the data of Istat LOD Portal with those of the ISPRA (The Italian National Institute for Environmental Protection and Research) LOD portal. The linkage was performed at territorial level, and in particular at municipality level (through Istat’s municipality codes). Given such a linkage, a query can be posed to Istat portal asking simultaneously for Istat’s data and ISPRA’s data. The query is executed by accessing to both portals in a dynamic way and the result is returned to the user. The fact that also the ISPRA portal is queried is completely transparent for the final users.
Istat is part of National and European statistical systems that would greatly benefit from this concrete data interoperability, namely the National Statistical System (SISTAN) and the European Statistical System (ESS). By adopting this paradigm, it would be much easier to produce «integrated dissemination products» as outputs directly attributable to such systems.
In the next slide a specific example of federated querying is shown.
In this slide, we show by example how to exploit the link beetween Istat LOD portal and ISPRA LOD portal previously mentioned.
Let us suppose that a user needs to retrieve and graphical display a specific combination of Istat’s data and ISPRA’s data, namely municipality-level analysis of soil consumption and buildings (percentage) by period of construction.
He/she can perform three steps:
1. A query can be posed to Istat’s portal for retrieveing data.
2. By exploiting the link beetween the Istat LOD portal and the ISPRA one, data results are retrieved at Istat’s site.
3. Retrieved data are visually showed by an Istat’s application that dynamically takes as input such results and diplays a bar graph based on both Istat’s and ISPRA’s data.
In this approach that federates portals by different organizations, it is very much important to keep information about the provenance of data. This is well supported by the LOD paradigm, and indeed data published on Istat’s LOD portal do have «attached» this kind of metadata that certify the provenance in terms of the agent that published data (Who), the published resource (What) and the time information related to it (When).
The SPOD platform makes a query and retrieve Istat’s data that are shown as a graphic on the right. The graphic reports Italian Resident popoulation by age for the Rome municipality.
On the left there is a discussion carried out by different users on such a dynamically retrieved graphic.
The SPOD platform is indeed an example of a third party subject that accesses Istat’s data and provides advanced services to the final users.
This is a very much promising business model for provision of services based on open data, especially by public institutions like Istat: the open data provider does not have necessarily to spend resources on building and maintaining services based on open data, but, instead, it could be up to third parties to have this specific role.
User-centered dissemination is strongly enabled by open data.
The use cases have shown:
Users can be reached through different channels that are close to where they are (mobile apps) and what they are used to do (social media).
Users can retrieve data in an easier way, due to federated queries that make transparent the distribution of data across federated portals.
Users can benefit from richer services like spatial querying and dynamical visualizations.
Though NSOs are adopting open data strategies since several years, Linked Open Data can give new impetus to such strategies.
That’s why we assign to open data and in particular to Linked Open Data a strategic role in the innovation landscape for Official Statistics.