All presentations given during the ePSI platform webinar that was held on 8 February 2016.
The agenda of the webinar:
1) Opening by the European Commission.
2) Introduction to the EDP project
3) Demo of the Portal
4) Technical architecture
5) Focus on CKAN extensions developed
6) Focus on maps application
7) Next steps
8) Discussion and tips & tricks for open data implementation
Presentation describing the purpose of the European Data Portal project. The launch of the European Data Portal is one of the key steps the European Commission is taking to support access to public data.
Introduction: The Big Data Europe Project, presented at the CMG-AE event "Big Data: Strategien, Technologien und Nutzen"
19th of May 2015, Expat Center der Wirtschaftsagentur, Vienna, Austria
See: http://www.big-data-europe.eu
Project description of the Linked Open Data (LOD) PILOT Austria, presented at the PiLOD event at VU Amsterdam (Netherlands) on 29.01.2014 (see: http://www.pilod.nl/) by Martin Kaltenböck of the Semantic Web Company.
The EUnetHTA perspective on the HTA database (Patrice Chalon)
Presentation at HTAi Annual meeting 2017, panel session "Rescuing the HTA database – future options and challenges"
A central, international database for HTA reports and other HTA products is considered to be a vital source of information for healthcare researchers and stakeholders. The current HTA database, which contains over 15,000 documents submitted by numerous HTA agencies, was originally established by the International Network of Agencies for Health Technology Assessment (INAHTA) in 2007 and is available on the website of the UK Centre for Reviews and Dissemination (CRD). The database has so far been funded by the UK National Institute for Health Research (NIHR). Its existence is however endangered as future funding is unclear. If no alternative is found, it will no longer be maintained and only an archived version will be available. There would thus no longer be a single access point to HTA reports.
Previous research has indicated that more than 75% of HTA agencies use the HTA database and more than half adapt common HTA products from reports produced by other agencies.1 The lack of an HTA database would have a direct impact on these activities. Smaller HTA agencies would be particularly affected, as they often have insufficient resources to produce their own reports and rely on reports from larger agencies. The wider consequences should also be considered: for instance, the decreasing visibility of HTA reports would diminish their relevance. It may also become more difficult to establish collaborations between HTA agencies. The problem would thus affect the whole HTA community.
Often, however, a crisis also offers opportunities. The establishment of a new HTA database should include a re-evaluation of its structure and technical functions (e.g. inclusion of ongoing projects).
Structure of the session: Short presentations will be held to provide an overview of the different perspectives of the various HTA agencies and networks currently involved in the discussions on the future of the HTA database. The panel will focus on the initiatives to rescue the database and present the options for funding, hosting, structure and technical functions. There will also be a guided discussion on the possible solutions presented and the challenges faced.
Panel/Workshop outcome and objectives: At the end of the session, participants should be aware of the overall importance of the HTA database, the current status quo, and the potential features of a future HTA database.
EDF2013: Selected Talk, Ghislain Atemezing: Towards Interoperable Visualization Applications Over Linked Data (European Data Forum)
Selected talk of Ghislain Atemezing at the European Data Forum 2013, 10 April 2013 in Dublin, Ireland: Towards Interoperable Visualization Applications Over Linked Data
EDF2014: Piek Vossen, Professor Computational Lexicology, VU University Amsterdam (European Data Forum)
Invited Talk of Piek Vossen, Professor Computational Lexicology, VU University Amsterdam, Netherlands at the European Data Forum 2014, 19 March 2014 in Athens, Greece: NewsReader: recording history by processing massive streams of daily news
EDF2014: BIG - NESSI Networking Session: Nuria de Lama, Representative to the European Commission (European Data Forum)
BIG - NESSI Networking Session, Talk by Nuria de Lama, Representative to the European Commission, Research & Innovation ATOS, Spain at the European Data Forum 2014, 20 March 2014 in Athens, Greece: Towards a Big Data Public Private Partnership
Big Data Europe SC6 WS #3: Big Data Europe Platform: Apps, challenges, goals (BigData_Europe)
Talk at the third Big Data Europe SC6 workshop, held on 11.9.2017 in Amsterdam, co-located with the SEMANTiCS 2017 conference: The Big Data Europe Platform: Apps, challenges, goals, by Aad Versteden, TenForce.
EDF2014: Franck Cotton & Kamel Gadouche, France: TeraLab - A Secure Big Data Platform (European Data Forum)
Selected Talk of Franck Cotton, Technology Advisor, Institut National de la Statistique et des Etudes Economiques, France & Kamel Gadouche, Director, Centre d'Accès Sécurisé aux Données / Groupe des Ecoles Nationales d'Economie et Statistique, France at the European Data Forum 2014, 19 March 2014 in Athens, Greece: TeraLab - A Secure Big Data Platform, Description And Use Cases
The Census Hub Project can currently be considered the most advanced project where Internet technologies and SDMX solutions for data transmission come together for an ambitious goal: the dissemination of the Census 2011 results.
We analyse the Census Hub architecture, in which a central Hub on the Eurostat side manages the user interface, transforming all selections made by the user on the screen into an SDMX query. This query is sent to the web service on the NSI side, which parses the query and transforms it into an SQL query that can be run against a database containing census data. Depending on how many countries are involved in the answer, the Hub queries the web service provided for each of those countries. Finally, the Hub receives all the answers from the NSIs and builds a final table, putting all the answers together. The importance of this implementation is that it is a completely new system that fundamentally changes the way official data are disseminated and exchanged among organisations.
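The fan-out and aggregation flow described above can be sketched as follows. This is a minimal, hypothetical illustration, not the real Census Hub code: all names (build_sdmx_query, query_nsi, census_hub) and the canned endpoints are invented for the example.

```python
# Hypothetical sketch of the hub's flow: turn a user selection into one
# query, send it to each country's NSI endpoint, merge the answers.

def build_sdmx_query(selection):
    """Render a user selection as a query string (stand-in for real SDMX)."""
    return "+".join(f"{dim}={val}" for dim, val in sorted(selection.items()))

def query_nsi(country, sdmx_query, endpoints):
    """Simulate one NSI web service call; a real service would translate
    the SDMX query into SQL against its census database here."""
    return [(country, row) for row in endpoints[country](sdmx_query)]

def census_hub(selection, countries, endpoints):
    q = build_sdmx_query(selection)
    table = []
    for c in countries:                 # one web-service call per country
        table.extend(query_nsi(c, q, endpoints))
    return table                        # merged final table

# toy endpoints returning canned rows instead of querying a database
endpoints = {
    "AT": lambda q: [{"population": 8_400_000}],
    "NL": lambda q: [{"population": 16_700_000}],
}
result = census_hub({"FREQ": "A", "YEAR": "2011"}, ["AT", "NL"], endpoints)
```

The key design point survives even in this toy form: the Hub holds no census data itself, only the logic to dispatch one query to many national services and assemble their answers.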
Gergely Sipos (EGI): Exploiting scientific data in the international context ...
Keynote presentation given at "The Emerging Technology Forum – Data Creates Universe - Scientific Data Innovation Conference" of the "Pujiang Innovation Forum 2021" event.
The EOSC Compute Platform with the EGI-ACE project EGI Federation
EGI-ACE’s main goal is to implement the compute platform of the European Open Science Cloud and contribute to the EOSC Data Commons by delivering integrated computing platforms, data spaces and tools as an integrated solution that is aligned with major European cloud federation projects and HPC initiatives.
This presentation introduces you to the architecture and composition of the EOSC Compute Platform, which delivers capabilities at the IaaS, PaaS and SaaS level.
The INSPIRE 2018 Hackathon is co-organised with Plan4all and other partners. The presentation is about the APIs4DGov study, its goal, and the motivation for participating in the INSPIRE Hackathon. In addition, some useful resources for the hackathon are presented.
Big Data Europe at eHealth Week 2017: Linking Big Data in Health (BigData_Europe)
Of the four V's of big data – Volume, Velocity, Variety and Veracity – the most challenging for the health sector is Variety. Health data comes from many sources, formats and standards – how can we bring these together to reap the benefits of big data technologies?
Big Data Europe is tackling this challenge head-on, building a big data infrastructure flexible enough to tackle all seven Societal Challenges identified by Horizon 2020. Here we demonstrate our pilot implementation of Open PHACTS, which integrates life science data for drug discovery.
12 May 2017
Review of user story best practices, and a dozen tips to help manage your agile Drupal project more efficiently. Presentation focuses on use of JIRA and agile practices.
Course material for "An introduction to keeping financial records in business for small and medium sized enterprises", organised by the Lagos hub of the Global Shapers with support from Abraaj, for small and medium enterprises in Lagos, Nigeria, held in June 2016.
PoserTut101: How to rig a character in Poser Talk Designer for lipsync (Andreas Ziebart)
Speech is one of the most important aspects of many animations. Due to the complexity of the human face, it is also one of the most challenging things to animate realistically.
Normally, the characters you want to use are already pre-rigged and configured.
Most Daz3D products have been professionally designed: they include phoneme morphs and the associated Daz Mimic configuration file.
Likewise, most of Poser's out-of-the-box content is similarly designed, with phonemes and XML configuration files for Talk Designer.
I picked up a figure from Vanishing Point which I wanted to use in some animations. The problem: it was not rigged for Mimic or Poser.
The European Commission’s new machine translation system has been available since June 2013. It was built by the Directorate General for Translation (DGT) using Moses and the EU institutions’ translation memories, stored in the Euramis database. It is continuously improving through close collaboration with EC translators, and regular inclusion of their more recent translations. MT@EC will be the starting point for an "automated translation platform" to be funded by the Connecting Europe Facility in order to support multilingualism of other European digital service infrastructures.
This presentation is a part of the MosesCore project that encourages the development and usage of open source machine translation tools, notably the Moses statistical MT toolkit. MosesCore is supported by the European Commission Grant Number 288487 under the 7th Framework Programme.
For the latest updates go to http://www.statmt.org/mosescore/
or follow us on Twitter - #MosesCore
Discover the content and approach followed for the development of the European Data Portal. This portal, released in November 2015, aims at referencing open data made available in up to 39 different European countries.
Wide-spread adoption of MT software by end-users and language service providers is still hindered by the need to train custom MT systems, the lack of training data for some language pairs and domains as well as substantive hardware needs. The EU-funded MMT project addresses these barriers by providing cloud-ready software with a simple installation procedure, very fast setup times for MT systems, instant domain adaptation and integration of new data and high scalability. To address the data gap between the large web companies and the MT industry, MMT is also curating and collecting data from translation stakeholders and the web to improve MT quality for everybody. In this session we present the latest developments in this ongoing project and demo the service.
Join us for an exciting session where you can explore the future of open-source technology and innovation projects based on FIWARE and supported by FIWARE team. Our experts will share their insights on how FIWARE standards and components are revolutionizing the tech industry. Attendees will have the opportunity to hear from a variety of presenters showcasing ongoing projects that leverage FIWARE. From dynamic digital twins to satellite open data for smart cities, to boosting EU high-value datasets, this session will provide a glimpse into the exciting possibilities of FIWARE technology. Don't miss out on this chance to learn about the latest innovation and collaborate with like-minded professionals!
This presentation was given within the event "Trasparenza e dato pubblico: due asset strategici al servizio dell'interesse generale", organised by Polis Lombardia.
My intervention followed the first part of the "I dati pubblici in Lombardia e in Europa: una fonte rinnovabile di energia informativa" section. The first part was presented by Marco Panebianco of ARIA.
During my talk, I briefly introduced the context and the latest results of the Joint Research Centre (JRC), European Commission, studies on Application Programming Interface (API) strategies and implementation in the public sector.
For more information about the API studies the JRC is conducting please consult: https://joinup.ec.europa.eu/collection/api4dt
Demonstration of the functionality and capabilities of the Virtual Hubs developed for the ENERGIC-OD project. Presentation given at the GEO Business 2017 trade show in London, UK (May 2017).
LandCity Revolution - The evolution of the ground segment to support the era ... (giovanni biallo)
The spread of satellite imagery that can be used freely and at no cost, such as the data from the Copernicus programme, opens new horizons in terms of products and services.
Putting the L in front: from Open Data to Linked Open Data (Martin Kaltenböck)
Keynote presentation of Martin Kaltenböck (LOD2 project, Semantic Web Company) at the Government Linked Data Workshop in the course of the OGD Camp 2011 in Warsaw, Poland: Putting the L in front: from Open Data to Linked Open Data
Europeana Newspapers Workshop: A Gateway to European Newspapers Online. Research Information Infrastructures and the Future Role of Libraries.
LIBER 2013 Annual Conference, Bavarian State Library, 26-29 June 2013, Munich, Germany.
Levelwise PageRank with Loop-Based Dead End Handling Strategy: SHORT REPORT (Subhajit Sahu)
Abstract — Levelwise PageRank is an alternative method of PageRank computation which decomposes the input graph into a directed acyclic block-graph of strongly connected components, and processes them in topological order, one level at a time. This enables calculation of ranks in a distributed fashion without per-iteration communication, unlike the standard method where all vertices are processed in each iteration. It comes, however, with the precondition that the input graph contains no dead ends. Here, the native non-distributed performance of Levelwise PageRank was compared against Monolithic PageRank on a CPU as well as a GPU. To ensure a fair comparison, Monolithic PageRank was also performed on a graph where vertices were split by components. Results indicate that Levelwise PageRank is about as fast as Monolithic PageRank on the CPU, but quite a bit slower on the GPU. The slowdown on the GPU is likely caused by a large submission of small workloads, and is expected to be a non-issue when the computation is performed on massive graphs.
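The levelwise idea can be sketched in a few lines of Python. This is an illustrative, non-distributed toy, not the author's implementation: it assumes the strongly connected components are already given in topological order and that the graph has no dead ends (the stated precondition), then iterates each component to convergence while earlier components stay fixed.

```python
# Sketch of levelwise PageRank: components are assumed precomputed and
# topologically ordered; each SCC is iterated locally, so no global
# per-iteration pass over all vertices is needed.

def levelwise_pagerank(out_edges, components, d=0.85, tol=1e-10):
    n = sum(len(c) for c in components)
    rank = {v: 1.0 / n for c in components for v in c}
    in_edges = {v: [] for v in rank}          # reverse adjacency, built once
    for u, vs in out_edges.items():
        for v in vs:
            in_edges[v].append(u)
    outdeg = {u: len(vs) for u, vs in out_edges.items()}
    for comp in components:                    # one level (SCC) at a time
        while True:                            # iterate only this component
            delta = 0.0
            for v in comp:
                new = (1 - d) / n + d * sum(rank[u] / outdeg[u]
                                            for u in in_edges[v])
                delta = max(delta, abs(new - rank[v]))
                rank[v] = new                  # in-place (Gauss-Seidel) update
            if delta < tol:
                break
    return rank

# toy dead-end-free graph: SCCs {0} and {1, 2} in topological order
ranks = levelwise_pagerank({0: [1], 1: [2], 2: [1]}, [[0], [1, 2]])
```

Because the block-graph of components is acyclic, ranks flowing into a component are final by the time it is processed, which is exactly what removes the need for per-iteration communication in a distributed setting.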
Opendatabay - Open Data Marketplace.pptx (Opendatabay)
Opendatabay.com unlocks the power of data for everyone. Open Data Marketplace fosters a collaborative hub for data enthusiasts to explore, share, and contribute to a vast collection of datasets.
First ever open hub for data enthusiasts to collaborate and innovate. A platform to explore, share, and contribute to a vast collection of datasets. Through robust quality control and innovative technologies like blockchain verification, opendatabay ensures the authenticity and reliability of datasets, empowering users to make data-driven decisions with confidence. Leverage cutting-edge AI technologies to enhance the data exploration, analysis, and discovery experience.
From intelligent search and recommendations to automated data productisation and quotation, Opendatabay's AI-driven features streamline the data workflow. Finding the data you need shouldn't be complex. Opendatabay simplifies the data acquisition process with an intuitive interface and robust search tools. Effortlessly explore, discover, and access the data you need, allowing you to focus on extracting valuable insights. Opendatabay breaks new ground with dedicated, AI-generated synthetic datasets.
Leverage these privacy-preserving datasets for training and testing AI models without compromising sensitive information. Opendatabay prioritizes transparency by providing detailed metadata, provenance information, and usage guidelines for each dataset, ensuring users have a comprehensive understanding of the data they're working with. By leveraging a powerful combination of distributed ledger technology and rigorous third-party audits, Opendatabay ensures the authenticity and reliability of every dataset. Security is at the core of Opendatabay: the marketplace implements stringent security measures, including encryption, access controls, and regular vulnerability assessments, to safeguard your data and protect your privacy.
Adjusting primitives for graph: SHORT REPORT / NOTES (Subhajit Sahu)
Graph algorithms, like PageRank, commonly operate on Compressed Sparse Row (CSR), an adjacency-list based graph representation.
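For readers unfamiliar with CSR, here is a minimal sketch of the representation: two flat arrays, where vertex v's neighbours live in edges[offsets[v]:offsets[v + 1]]. The helper name to_csr is invented for the example.

```python
# Build a CSR graph (offsets + edges arrays) from an edge list.

def to_csr(num_vertices, edge_list):
    degree = [0] * num_vertices
    for u, _ in edge_list:
        degree[u] += 1
    offsets = [0]                         # prefix sums of out-degrees
    for d in degree:
        offsets.append(offsets[-1] + d)
    edges = [0] * len(edge_list)
    fill = offsets[:-1].copy()            # next free slot per vertex
    for u, v in edge_list:
        edges[fill[u]] = v
        fill[u] += 1
    return offsets, edges

offsets, edges = to_csr(4, [(0, 1), (0, 2), (1, 2), (3, 0)])
neighbours_of_0 = edges[offsets[0]:offsets[1]]   # [1, 2]
```

The appeal for PageRank-style workloads is that both arrays are contiguous, so scanning all neighbours of all vertices is cache- and GPU-friendly.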
Multiply with different modes (map)
1. Performance of sequential execution based vs OpenMP based vector multiply.
2. Comparing various launch configs for CUDA based vector multiply.
Sum with different storage types (reduce)
1. Performance of vector element sum using float vs bfloat16 as the storage type.
Sum with different modes (reduce)
1. Performance of sequential execution based vs OpenMP based vector element sum.
2. Performance of memcpy vs in-place based CUDA based vector element sum.
3. Comparing various launch configs for CUDA based vector element sum (memcpy).
4. Comparing various launch configs for CUDA based vector element sum (in-place).
Sum with in-place strategies of CUDA mode (reduce)
1. Comparing various launch configs for CUDA based vector element sum (in-place).
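The float vs bfloat16 storage comparison in the experiments above can be illustrated without CUDA: bfloat16 keeps only the top 16 bits of a float32 bit pattern, and summing truncated values shows the accuracy cost of the smaller storage type. This is a hypothetical emulation (truncation rather than round-to-nearest), not the report's benchmark code.

```python
import struct

def to_bfloat16(x):
    """Emulate bfloat16 storage: round x to float32, keep the top 16 bits."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    return struct.unpack("<f", struct.pack("<I", bits & 0xFFFF0000))[0]

values = [1.0 + i / 7 for i in range(1000)]
exact = sum(values)
bf16_sum = sum(to_bfloat16(v) for v in values)   # values stored as bfloat16
rel_err = abs(bf16_sum - exact) / exact          # cost of 8-bit mantissa
```

bfloat16 trades mantissa bits for the full float32 exponent range, which is why it halves storage and bandwidth at the price of roughly 2-3 decimal digits of precision per element.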
Explore our comprehensive data analysis project presentation on predicting product ad campaign performance. Learn how data-driven insights can optimize your marketing strategies and enhance campaign effectiveness. Perfect for professionals and students looking to understand the power of data analysis in advertising. For more details, visit: https://bostoninstituteofanalytics.org/data-science-and-artificial-intelligence/
Empowering the Data Analytics Ecosystem: A Laser Focus on Value
The data analytics ecosystem thrives when every component functions at its peak, unlocking the true potential of data. Here's a laser focus on key areas for an empowered ecosystem:
1. Democratize Access, Not Data:
Granular Access Controls: Provide users with self-service tools tailored to their specific needs, preventing data overload and misuse.
Data Catalogs: Implement robust data catalogs for easy discovery and understanding of available data sources.
2. Foster Collaboration with Clear Roles:
Data Mesh Architecture: Break down data silos by creating a distributed data ownership model with clear ownership and responsibilities.
Collaborative Workspaces: Utilize interactive platforms where data scientists, analysts, and domain experts can work seamlessly together.
3. Leverage Advanced Analytics Strategically:
AI-powered Automation: Automate repetitive tasks like data cleaning and feature engineering, freeing up data talent for higher-level analysis.
Right-Tool Selection: Strategically choose the most effective advanced analytics techniques (e.g., AI, ML) based on specific business problems.
4. Prioritize Data Quality with Automation:
Automated Data Validation: Implement automated data quality checks to identify and rectify errors at the source, minimizing downstream issues.
Data Lineage Tracking: Track the flow of data throughout the ecosystem, ensuring transparency and facilitating root cause analysis for errors.
5. Cultivate a Data-Driven Mindset:
Metrics-Driven Performance Management: Align KPIs and performance metrics with data-driven insights to ensure actionable decision making.
Data Storytelling Workshops: Equip stakeholders with the skills to translate complex data findings into compelling narratives that drive action.
Benefits of a Precise Ecosystem:
Sharpened Focus: Precise access and clear roles ensure everyone works with the most relevant data, maximizing efficiency.
Actionable Insights: Strategic analytics and automated quality checks lead to more reliable and actionable data insights.
Continuous Improvement: Data-driven performance management fosters a culture of learning and continuous improvement.
Sustainable Growth: Empowered by data, organizations can make informed decisions to drive sustainable growth and innovation.
By focusing on these precise actions, organizations can create an empowered data analytics ecosystem that delivers real value by driving data-driven decisions and maximizing the return on their data investment.
2. Why does Open Data matter?
4 key macro economic indicators for the EU28+ by 2020
Market Size and value added as percentage of GDP
The market size is defined through the market volume (realised sales volume) and the market potential.
Number of Jobs Created
Number of additional jobs (in persons) created directly related to Open Data for the private sector.
Cost Savings for the public sector
Savings include time saved for public administrators by using Open Data, eliminated transaction costs for providing data, and more transparency.
Efficiency Gains or productivity gains
Qualitative assessment of indirect benefits for citizens. This can be in terms of time saved, lives saved, or productivity gains.
€ 75.7 billion
7,000 lives
16% energy
25,000 jobs
€ 1.7 billion
3. The Portal itself
+398K metadata sets
34 countries covered
67 catalogues
13 categories to structure metadata
A Goldbook for data holders
13 eLearning modules
A library with the material from Open Data
Support
Featured highlights
Latest Open Data news
Several feedback functions
Help us improve
Be harvested by us
Tell us your story
6. Multiple Learning experiences brought online
A Goldbook for data providers
Open Data in a Nutshell
How to build an Open Data Strategy
Technical preparation & implementation
Putting in place an Open Data lifecycle
Ensuring and monitoring success
Step 1: Introduction to Open Data
Step 2: Open Data Implementation
Step 3: Technical deep dive
Step 4: Where next with Open Data?
eLearning Modules Library
Introduction to Open Government Data
Introduction to Linked Data
More material put online soon!
Featured Highlights
Content promotion
Economic Analysis
Open Data landscaping
CEF Open Data Call
ODINE
7. The European Data Portal also aims at fostering the reuse of public data resources
Four complementary work streams are focused on reuse
Leveraging community engagement to make the most out of Open Data
Communicating and raising awareness about the Portal
Studying the economic impact of the reuse of public data resources
Preparing for the future and working on the sustainability
Fostering uptake on reuse of public data resources
Market Size
Jobs Created
Cost Savings
Efficiency Gains
Metrics to measure economic impact
Economic Benefits of Open Data
Community Engagement
Reports to drive understanding and reuse
Topic 1: Open Data and Digital Transformation
Topic 2: Open Data and eSkills
Future actions will consist in making the most of domain-specific activities: weaving our activities into existing communities rather than creating communities of our own
Topic 3: Open Data and Privacy
29. CKAN Extension
3 major and a few minor extensions
Support for Multilingual Metadata
Automatic machine translation service
Extending JSON schema
Covering all DCAT-AP properties
Synchronize with Triple Store
Using SPARQL
Theming and Minor Features
Bootstrap 3, RDF Export, Gazetteer, Map-App Preview, Visualization, License Assistant, …
30. CKAN Extension
Extra fields “translation” and “translation_meta”
Using machine translation service (MT@EC)
Solr schema
For each translated field an additional field in the Solr schema was included
These fields have language-specific settings, like appropriate stemming functions
Depending on the search and language settings of the browser, the Solr search query includes the respective fields, e.g. when searching for German and French only the German and French fields are used
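The language-dependent field selection described above can be sketched as follows. The field naming scheme (title_de, notes_fr, ...) and the helper name are assumptions for illustration, not the portal's actual Solr schema.

```python
# Pick the Solr fields to query based on the browser's language settings:
# one suffixed field per requested language that exists in the index.

def solr_query_fields(requested_langs, indexed_langs,
                      base_fields=("title", "notes")):
    langs = [l for l in requested_langs if l in indexed_langs]
    return [f"{f}_{l}" for f in base_fields for l in langs]

# browser asks for German and French; index holds en, fr, de
fields = solr_query_fields(["de", "fr"], {"en", "fr", "de"})
```

Restricting the query to the matching per-language fields is what lets each language keep its own stemming without polluting results across languages.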
Multilingual Support
"translation_meta": {
"default": "en",
"languages": ["en", "fr", "de"],
"en": {
"received": "2016-01-19T11:57:12.944031",
"tag": "en„
},
"fr": {
"issued": "2016-01-19T11:57:12.944031",
"tag": "fr-t-en-t0-mtec",
"received": "2016-01-20T01:53:17.249751„
},
"de": {
"issued": "2016-01-19T11:57:12.944031",
"tag": "de-t-en-t0-mtec",
"received": "2016-01-20T01:53:17.249751„
}
}
"translation": {
"fr": {
"notes": "Le présent rapport est destiné à compléter les rapports
hfas Greater Manchester...",
"title": "Statistiques des transports Rochdale„
},
"de": {
"notes": "Dieser Bericht hfas Berichte ergänzen soll,
„Verkehrsstatistik“ und „Greater ...",
"title": "Verkehrsstatistik Rochdale„
}
}
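The browser-dependent field selection described above can be sketched as follows. The field names (`title_de`, `notes_fr`, …) are assumptions for illustration, not the actual EDP Solr schema, which adds one field per translated attribute and language.

```python
# Hypothetical sketch of the language-aware Solr query construction.
# Field names (title_<lang>, notes_<lang>) are assumptions, not the
# real EDP schema.

def build_solr_query(term, languages):
    """Restrict a Solr query to the fields of the requested languages."""
    clauses = []
    for lang in languages:
        clauses.append(f"title_{lang}:({term})")
        clauses.append(f"notes_{lang}:({term})")
    return " OR ".join(clauses)

# A browser preferring German and French searches only those fields:
query = build_solr_query("Verkehr", ["de", "fr"])
```

Each language-specific field can then carry its own analyzer settings (e.g. German stemming) in the Solr schema.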
31. CKAN Extension
CKAN uses a flat JSON-based key-value data structure
Core fields and the schema are fixed
Extension of arbitrary fields is possible
DCAT-AP uses RDF with fixed properties
The range of many properties is open and not fixed
Challenge: handling the semantic gap and mapping arbitrary source data to DCAT-AP and CKAN!
Different points of view:
User view: API and Frontend
Internal Processing
JSON Schema
32. CKAN Extension
Map DCAT-AP classes to fitting CKAN concepts
Dataset → Package
Catalogue → Organization
Map each DCAT-AP property to a semantically equivalent CKAN core field (if possible)
E.g.: dct:description → notes
Create new fields in CKAN for every property not covered
E.g.: dct:contactPoint → contact_point (extra field)
DCAT-AP covers much more metadata than CKAN
More than 25 extra fields were added
Many core CKAN fields have no equivalent
Mapping
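The property mapping above can be sketched as a simple lookup table. Only the two mappings named on the slide (`dct:description → notes`, `dct:contactPoint → contact_point`) come from the source; the other entries and the function name are illustrative assumptions.

```python
# Illustrative sketch of the DCAT-AP -> CKAN property mapping.
# Only dct:description and dct:contactPoint are taken from the slides;
# the remaining entries are assumptions for this example.

DCAT_TO_CKAN = {
    "dct:title": "title",                 # core CKAN field (assumed)
    "dct:description": "notes",           # core CKAN field
    "dct:contactPoint": "contact_point",  # new extra field
}

def map_property(dcat_prop):
    """Return the CKAN field for a DCAT-AP property, or None if unmapped."""
    return DCAT_TO_CKAN.get(dcat_prop)
```

Properties that return `None` here are the ones for which a new extra field would be created in CKAN.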
33. CKAN Extension
RDF offers a complex and flexible data structure
CKAN’s plain JSON offers basic structures: Numbers, Dictionaries, Strings and Lists
Obvious solution: JSON-LD – too complex to integrate into CKAN for now
Creation of basic data structure mapping
Issue: Ranges of DCAT-AP properties are wide open
Example:
dct:contactPoint has the range vcard:Kind
vcard:Kind defines a variety of valid properties
Solution: map only the most “common” properties in CKAN
Map cardinalities into JSON structures: List or single dictionary
If linked to external resource, the URI is stored in addition
Data Structures
34. CKAN Extension
Example
"contact_point": [
{
"type": "http://www.w3.org/2006/vcard/ns#Organization",
"resource": "http://www.organization.resource",
"email": "mailto:geologie.daten@bgr.de",
"name": "Bundesanstalt für Geowissenschaften und Rohstoffe"
}
]
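The cardinality rule described above (lists for multi-valued properties, a single value otherwise) can be sketched as follows. The `MULTI_VALUED` set and the function name are assumptions for illustration.

```python
# Sketch of the cardinality mapping: properties with cardinality 0..n
# become JSON lists, 0..1 a single value. The MULTI_VALUED set is an
# assumption for this example.

MULTI_VALUED = {"dct:contactPoint", "dcat:distribution"}

def to_ckan_value(prop, values):
    """Map a list of RDF objects to a CKAN-style JSON structure."""
    if prop in MULTI_VALUED:
        return list(values)               # 0..n -> list
    return values[0] if values else None  # 0..1 -> single value

to_ckan_value("dct:contactPoint", [{"name": "BGR"}])  # -> a one-element list
```

For linked external resources, the URI would additionally be stored under a key such as `"resource"`, as in the `contact_point` example above.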
35. CKAN Extension
Implemented as a CKAN Extension based on RDFLib
Harvester or other clients create datasets in CKAN via the API in “CKAN-Style”
The data is transformed into RDF
RDF is stored in Virtuoso
The basis is the CKAN API, not RDF (at the moment)
RDF is available via the SPARQL Endpoint of Virtuoso
Human readable information is rendered in the Frontend (Labels etc.)
The Mapping is not perfect, yet!
Data Structures
36. CKAN Extension
Each CRUD operation (except read) is synchronized with the triple store
Hook into the package_show, package_create, package_delete logic functions (same for the resource functions)
API and Frontend included
RDFlib + SPARQLWrapper
Connecting to the Triple Store
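A minimal sketch of the write path: after a package operation, the dataset is expressed as triples and written to the store with a SPARQL update. The dataset URI scheme and graph name are assumptions; the real extension serializes full RDF with RDFLib and posts it to Virtuoso's endpoint via SPARQLWrapper.

```python
# Sketch only: build the SPARQL INSERT DATA update that would be sent
# to Virtuoso after package_create. URI scheme and graph name are
# assumptions for illustration.

def build_insert(pkg, graph="http://example.org/edp"):
    """Build a SPARQL INSERT DATA query for a CKAN package dict."""
    subj = f"<http://example.org/dataset/{pkg['name']}>"
    triple = f'{subj} <http://purl.org/dc/terms/title> "{pkg["title"]}" .'
    return f"INSERT DATA {{ GRAPH <{graph}> {{ {triple} }} }}"

q = build_insert({"name": "rochdale", "title": "Transport Statistics Rochdale"})
```

A delete hook would issue the corresponding `DELETE WHERE` against the same graph, keeping the triple store in step with CKAN.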
37. CKAN Extension
Theming using Bootstrap 3 to harmonize better with Drupal
Exporting RDF
Datasets can be returned in rdf/xml or n3 format
Gazetteer
License Assistant
Map-App Preview
Visualization of Excel Data
Integrated Features
39. The role of con terra
Mission - Bridging the Gap!
Geospatial Data Infrastructures
Open Data Infrastructures
EDP Geospatial Components
Maps Visualization of Geospatial Data
Gazetteer Spatial Search and Filtering
Harvesting Geospatial Metadata - Mapping ISO 19139 to DCAT-AP
40. Maps: Visualization of Geospatial Data
Basemaps by Eurostat
Displaying Dynamic Content (WMS & GeoJSON) from Data Providers
Displayed using map.apps
42. Gazetteer: Spatial Search
Based on: Geographical Names (INSPIRE Annex I)
Added: Geonames.org Data for Exonyms
Added: Member States Bulk Data (e.g. shp)
Using: Continuous Automated Update Processes with FME and smart.finder
Enables Spatial Search (e.g. cities and states) and Spatial Filtering (Extent)
44. Metadata for geospatial resources based on DCAT-AP
GeoDCAT-AP
Mapping metadata from prevalent existing geospatial metadata standards to DCAT-AP (INSPIRE / ISO 19115 Core)
No new metadata schema made “from scratch”, no replacement of existing standards
Adapter Architecture
Adapters provide a uniform interface for harvesting: OAI-PMH
Adapters use geospatial catalog interface standards to access remote geo-catalogues
XSLT-based transformation of ISO/INSPIRE metadata to DCAT-AP
Harvesting: Geospatial Metadata (II)
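The production pipeline performs the ISO-to-DCAT-AP mapping in XSLT; the stdlib sketch below shows the same idea on a heavily simplified ISO 19139 fragment. The sample record and the XPath are assumptions for illustration, not the real (deeply nested) ISO structure.

```python
# Sketch: extract a dataset title from a simplified ISO 19139 fragment,
# as a stand-in for the XSLT-based ISO/INSPIRE -> DCAT-AP transformation.
import xml.etree.ElementTree as ET

ISO_NS = {
    "gmd": "http://www.isotc211.org/2005/gmd",
    "gco": "http://www.isotc211.org/2005/gco",
}

def iso_title(record_xml):
    """Return the title of an ISO 19139 record (simplified lookup)."""
    root = ET.fromstring(record_xml)
    el = root.find(".//gmd:title/gco:CharacterString", ISO_NS)
    return el.text if el is not None else None

# Simplified sample record (real records nest the title much deeper):
record = """<gmd:MD_Metadata
    xmlns:gmd="http://www.isotc211.org/2005/gmd"
    xmlns:gco="http://www.isotc211.org/2005/gco">
  <gmd:title><gco:CharacterString>Geology of Europe</gco:CharacterString></gmd:title>
</gmd:MD_Metadata>"""

iso_title(record)  # -> "Geology of Europe"
```

The extracted value would map to `dct:title` in the resulting DCAT-AP record.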
45. Harvesting: Geospatial Metadata (III)
Harvest existing Geo-catalogues (INSPIRE)
Map ISO/INSPIRE Metadata to (Geo)DCAT-AP
Adapter Architecture
RESTful API