Providing open data is of interest for its societal and commercial value, for transparency, and because more people can do fun things with data. A growing number of initiatives provide open data, for example from the UK government and the World Bank. However, much of this data is provided in formats such as Excel files, or even PDF files. This raises several questions:
- How best to provide access to data so it can be most easily reused?
- How to enable the discovery of relevant data within the multitude of available data sets?
- How to enable applications to integrate data from large numbers of formerly unknown data sources?
One way to address these issues is to use the design principles of linked data (http://www.w3.org/DesignIssues/LinkedData.html), which suggest best practices for publishing and connecting structured data on the Web. This presentation gives an overview of linked data technologies (such as RDF and SPARQL), examples of how they can be used, and some starting points for people who want to provide and use linked data.
The presentation was given on August 8 at the Hacknight event (http://hacknight.se/) of Forskningsavdelningen (http://forskningsavd.se/) (Swedish: “Research Department”), a hackerspace in Malmö.
Roadmaps, Roles and Re-engineering: Developing Data Informatics Capability in... - LIBER Europe
A presentation by Dr. Liz Lyon of the United Kingdom Office for Library and Information Networking, as given at LIBER's 42nd annual conference in Munich, Germany.
A Revolution in Open Science: Open Data and the Role of Libraries (Professor ... - LIBER Europe
This talk was given by Prof. Geoffrey Boulton of the University of Edinburgh at LIBER's 42nd annual conference in Munich. Here is a brief summary: "The data storm that has been unleashed by novel means of data acquisition, manipulation and their instantaneous communication have posed both great challenges and opportunities for science. The challenge is to maintain scientific self-correction, which depends on concurrent publication of concepts and the underlying evidence. The opportunity is to exploit massive and complex data volumes in creating new knowledge. Both are non-trivial tasks. The former requires ‘intelligent openness’."
"The latter requires new ways of thinking and new forms of collaboration, which make major demands on scientists, their institutions, those that fund science and those who publish it. Open access publishing is important, but open data is fundamental to scientific progress."
"In a post-Gutenberg era, can the library maintain its historic role as an efficient repository of scientific knowledge? Can it provide support for the creation of new knowledge? What responsibilities should it discharge, and how? What skills are required by those discharging the library function? And how do we achieve a realisable objective, of having all the publications online, all the data online, and for the two to be interoperable?"
Learn more about LIBER at www.libereurope.eu
Enabling Data-Intensive Science Through Data Infrastructures - LIBER Europe
These slides are from a talk given at LIBER's 42nd annual conference by Carlos Morais Pires of the European Commission.
In light of the current data deluge, and plans by the European Commission to harness this deluge through the implementation of e-infrastructures for data-driven science under Horizon 2020, Pires issued a call to action to libraries to engage in the data infrastructure and bring their own unique, and now much-needed, competencies to bear in bringing meaning to, and spreading the word about, data-driven science.
EDF2014: Vedran Sabol, Head of the Knowledge Visualisation Area, Know-Center,... - European Data Forum
Selected Talk by Vedran Sabol, Head of the Knowledge Visualisation Area, Know-Center, Austria at the European Data Forum 2014, 19 March 2014 in Athens, Greece: CODE - Linked Data in Context: Questions Matter
Recommendations for Open Online Education: An Algorithmic Study - Hendrik Drachsler
Recommending courses to students in online platforms is studied widely. Almost all studies target closed platforms that belong to a university or some other educational provider. This makes the course recommenders situation-specific. In recent years, a demand has developed for recommender systems that suit open online platforms. Those platforms share some characteristics, such as the lack of rich user profiles with content metadata. Instead, they log user interactions within the platform that can be used for analysis and personalization. In this paper, we investigate how user interactions and activities tracked within open online learning platforms can be used to provide recommendations. We present a study in which we investigate the application of several state-of-the-art recommender algorithms, including a graph-based recommender approach. We use data from the OpenU open online learning platform that is in use by the Open University of the Netherlands. The results show that user-based and memory-based methods perform better than model-based and factorization methods. In particular, the graph-based recommender system outperforms the classical approaches on prediction accuracy of recommendations in terms of recall. We conclude that, if the algorithms are chosen wisely, recommenders can contribute to a better experience of learners in open online courses.
Soude Fazeli, Enayat Rajabi, Leonardo Lezcano, Hendrik Drachsler, Peter Sloep
Co-designing Research IT and Research Data Services - Simon Price
Invited talk about evolving plans for Research IT and Data support at the University of Bristol, given at the UCISA Research IT International Symposium, UCISA 2014 Conference, Brighton.
OpenMinted: Its Uses and Benefits for the Social Sciences - openminted_eu
Presented at the ITOC workshop in Philadelphia, 20 February 2016.
Uses and Benefits for the Social Sciences research community.
By GESIS - Leibniz Institute for the Social Sciences
Project MILDRED: Charting Ground for Research Data Management Services at Uni... - Mari Elisa Kuusniemi
Introduction: This paper describes a topical case study conducted at the University of Helsinki. The current state of research data management (RDM) practices within the academic community was under close scrutiny during summer 2016 in Project MILDRED, Development Project of Research Data Infrastructure at the University of Helsinki (UH).
Linked Open Data Approaches within the ARIADNE Project - ariadnenetwork
Holly Wright
Archaeology Data Service (ADS), UK
EAA 2016, Vilnius, Lithuania
Session: Open Access and Open Data in Archaeology -
Following the ARIADNE Thread
CLARIAH Toogdag 2018: A distributed network of digital heritage information - Enno Meijers
Slides of my keynote at the CLARIAH Toogdag 2018 on 9 March at the National Library of the Netherlands. The main topics were the development of the distributed digital heritage network and the alignment to and cooperation with the CLARIAH infrastructure and data. It also points at some of the current limitations of the semantic web technology.
A distributed network of digital heritage information - Unesco/NDL India - Enno Meijers
These slides were presented at the Knowledge Engineering for Digital Library Design Workshop in New Delhi on 25 October 2017. The workshop was organised by Unesco and the National Digital Library of India.
Enabling Complex Analysis of Large-Scale Digital Collections: Humanities Rese... - James Baker
Talk at Digital Humanities 2016 with Melissa Terras, James Hetherington, David Beavan, Anne Welsh, Helen O'Neill, Will Finley, Oliver Duke-Williams, Adam Farquhar, and Martin Zaltz Austwick.
Abstract http://dh2016.adho.org/abstracts/2584
Zudilova-Seinstra-Elsevier-data and the article of the future-nfdp13 - DataDryad
Presentation by Elena Zudilova-Seinstra on Elsevier's work on data and the article of the future and open data given at the Now and Future of Data Publishing Symposium, 22 May 2013, Oxford, UK
This work presents a data architecture based on semantic web technologies that supports the inclusion of open materials in massive online courses. The framework provides transparent access to RDF data sources for Open Educational Resources stored in OpenCourseWare repositories.
Speaker(s): Nelson Piedra and Edmundo Tovar
Automated interpretability of linked data ontologies: an evaluation within th... - Nuno Freire
Publication and usage of linked data has been highly pursued by cultural heritage institutions and service providers in this domain. Much research and cooperation are taking place in adapting and improving cultural heritage data models for linked data and in defining ontologies and vocabularies, as well as in setting up services based on linked data. This article presents an evaluation of ontologies and vocabularies published as linked data, which originate from the cultural heritage domain or are frequently used and linked to in this domain. Our study aims to evaluate their usability by crawlers operating on the web of data, according to specifications and practices of linked data, the Semantic Web and ontology reasoning. We evaluate with the use case of general data consumption applications in mind, based on RDF, RDF Schema, OWL, SKOS and linked data’s guidelines. We evaluated twelve ontologies and vocabularies and found that four were not fully compliant, and that alignments between ontologies are not included in the definitions of the ontologies. This study contributes to research on novel services consuming linked data. It also allows a better assessment of the automation that can be achieved to handle the variety and large volume of linked data when assessing the viability of new services based on linked data in cultural heritage.
2015 03 19 (EDUCON2015) eMadrid UPM Towards a Learning Analytics Approach for... - eMadrid network
2015 03 19 (EDUCON2015) eMadrid UPM Towards a Learning Analytics Approach for Supporting discovery and reuse of OER. An approach based on Social Networks Analysis and Linked Open Data
2013 DataCite Summer Meeting - Elsevier's program to support research data (H... - datacite
2013 DataCite Summer Meeting - Making Research better
DataCite. Co-sponsored by CODATA.
Thursday, 19 September 2013 at 13:00 - Friday, 20 September 2013 at 12:30
Washington, DC. National Academy of Sciences
http://datacite.eventbrite.co.uk/
Talk at the World Science Festival at Columbia, June 2, 2017: session on Big Data and Physics: http://www.worldsciencefestival.com/programs/big-data-future-physics/
Presented at the Northern Ohio Technical Services Librarians' meeting, November 22, 2013. Describes why libraries should move toward a linked data future to enable their resources to be discoverable on the open web, and includes lessons learned from developing the eXtensible Catalog at the University of Rochester.
International Journal of Engineering and Science Invention (IJESI) is an international journal intended for professionals and researchers in all fields of computer science and electronics. IJESI publishes research articles and reviews across the whole field of Engineering Science and Technology, including new teaching methods, assessment, validation and the impact of new technologies, and it will continue to provide information on the latest trends and developments in this ever-expanding subject. Papers are selected through double peer review to ensure originality, relevance, and readability. The articles published in the journal can be accessed online.
Keynote presentation delivered at ELAG 2013 in Gent, Belgium, on May 29 2013. Discusses Research Objects and the relationship to work my team has been involved in during the past couple of years: OAI-ORE, Open Annotation, Memento.
Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23... - John Andrews
SlideShare Description for "Chatty Kathy - UNC Bootcamp Final Project Presentation"
Title: Chatty Kathy: Enhancing Physical Activity Among Older Adults
Description:
Discover how Chatty Kathy, an innovative project developed at the UNC Bootcamp, aims to tackle the challenge of low physical activity among older adults. Our AI-driven solution uses peer interaction to boost and sustain exercise levels, significantly improving health outcomes. This presentation covers our problem statement, the rationale behind Chatty Kathy, synthetic data and persona creation, model performance metrics, a visual demonstration of the project, and potential future developments. Join us for an insightful Q&A session to explore the potential of this groundbreaking project.
Project Team: Jay Requarth, Jana Avery, John Andrews, Dr. Dick Davis II, Nee Buntoum, Nam Yeongjin & Mat Nicholas
As Europe's leading economic powerhouse and the fourth-largest economy globally, Germany stands at the forefront of innovation and industrial might. Renowned for its precision engineering and high-tech sectors, Germany's economic structure is heavily supported by a robust service industry, accounting for approximately 68% of its GDP. This economic clout and strategic geopolitical stance position Germany as a focal point in the global cyber threat landscape.
In the face of escalating global tensions, particularly those emanating from geopolitical disputes with nations like Russia and China, Germany has witnessed a significant uptick in targeted cyber operations. Our analysis indicates a marked increase in cyberattack sophistication aimed at critical infrastructure and key industrial sectors. These attacks range from ransomware campaigns to Advanced Persistent Threats (APTs), threatening national security and business integrity.
🔑 Key findings include:
🔍 Increased frequency and complexity of cyber threats.
🔍 Escalation of state-sponsored and criminally motivated cyber operations.
🔍 Active dark web exchanges of malicious tools and tactics.
Our comprehensive report delves into these challenges, using a blend of open-source and proprietary data collection techniques. By monitoring activity on critical networks and analyzing attack patterns, our team provides a detailed overview of the threats facing German entities.
This report aims to equip stakeholders across public and private sectors with the knowledge to enhance their defensive strategies, reduce exposure to cyber risks, and reinforce Germany's resilience against cyber threats.
Explore our comprehensive data analysis project presentation on predicting product ad campaign performance. Learn how data-driven insights can optimize your marketing strategies and enhance campaign effectiveness. Perfect for professionals and students looking to understand the power of data analysis in advertising. For more details, visit: https://bostoninstituteofanalytics.org/data-science-and-artificial-intelligence/
15. Opportunity for subject-based access
• Studies underline end-users' interest in topical searches, but:
  • inter-indexer inconsistency
  • cost of manual indexing
• Possibilities and limits of using automated methods to provide subject-based access?
16. Unsupervised machine learning
• Often used for exploratory data analysis by clustering documents in very large corpora with unknown content
• “Distant reading” techniques within the Digital Humanities
• Two popular methods:
  • Topic Modeling (TM)
  • Word Embeddings (WE)
17. Case study on unsupervised ML
• Combination of
  • LDA
  • Word2Vec
• To create automated links to Eurovoc per document
18. Corpus
• 24,787 PDF documents, representing 138.3 GB
• Period 1958-1982, with documents in French, Dutch, German, Italian, Danish, English and Greek
• Descriptive metadata available only for the fonds creator
• Little value from a traditional archival perspective, but as an aggregate it offers the possibility to analyse policy development through time
24. K-parameter
• A small number of topics results in overly generic categories; a high number results in topics that are not sufficiently representative of the corpus
• Depends on what you want:
  • to cover the entire corpus by making sure every document is indexed
  • or to discover specific semantics …
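One way to make the "every document is indexed" goal measurable when comparing values of K is to compute the share of documents that have at least one sufficiently strong topic. The sketch below is only an illustration: the `coverage` function, the `min_weight` threshold and the toy document-topic weights are assumptions, not values or code from the case study.

```python
def coverage(doc_topic_weights, min_weight=0.3):
    """Share of documents with at least one topic weight >= min_weight.

    doc_topic_weights: one list of topic weights per document.
    min_weight: hypothetical cut-off below which a topic is not used for indexing.
    """
    if not doc_topic_weights:
        return 0.0
    indexed = sum(1 for weights in doc_topic_weights if max(weights) >= min_weight)
    return indexed / len(doc_topic_weights)


# Toy, invented distributions: a run with few topics and a run with many topics.
few_topics = [[0.7, 0.3], [0.6, 0.4], [0.55, 0.45]]
many_topics = [
    [0.5, 0.2, 0.1, 0.1, 0.1],
    [0.2, 0.2, 0.2, 0.2, 0.2],
    [0.25, 0.25, 0.2, 0.15, 0.15],
]

print(coverage(few_topics))   # every document has a dominant topic
print(coverage(many_topics))  # weights are spread thinner, so fewer documents pass
```

In this toy setup a low K indexes every document but with broad categories, while a higher K leaves some documents without a dominant topic, which mirrors the trade-off described on the slide.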
25. Finding a balance
• Topic “eec regulation council commission community decision european december amended article” => 0.31336
• Topic “energy nuclear coal projects gas oil community power heat fuel” => 0.03307
29. Topic labeling
• Hulpus et al. (2013) and Allahyari and Kochut (2015) use the graph structure of DBpedia to rank the different label candidates
• But topics may contain different concepts, and the graph structure of DBpedia as a knowledge structure is not terribly coherent …
• Our approach: use pre-trained Word2Vec to spot which terms form semantic clusters and match those with Eurovoc
31. Topics as concepts
• Usage of W2V to help us detect different concepts within one topic by making use of the distance between terms
• For example: “labour, farm, poultry, sheep, pig, land, family, income, holding, purchased”
• Three concepts within one topic:
  • labour, farm, poultry, sheep, pig, land
  • family
  • income, holding, purchased
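The idea of splitting one topic into concepts by term distance can be sketched as follows. This is a minimal illustration, not the method used in the case study: the three-dimensional `TOY_VECTORS` are invented stand-ins for pre-trained Word2Vec embeddings, and the single-link grouping with a cosine `threshold` of 0.8 is an assumed clustering rule.

```python
import math

# Invented 3-dimensional vectors standing in for pre-trained Word2Vec
# embeddings (real embeddings typically have hundreds of dimensions).
TOY_VECTORS = {
    "labour": (0.9, 0.1, 0.0),
    "farm": (1.0, 0.0, 0.1),
    "poultry": (0.9, 0.0, 0.2),
    "sheep": (0.95, 0.05, 0.1),
    "pig": (0.9, 0.1, 0.1),
    "land": (0.8, 0.2, 0.0),
    "family": (0.0, 1.0, 0.0),
    "income": (0.1, 0.0, 1.0),
    "holding": (0.2, 0.1, 0.9),
    "purchased": (0.0, 0.1, 0.95),
}

TOPIC_TERMS = ["labour", "farm", "poultry", "sheep", "pig",
               "land", "family", "income", "holding", "purchased"]


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def split_into_concepts(terms, vectors, threshold=0.8):
    """Group terms so that each term shares a cluster with any term it is close to.

    Single-link rule (an assumption): a term joins every cluster that holds at
    least one term with cosine similarity >= threshold; such clusters are merged.
    """
    clusters = []
    for term in terms:
        close = [c for c in clusters
                 if any(cosine(vectors[term], vectors[other]) >= threshold
                        for other in c)]
        merged = [t for c in close for t in c] + [term]
        clusters = [c for c in clusters if not any(c is k for k in close)]
        clusters.append(merged)
    return clusters


for concept in split_into_concepts(TOPIC_TERMS, TOY_VECTORS):
    print(concept)
```

With the invented vectors, the ten topic terms fall into the same three groups as on the slide; with real embeddings the grouping would depend on the model and on the chosen threshold.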
32. Reconciliation
• To perform the matching with Eurovoc, we are testing two options:
  • Either focus on the most “centroid” term from a concept and see how many match
  • Or use the structure of Eurovoc for decision making (e.g. pick the term on the deepest level or the one with the most non-descriptors attached to it)
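Both reconciliation options can be sketched in a few lines. Everything here is hypothetical: the toy vectors, the candidate records with their `depth` and `non_descriptors` fields, and the tie-breaking order (depth first, then number of non-descriptors) are assumptions for illustration and do not reflect actual Eurovoc data or the project's code.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))


def centroid_term(terms, vectors):
    """Option 1: return the term closest to the mean vector of the concept."""
    dims = len(vectors[terms[0]])
    mean = [sum(vectors[t][i] for t in terms) / len(terms) for i in range(dims)]
    return max(terms, key=lambda t: cosine(vectors[t], mean))


def pick_by_structure(candidates):
    """Option 2: prefer the deepest candidate; break ties by non-descriptor count.

    The ordering of the two criteria is an assumption; the slide lists them
    as alternatives.
    """
    return max(candidates, key=lambda c: (c["depth"], c["non_descriptors"]))


# Invented embeddings for one concept.
vectors = {
    "income": (0.1, 0.0, 1.0),
    "holding": (0.2, 0.1, 0.9),
    "purchased": (0.0, 0.1, 0.95),
}

# Invented thesaurus candidates; depth and non_descriptors are made-up numbers.
candidates = [
    {"label": "candidate A", "depth": 2, "non_descriptors": 5},
    {"label": "candidate B", "depth": 4, "non_descriptors": 1},
    {"label": "candidate C", "depth": 4, "non_descriptors": 3},
]

print(centroid_term(["income", "holding", "purchased"], vectors))
print(pick_by_structure(candidates)["label"])
```

The centroid option needs only the embeddings, whereas the structural option needs hierarchy information from the thesaurus, so the two could also be combined by using the centroid term to retrieve candidates and the structure to choose among them.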