The development of the Common Framework in Dataverse and the CMDI use case. Building AI/ML based workflow for the prediction and linking concepts from external controlled vocabularies to the CMDI metadata values.
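The linking step of such a workflow can be sketched, in a deliberately simplified form, as fuzzy matching of a metadata value against cached vocabulary labels. The vocabulary URIs and the similarity cutoff below are invented for illustration; the actual workflow described above uses ML-based prediction rather than string similarity.

```python
import difflib

# Simplified sketch: match a free-text CMDI metadata value against cached
# labels from an external controlled vocabulary. URIs and cutoff are
# illustrative assumptions, not a real vocabulary.
VOCABULARY = {
    "http://example.org/concepts/linguistics": "Linguistics",
    "http://example.org/concepts/sociology": "Sociology",
    "http://example.org/concepts/archaeology": "Archaeology",
}

def suggest_concept(value, vocabulary=VOCABULARY, cutoff=0.6):
    """Return (concept URI, label) for the closest label, or None."""
    by_label = {label.lower(): uri for uri, label in vocabulary.items()}
    hits = difflib.get_close_matches(value.lower(), by_label, n=1, cutoff=cutoff)
    if not hits:
        return None
    uri = by_label[hits[0]]
    return uri, vocabulary[uri]
```

A misspelled value such as "Linguistcs" would still resolve to the linguistics concept, while an out-of-vocabulary value falls through to `None` for human review.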
Ontologies, controlled vocabularies and Dataversevty
Presentation on Semantic Web technologies for the Dataverse Metadata Working Group run by the Institute for Quantitative Social Science (IQSS) at Harvard University.
This document discusses the five-year evolution of Dataverse, an open source data repository platform. It began as a tool for collaborative data curation and sharing within research teams. Over time, features such as dataset versioning, APIs, and integration with other systems were added. The document outlines challenges around maintenance and sustainability. It also covers efforts to improve Dataverse's interoperability, such as integrating metadata standards and controlled vocabularies and making datasets FAIR compliant. The goal is to establish Dataverse as a core component of the European Open Science Cloud by improving areas like software quality, integration with tools, and standardization.
Automated CI/CD testing, installation and deployment of Dataverse infrastructure
This document summarizes a presentation about automating CI/CD testing, installation, and deployment of Dataverse in the European Open Science Cloud. It discusses using Docker and Kubernetes for deployment, a community-driven QA plan using pyDataverse for test automation, and providing quality assurance as a service. The presentation also covers topics like the CESSDA maturity model, integrating Dataverse on Google Cloud, and using serverless computing for some Dataverse applications and services.
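One building block of such automated QA can be sketched as a smoke test against a Dataverse instance's native API. The target URL below is an assumption; the `/api/info/version` endpoint is part of the Dataverse native API, and the response body shown is a sample rather than a live fetch.

```python
import json
from urllib.request import Request

# Minimal sketch of one automated smoke test: verify that a Dataverse
# instance answers on its native API. BASE_URL is a hypothetical target.
BASE_URL = "https://demo.dataverse.org"

def version_request(base_url=BASE_URL):
    """Build the HTTP request a CI job would send."""
    return Request(base_url + "/api/info/version")

def check_version_payload(payload):
    """Return the reported version if the response is healthy."""
    body = json.loads(payload)
    if body.get("status") != "OK":
        raise RuntimeError("Dataverse API reported failure: %r" % body)
    return body["data"]["version"]

# Sample response body (illustrative, not fetched live):
sample = '{"status": "OK", "data": {"version": "5.13", "build": "1244"}}'
print(check_version_payload(sample))
```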
Presentation for the W3C Semantic Web in Health Care and Life Sciences community group by Slava Tykhonov, DANS-KNAW, the Royal Netherlands Academy of Arts and Sciences (October 2020). The recording is available at https://www.youtube.com/watch?v=G9oiyNM_RHc
CLARIN CMDI use case and flexible metadata schemes vty
Presentation for CLARIAH IG Linked Open Data on the latest developments of the Dataverse FAIR data repository: building a SEMAF workflow with external controlled vocabulary support and a Semantic API, and using the theory of inventive problem solving (TRIZ) for further innovation in Linked Data.
Flexibility in Metadata Schemes and Standardisation: the Case of CMDI and the DANS EASY Research Data Repository
Presentation at ISKO Knowledge Organisation Research Observatory. RESEARCH REPOSITORIES AND DATAVERSE: NEGOTIATING METADATA, VOCABULARIES AND DOMAIN NEEDS
CLARIAH Tech Day: Controlled Vocabularies and Ontologies in Dataverse
This presentation is about external controlled vocabulary (CV) support in Dataverse, an open source data repository. Data Archiving and Networked Services (DANS-KNAW) decided to use Dataverse as the basic technology to build Data Stations and provide FAIR data services for various Dutch research communities.
Technical integration of data repositories: status and challenges
This document discusses technical integration of data repositories, including:
- Previous integration initiatives focused on metadata integration using OAI-PMH and ResourceSync protocols, as well as aggregators like OpenAIRE.
- Challenges to integration include different levels of software/service maturity, maintenance of distributed applications, and use of common standards and vocabularies.
- Potential integration efforts could focus on improving FAIRness, metadata/data flexibility, and connections between repositories, software, and computing resources to better enable reuse of EOSC data and services.
Controlled vocabularies and ontologies in Dataverse data repository
This document discusses supporting external controlled vocabularies in Dataverse. It proposes implementing a JavaScript interface to allow linking metadata fields to terms from external vocabularies accessed via SKOSMOS APIs. Several challenges are identified, such as applying support to any field, backward compatibility, and ensuring vocabularies come from authoritative sources. Caching concepts and linking dataset files directly to terms are also proposed to improve interoperability.
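The lookup behind such a JavaScript interface can be sketched as a query against a SKOSMOS server's REST API, returning URI/label pairs that the widget offers for linking. The server URL and vocabulary id below are assumptions; the response keys follow the SKOSMOS REST API's search response, and the sample body is illustrative rather than fetched live.

```python
import json
from urllib.parse import urlencode

# Hypothetical SKOSMOS instance; a real deployment would point at an
# authoritative vocabulary service.
SKOSMOS_BASE = "https://skosmos.example.org"

def search_url(query, vocab, lang="en", base=SKOSMOS_BASE):
    """Build the SKOSMOS /rest/v1/search request for a typed-in term."""
    params = urlencode({"query": query, "vocab": vocab, "lang": lang})
    return f"{base}/rest/v1/search?{params}"

def candidate_terms(payload):
    """Extract (uri, prefLabel) pairs from a SKOSMOS search response."""
    return [(r["uri"], r["prefLabel"]) for r in json.loads(payload)["results"]]

# Sample response body (illustrative):
sample = ('{"results": [{"uri": "http://example.org/onto/C1", '
          '"prefLabel": "Corpus linguistics", "lang": "en"}]}')
print(candidate_terms(sample))
```

Caching the returned pairs locally, as the document proposes, would let the repository redisplay linked terms without re-querying the vocabulary service.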
Dataverse can be deployed using Docker containers to improve maintainability and portability. The document discusses how Docker can isolate applications and their dependencies into portable containers. It provides an example of deploying Dataverse as a set of microservices within Docker containers. Instructions are included on building Docker images, running containers, and managing the containers and images through commands and tools like Docker Desktop, Docker Hub, and Docker Compose.
External controlled vocabularies support in Dataverse
This presentation discusses adding support for external controlled vocabularies to the Dataverse data repository platform. It describes how ontologies like SKOS can be used to represent vocabularies and allow linking metadata fields in Dataverse to terms. The presentation proposes developing a Semantic Gateway plugin for Dataverse that would allow browsing and linking to external vocabularies hosted in the SKOSMOS framework via its API. This could improve metadata by allowing standardized, linked terms and help make data more FAIR.
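The SKOS structure the presentation relies on can be sketched as a small cached vocabulary whose keys mirror `skos:prefLabel` and `skos:broader`. The concept URIs and labels below are invented for the example; a real gateway would harvest them from a SKOSMOS-hosted vocabulary.

```python
# Cached vocabulary fragment; keys mirror skos:prefLabel / skos:broader.
concepts = {
    "http://example.org/onto/C2": {
        "prefLabel": {"en": "Phonetics", "nl": "Fonetiek"},
        "broader": ["http://example.org/onto/C1"],
    },
    "http://example.org/onto/C1": {
        "prefLabel": {"en": "Linguistics", "nl": "Taalkunde"},
        "broader": [],
    },
}

def label(uri, lang="en"):
    """Preferred label for a concept in the requested language."""
    return concepts[uri]["prefLabel"][lang]

def ancestors(uri):
    """Walk skos:broader links up to the top concept."""
    out, todo = [], list(concepts[uri]["broader"])
    while todo:
        parent = todo.pop()
        out.append(parent)
        todo.extend(concepts[parent]["broader"])
    return out
```

Because labels are keyed by language, the same linked term can be rendered multilingually, which is part of what makes the linked metadata "more FAIR".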
Building COVID-19 Museum as Open Science Project
This document discusses building a COVID-19 Museum as an open science project. It describes the speaker's background working on various data management projects. It discusses moving towards open science and sharing data according to FAIR principles. It outlines the Time Machine project for digitizing historical documents and its approach to data management. The rest of the document discusses using the Dataverse platform to build repositories, linking metadata to ontologies, using tools like Weblate for translations, and exploring the use of artificial intelligence and machine learning to enhance metadata and facilitate human-in-the-loop review processes.
Building collaborative Machine Learning platform for Dataverse network. Lecture by Slava Tykhonov (DANS-KNAW, the Netherlands), DANS seminar series, 29.03.2022
Flexibility in Metadata Schemes and Standardisation: the Case of CMDI and the DANS EASY Research Data Repository, by Andrea Scharnhorst
Presentation given at ISKO UK: research observatory, November 24, 2021
RESEARCH REPOSITORIES AND DATAVERSE: NEGOTIATING METADATA, VOCABULARIES AND DOMAIN NEEDS
Vyacheslav Tykhonov, Jerry de Vries, Eko Indarto, Femmy Admiraal, Mike Priddy, and Andrea Scharnhorst: Flexibility in Metadata Schemes and Standardisation: the Case of CMDI and the DANS EASY Research Data Repository
Abstract:
The development of metadata schemes in data repositories (and other content providers) has always been a process of negotiation between the needs of the designated user communities and the content of the collection on the one side, and standards developed in the field on the other. Automation has both enabled and enforced standardisation and alignment of metadata schemes. But while designated user communities have turned from local users into global ones (thanks to web services), their specific needs have not vanished. Technology offers possibilities to give the aforementioned negotiation a new form. In this presentation, we present the Dataverse platform, used by many data repositories. Using the case of CMDI metadata and the CLARIN (Common Language Resources and Technology Infrastructure) community, we show how Dataverse's common core set of metadata, the Citation Block, can be extended with custom fields defined as a discipline-specific metadata block. In particular, we show how these custom fields can be connected to a distributed network of authoritative controlled vocabularies, so that, in the end, semantic search is possible. The presentation highlights opportunities and challenges based on our own experiences. Related work has been presented at the CLARIN Annual Conference 2021 (see Proceedings).
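The binding described in the abstract can be sketched as follows: a custom field from a discipline-specific block is tied to an external vocabulary, so that stored values are concept URIs rather than free text. The field name, vocabulary URL, and terms below are invented for the example and do not reflect Dataverse's actual configuration format.

```python
# Illustrative binding of a custom metadata field to a controlled vocabulary.
field = {
    "name": "cmdi:resourceType",  # hypothetical field outside the Citation Block
    "vocabulary": "https://vocab.example.org/resource-types",
}

# Cached terms from that (invented) vocabulary:
vocabulary_terms = {
    "https://vocab.example.org/resource-types/corpus": "Corpus",
    "https://vocab.example.org/resource-types/lexicon": "Lexicon",
}

def normalise(value):
    """Replace a typed label with its concept URI; keep URIs as-is."""
    if value in vocabulary_terms:
        return value
    for uri, term in vocabulary_terms.items():
        if term.lower() == value.lower():
            return uri
    raise ValueError(f"{value!r} is not in the controlled vocabulary")
```

Storing URIs rather than strings is what makes the semantic search mentioned above possible: every record using the same concept resolves to the same identifier regardless of how the depositor typed it.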
Flexible metadata schemes for research data repositories - CLARIN Conference, by Vyacheslav Tykhonov
The development of the Common Framework in Dataverse and the CMDI use case. Building AI/ML based workflow for the prediction and linking concepts from external controlled vocabularies to the CMDI metadata values.
Running Dataverse repository in the European Open Science Cloud (EOSC)
The document discusses Dataverse, open source data repository software developed at Harvard University with a large community and development team, used by many countries as data repository infrastructure. It then describes the SSHOC Dataverse project, which aims to create a multilingual, standardized, and reusable open data infrastructure across several European countries. Finally, it notes that Dataverse is a reliable cloud service that enables FAIR data sharing and can be easily deployed by research organizations.
SSHOC Dataverse in the European Open Science Cloud
This project summary covers the SSHOC project which aims to create a social sciences and humanities section of the European Open Science Cloud by maximizing data reuse through open science principles. The project will interconnect existing and new infrastructures through a clustered cloud, establish governance for SSH-EOSC, and provide a research data repository service for SSH institutions through further developing the Dataverse platform on EOSC. The project involves 47 partners across 20 beneficiaries and 27 linked third parties with a budget of €14,455,594.08 over 40 months to achieve these objectives.
The document summarizes key aspects of the gCube architecture and its approaches to enabling interoperability across heterogeneous resources and communities. gCube aims to hide heterogeneity by providing unified access to diverse resources, and embrace heterogeneity by supporting multiple protocols and models. It utilizes various approaches like blackboard communication, wrappers, and adapters to achieve interoperability for resource discovery, content access, data discovery, process execution, and security. Applications like AquaMaps and time series management are also discussed.
The document discusses semantic mapping in CLARIN Component Metadata Infrastructure (CMDI). CMDI allows flexible yet semantically interoperable metadata descriptions through the use of explicit schemas and semantic registries like ISOcat and RelationRegistry. These registries define concepts and relationships that can be shared across metadata profiles and elements. Semantic mapping helps achieve recall and disambiguation in metadata searches across the diverse set of CMDI profiles and components.
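The registry-based mapping idea can be sketched concretely: two CMDI profiles name the same notion differently, but both elements point at one shared concept entry, as an ISOcat-style registry would record. Searching by concept then recalls records from either profile. The profile names and concept handles below are illustrative, not real registry entries.

```python
# Two profiles use different element names for the same concept (first two
# entries) and distinct concepts otherwise. Identifiers are invented.
element_to_concept = {
    ("TextCorpusProfile", "LanguageName"): "http://hdl.example.org/CCR_C-2484",
    ("LexiconProfile", "objectLanguage"):  "http://hdl.example.org/CCR_C-2484",
    ("LexiconProfile", "title"):           "http://hdl.example.org/CCR_C-2536",
}

def elements_for_concept(concept_uri):
    """All (profile, element) pairs that carry the given concept."""
    return sorted(k for k, v in element_to_concept.items() if v == concept_uri)

print(elements_for_concept("http://hdl.example.org/CCR_C-2484"))
```

This is the recall half of the story; disambiguation works the same way in reverse, since one element name used with different meanings maps to different concept URIs.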
The document discusses the goals and major specifications of the DCMI Architecture Forum. It aims to document the DCMI metadata framework, develop technical specifications, and provide feedback on technical issues. Major specifications discussed include the DCMI Abstract Model, expressions for expressing DCMI metadata in different formats like RDF and XML, and the Singapore Framework for DC Application Profiles. It also discusses different levels of interoperability and introduces Description Set Profiles as a way to formally represent the constraints of a Dublin Core Application Profile.
This presentation was part of the Cloudify and XLAB Research Webinar about DevOps for Data Intensive Applications.
In this webinar we discussed how to leverage automation for your big data applications, using DICE tools based on the Cloudify Open Source Orchestration.
We want to make sure developers spend their time developing their big data applications rather than worrying about deployment and operations, with the shortest possible time to delivery.
We also cover using the DICE deployment tools for automated deployment of Spark, Storm, Cassandra or Hadoop.
How to describe a dataset: interoperability issues, by Valeria Pesce
Presented by Valeria Pesce during the pre-meeting of the Agricultural Data Interoperability Interest Group (IGAD) of the Research Data Alliance (RDA), held on 21 and 22 September 2015 in Paris at INRA.
HDL - Towards A Harmonized Dataset Model for Open Data Portals, by Ahmad Assaf
This document discusses the need for a harmonized dataset model for open data portals. It describes existing dataset models like DCAT, VoID, CKAN, and others. It proposes classifying metadata into information groups (resource, tag, group, organization) and types (general, ownership, provenance, etc.). The document outlines a process for harmonizing existing models which includes mapping these information groups and types and examining how extras fields are used across different models and portals. The goal is to define a minimum set of metadata needed to build dataset profiles and enable interoperability.
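The classification step described above can be sketched as sorting a portal's raw metadata keys into the proposed information groups. The keyword lists per group are invented for the example; the actual harmonization maps far more fields across DCAT, VoID, and CKAN.

```python
# Illustrative group membership; real mappings cover many more keys.
GROUPS = {
    "ownership":  {"author", "maintainer", "maintainer_email", "organization"},
    "provenance": {"created", "last_modified", "version"},
    "access":     {"license", "url", "format"},
}

def classify(keys):
    """Bucket metadata keys into information groups, defaulting to 'general'."""
    out = {group: [] for group in GROUPS}
    out["general"] = []
    for key in keys:
        for group, members in GROUPS.items():
            if key in members:
                out[group].append(key)
                break
        else:
            out["general"].append(key)
    return out
```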
The document discusses mapping relational databases to RDF with Virtuoso mapping software. It provides advantages of mapping such as enabling unified querying across data sources and leveraging existing relational analytics. Complex mappings may be needed to deal with different schemas and data normalization. Mappings must be carefully defined to avoid incorrect joins. The software can map queries to single optimized SQL statements executable on different databases. Example uses include data integration and OpenLink's customer relationship management system.
ESWC2008 Relational2RDF - Mapping Relational Databases to RDF with OpenLink Virtuoso
The document discusses mapping relational databases to RDF with Virtuoso mapping software. It outlines the benefits of mapping such as unified querying across data sources and leveraging existing high performance relational databases. It also describes some challenges like keeping mapped data consistent. The document then provides details on how Virtuoso mapping software works, examples of complex mappings, and use cases where mapping has been applied.
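The core row-to-triple idea behind such mapping tools can be sketched in a few lines: each row becomes a subject URI and each column a predicate. The table, columns, and namespace below are invented, and real mappings (Virtuoso's, or R2RML-style ones) are declarative and far more expressive, handling joins and normalization as the document notes.

```python
import sqlite3

NS = "http://example.org/"  # hypothetical namespace for minted URIs

# Tiny in-memory database standing in for a real relational source.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT, city TEXT)")
conn.execute("INSERT INTO person VALUES (1, 'Ada', 'London')")

def table_to_triples(conn, table, key):
    """Yield (subject, predicate, object) triples, one per non-key cell."""
    cur = conn.execute(f"SELECT * FROM {table}")
    cols = [c[0] for c in cur.description]
    for row in cur:
        subject = f"{NS}{table}/{row[cols.index(key)]}"
        for col, value in zip(cols, row):
            if col != key:
                yield (subject, f"{NS}{table}#{col}", value)

print(list(table_to_triples(conn, "person", "id")))
```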
Etosha - Data Asset Manager: status and roadmap, by Dr. Mirko Kämpf
The document provides an overview and roadmap for the first release of an open data asset manager called Etosha MDS. Key points include:
- Etosha MDS will expose metadata about datasets to enable discovery, exploration, and risk analysis of data assets.
- The first release will focus on collecting and exposing schema, statistics, and semantic annotations about datasets using tools like SPARQL and a graph browser.
- Future releases will integrate datasets across Hadoop clusters using a shared semantic knowledge graph and dataset integration layer following the data as a service paradigm.
The Best of Both Worlds: Unlocking the Power of (big) Knowledge Graphs with S..., by Gezim Sejdiu
Over the past decade, vast amounts of machine-readable structured information have become available through the automation of research processes as well as the increasing popularity of knowledge graphs and semantic technologies.
A major and yet unsolved challenge that research faces today is to perform scalable analysis of large scale knowledge graphs in order to facilitate applications like link prediction, knowledge base completion, and question answering.
Most machine learning approaches that scale horizontally (i.e. can be executed in a distributed environment) work on simpler feature-vector-based input rather than more expressive knowledge structures.
On the other hand, the learning methods which exploit expressive structures, e.g. Statistical Relational Learning and Inductive Logic Programming approaches, usually do not scale well to very large knowledge bases owing to their computational complexity.
This talk gives an overview of the ongoing Semantic Analytics Stack (SANSA) project, which aims to bridge this research gap by creating an out-of-the-box library for scalable, in-memory, structured learning.
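The gap described above can be illustrated with a toy flattening: horizontally scalable learners want flat vectors, so graph structure must first be encoded, for example as one-hot "has (predicate, object)" features per entity. The triples below are invented, and this naive encoding is exactly the kind of lossy workaround that motivates structure-aware stacks like SANSA.

```python
# Toy knowledge graph (illustrative triples).
triples = [
    ("alice", "knows", "bob"),
    ("alice", "worksAt", "acme"),
    ("bob", "worksAt", "acme"),
]

def to_feature_vectors(triples):
    """One-hot encode each subject entity over all (predicate, object) pairs."""
    features = sorted({(p, o) for _, p, o in triples})
    entities = sorted({s for s, _, _ in triples})
    vectors = {}
    for e in entities:
        owned = {(p, o) for s, p, o in triples if s == e}
        vectors[e] = [1 if f in owned else 0 for f in features]
    return features, vectors
```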
Buildvoc: Introduction to Linked Data, Digital Construction Week 2018, by Phil Stacey ICIOB
This document introduces linked data and its applications for the building industry. It defines linked data as a set of best practices for publishing structured data on the web using URIs, describes key linked data concepts like triple stores and namespaces, and outlines benefits such as compatibility with other technologies and the ability to dynamically link different apps and data. It provides an example of a controlled construction vocabulary developed with linked data principles and demonstrates several linked data applications and tools.
Decentralised identifiers and knowledge graphs
Building an Operating System for Open Science: data integration challenges, Dataverse data repository and knowledge graphs. Lecture by Slava Tykhonov, DANS-KNAW, for the Journées Scientifiques de Rochebrune 2023 (JSR'23).
Tutorial Workgroup - Model versioning and collaboration, by Pascal Desmarets
Hackolade Studio has native integration with Git repositories to provide state-of-the-art collaboration, versioning, branching, conflict resolution, peer-review workflows, change tracking, and traceability. Most importantly, it allows co-locating data models and schemas with application code, and further integrating with DevOps CI/CD pipelines as part of our vision for Metadata-as-Code.
Co-located application code and data models provide the single source-of-truth for business and technical stakeholders.
ACM TechTalks: Apache Arrow and the Future of Data Frames, by Wes McKinney
Wes McKinney gave a talk on Apache Arrow and the future of data frames. He discussed how Arrow aims to standardize columnar data formats and reduce inefficiencies in data processing. It defines an efficient binary format for transferring data between systems and programming languages. As more tools support Arrow natively, it will become more efficient to process data directly in Arrow format rather than converting between data structures. Arrow is gaining adoption in popular data tools like Spark, BigQuery, and InfluxDB to improve performance.
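The row-versus-column point can be illustrated in plain Python (this is a conceptual sketch, not the Arrow API): an analytic query over a columnar layout touches one contiguous value list instead of walking every row record.

```python
# Row-oriented data: each record holds all fields together.
rows = [
    {"id": 1, "price": 9.5, "qty": 3},
    {"id": 2, "price": 4.0, "qty": 1},
    {"id": 3, "price": 2.5, "qty": 10},
]

# Summing one field means scanning whole records.
row_total = sum(r["price"] for r in rows)

# Column-oriented layout: the "price" values already sit together,
# which is what lets columnar engines scan and vectorize efficiently.
columns = {k: [r[k] for r in rows] for k in rows[0]}
col_total = sum(columns["price"])

assert row_total == col_total == 16.0
```

Arrow's contribution is standardizing such a columnar layout as a binary format, so different systems can share the same buffers without the per-system conversions the talk describes.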
Expressing Concept Schemes & Competency Frameworks in CTDL, by Credential Engine
This presentation is focused on how the Credential Engine can access third-party resource data stores, and on recipes for mapping and publishing competency frameworks as Linked Data.
Knowledge Graph Conference 2021
Semantic MediaWiki (SMW), introduced as early as 2006, has since established a vital community and is currently one of the few semantic wiki solutions still in existence. SMW is an extension of MediaWiki, the software used for Wikipedia and many other projects, resulting in a largely sustainable codebase and ecosystem. There are many reasons why SMW should not be overlooked by the knowledge graph community:
SMW is capable of directly connecting to several triple stores (Blazegraph, Virtuoso, Jena), which is why it can be considered an interface for entering data into knowledge graphs.
SMW can use its internal relational database (or ElasticSearch), enabling users to build simple knowledge graphs without in-depth knowledge about triple stores.
SMW has the built-in capability of exporting to RDF including building complete RDF data dumps that can be imported into existing knowledge graphs.
SMW has the capability to reuse existing ontologies by importing vocabularies and providing unique identifiers.
The explicit semantic content of Semantic MediaWiki is formally interpreted in the OWL DL ontology language and is made available in XML/RDF format.
A simple internal query language is available to query the internal knowledge graph from within SMW, without the requirement of having a SPARQL endpoint. However, extensions for implementing SPARQL in SMW are available as well.
SMW has the capability to enable data curation for experienced users responsible for the ontology as well as simple form-based input for regular users that can easily populate the KG with data.
There are several approaches to visualizing data in SMW, thus making the knowledge graph visible and interactive.
Implementing custom ontologies in SMW is quite easy; everything is defined in wiki pages (e.g. definitions of properties and datatypes, forms and templates).
SMW has low barriers to implementation as it is a clean extension to MediaWiki, which is PHP software running on regular web hosts.
In the talk, I will give an overview of the mentioned aspects and highlight some main differences to Wikibase – which is an alternative approach for managing structured data in MediaWiki – as well as the current limitations of SMW.
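The internal query language mentioned above can also be reached remotely: the SMW extension exposes an `ask` action through the MediaWiki API, taking the same `#ask` syntax. The wiki URL, the query, and the sample response body below are illustrative, not fetched from a live wiki.

```python
import json
from urllib.parse import urlencode

WIKI_API = "https://wiki.example.org/api.php"  # hypothetical SMW wiki

def ask_url(ask_query, api=WIKI_API):
    """Build an SMW `ask` request from an #ask-style query string."""
    params = urlencode({"action": "ask", "query": ask_query, "format": "json"})
    return f"{api}?{params}"

url = ask_url("[[Category:City]] [[Population::>500000]] |?Population")

# Sample response body (illustrative):
sample = ('{"query": {"results": {"Berlin": '
          '{"printouts": {"Population": [3645000]}}}}}')
results = json.loads(sample)["query"]["results"]
print(sorted(results))
```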
collided with the MW proto-disc 8–11 Gyr ago, and the Virgo Radial Merger (VRM), where the progenitor collided with the
MW disc within the last 3 Gyr. These two scenarios make different predictions about observable structure in local phase space,
because the morphology of debris depends on how long it has had to phase mix. The recently identified phase-space folds in Gaia
DR3 have positive caustic velocities, making them fundamentally different than the phase-mixed chevrons found in simulations
at late times. Roughly 20 per cent of the stars in the prograde local stellar halo are associated with the observed caustics. Based
on a simple phase-mixing model, the observed number of caustics are consistent with a merger that occurred 1–2 Gyr ago.
We also compare the observed phase-space distribution to FIRE-2 Latte simulations of GSE-like mergers, using a quantitative
measurement of phase mixing (2D causticality). The observed local phase-space distribution best matches the simulated data
1–2 Gyr after collision, and certainly not later than 3 Gyr. This is further evidence that the progenitor of the ‘last major merger’
did not collide with the MW proto-disc at early times, as is thought for the GSE, but instead collided with the MW disc within
the last few Gyr, consistent with the body of work surrounding the VRM.
Current Ms word generated power point presentation covers major details about the micronuclei test. It's significance and assays to conduct it. It is used to detect the micronuclei formation inside the cells of nearly every multicellular organism. It's formation takes place during chromosomal sepration at metaphase.
ESA/ACT Science Coffee: Diego Blas - Gravitational wave detection with orbita...Advanced-Concepts-Team
Presentation in the Science Coffee of the Advanced Concepts Team of the European Space Agency on the 07.06.2024.
Speaker: Diego Blas (IFAE/ICREA)
Title: Gravitational wave detection with orbital motion of Moon and artificial
Abstract:
In this talk I will describe some recent ideas to find gravitational waves from supermassive black holes or of primordial origin by studying their secular effect on the orbital motion of the Moon or satellites that are laser ranged.
The binding of cosmological structures by massless topological defectsSérgio Sacani
Assuming spherical symmetry and weak field, it is shown that if one solves the Poisson equation or the Einstein field
equations sourced by a topological defect, i.e. a singularity of a very specific form, the result is a localized gravitational
field capable of driving flat rotation (i.e. Keplerian circular orbits at a constant speed for all radii) of test masses on a thin
spherical shell without any underlying mass. Moreover, a large-scale structure which exploits this solution by assembling
concentrically a number of such topological defects can establish a flat stellar or galactic rotation curve, and can also deflect
light in the same manner as an equipotential (isothermal) sphere. Thus, the need for dark matter or modified gravity theory is
mitigated, at least in part.
When I was asked to give a companion lecture in support of ‘The Philosophy of Science’ (https://shorturl.at/4pUXz) I decided not to walk through the detail of the many methodologies in order of use. Instead, I chose to employ a long standing, and ongoing, scientific development as an exemplar. And so, I chose the ever evolving story of Thermodynamics as a scientific investigation at its best.
Conducted over a period of >200 years, Thermodynamics R&D, and application, benefitted from the highest levels of professionalism, collaboration, and technical thoroughness. New layers of application, methodology, and practice were made possible by the progressive advance of technology. In turn, this has seen measurement and modelling accuracy continually improved at a micro and macro level.
Perhaps most importantly, Thermodynamics rapidly became a primary tool in the advance of applied science/engineering/technology, spanning micro-tech, to aerospace and cosmology. I can think of no better a story to illustrate the breadth of scientific methodologies and applications at their best.
EWOCS-I: The catalog of X-ray sources in Westerlund 1 from the Extended Weste...Sérgio Sacani
Context. With a mass exceeding several 104 M⊙ and a rich and dense population of massive stars, supermassive young star clusters
represent the most massive star-forming environment that is dominated by the feedback from massive stars and gravitational interactions
among stars.
Aims. In this paper we present the Extended Westerlund 1 and 2 Open Clusters Survey (EWOCS) project, which aims to investigate
the influence of the starburst environment on the formation of stars and planets, and on the evolution of both low and high mass stars.
The primary targets of this project are Westerlund 1 and 2, the closest supermassive star clusters to the Sun.
Methods. The project is based primarily on recent observations conducted with the Chandra and JWST observatories. Specifically,
the Chandra survey of Westerlund 1 consists of 36 new ACIS-I observations, nearly co-pointed, for a total exposure time of 1 Msec.
Additionally, we included 8 archival Chandra/ACIS-S observations. This paper presents the resulting catalog of X-ray sources within
and around Westerlund 1. Sources were detected by combining various existing methods, and photon extraction and source validation
were carried out using the ACIS-Extract software.
Results. The EWOCS X-ray catalog comprises 5963 validated sources out of the 9420 initially provided to ACIS-Extract, reaching a
photon flux threshold of approximately 2 × 10−8 photons cm−2
s
−1
. The X-ray sources exhibit a highly concentrated spatial distribution,
with 1075 sources located within the central 1 arcmin. We have successfully detected X-ray emissions from 126 out of the 166 known
massive stars of the cluster, and we have collected over 71 000 photons from the magnetar CXO J164710.20-455217.
ESR spectroscopy in liquid food and beverages.pptxPRIYANKA PATEL
With increasing population, people need to rely on packaged food stuffs. Packaging of food materials requires the preservation of food. There are various methods for the treatment of food to preserve them and irradiation treatment of food is one of them. It is the most common and the most harmless method for the food preservation as it does not alter the necessary micronutrients of food materials. Although irradiated food doesn’t cause any harm to the human health but still the quality assessment of food is required to provide consumers with necessary information about the food. ESR spectroscopy is the most sophisticated way to investigate the quality of the food and the free radicals induced during the processing of the food. ESR spin trapping technique is useful for the detection of highly unstable radicals in the food. The antioxidant capability of liquid food and beverages in mainly performed by spin trapping technique.
aziz sancar nobel prize winner: from mardin to nobel
Flexible metadata schemes for research data repositories - CLARIN Conference'21
1. Flexible metadata schemes for research data repositories
The Common Framework in Dataverse and the CMDI use case
Slava Tykhonov
Jerry de Vries
Eko Indarto
Andrea Scharnhorst
Femmy Admiraal
(DANS-KNAW)
CLARIN annual conference, 27.09.2021
2. Overall goal for the CMDI core metadata task
The goal stated in the CMDI strategy 2019-2020: "Ready-made, good quality profiles & components suitable for common use cases and resource types".
DataCite distinguishes three types of metadata elements: mandatory, recommended, and optional. How can we distinguish CMDI core components for the different CLARIN centers?
We are part of the CMDI task dedicated to the design and implementation of CLARIN core metadata components and profiles, and to the use of FAIR vocabularies within CLARIN metadata.
3.
4. 5 challenges for CMDI
Challenge 1: A proposal for a core set of CMDI metadata as a recommendation
Challenge 2: Extraction of CMDI metadata, and transformation and loading of the metadata fields into the Dataverse core set of metadata
Challenge 3: A workflow for predicting and linking concepts from external controlled vocabularies to CMDI metadata values
Challenge 4: Extension of the Common Framework with support for FAIR controlled vocabularies to create FAIR metadata
Challenge 5: Extension of Dataverse's export functionality to export deposited CMDI metadata back to the original CMDI format
5. Overall goals for DANS
● CMDI metadata is archived in the EASY TDR and marked as a CLARIN collection with a very limited Dublin Core based metadata schema (EDM); curation and further reuse aren't possible
● DANS wants to run a Data Station with a CMDI metadata schema suitable for different CLARIN use cases
● the long-term goal of DANS is to make CMDI datasets harvestable and approachable, and to create an interoperability layer with external controlled vocabularies (FAIR Data Point)
6. DANS Data Stations - Future DANS Data Services
Dataverse is an API-based data platform and a key framework for Open Innovation!
7. Semantic interoperability on the infrastructure level
Dataverse Semantic API in release 5.6: https://github.com/IQSS/dataverse/releases/tag/v5.6
“Dataset metadata can be retrieved, set, and updated using a new, flatter JSON-LD format -
following the format of an OAI-ORE export (RDA-conformant Bags), allowing for easier transfer of
metadata to/from other systems (i.e. without needing to know Dataverse's metadata block and field
storage architecture). This new API also allows for the update of terms metadata“.
External controlled vocabulary support is being developed by DANS in the SSHOC project and will be integrated into the Dataverse core in future releases.
Proposal: https://docs.google.com/document/d/1txdcFuxskRx_tLsDQ7KKLFTMR_r9IBhorDu3V_r445w/
Interfaces: http://github.com/gdcc/dataverse-external-vocab-support
Integrations: Wikidata, ORCID, MeSH, Skosmos vocabularies
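As a sketch of what the Semantic API enables, the snippet below builds the request URL for a dataset's JSON-LD metadata. The `/api/datasets/{id}/metadata` path follows the 5.6 release notes, but the exact endpoint and the API-token header should be checked against the Dataverse API guide.

```python
from urllib.parse import urljoin

def semantic_metadata_url(base_url, dataset_id):
    """Build the Semantic API URL for a dataset's JSON-LD metadata
    (endpoint name assumed from the 5.6 release notes)."""
    return urljoin(base_url.rstrip("/") + "/", f"api/datasets/{dataset_id}/metadata")

# Fetching requires a running Dataverse and an API token, e.g.:
# import json, urllib.request
# req = urllib.request.Request(semantic_metadata_url("http://localhost:8080", 42),
#                              headers={"X-Dataverse-key": "<API_TOKEN>"})
# jsonld = json.load(urllib.request.urlopen(req))
```

The returned JSON-LD follows the flatter OAI-ORE-style format quoted above, so it can be fed to other systems without knowledge of Dataverse's metadata-block storage.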
8. SEMAF: A Proposal for a Flexible Semantic Mapping Framework
Proposal: https://zenodo.org/record/4651421#.YT9lyC8RpZI
POC: https://github.com/Dans-labs/semaf-poc
13. Challenge 1: CMDI core metadata proposal
Source: Core metadata components design for use cases
14. DANS CMDI metadata generator
● CMDI fields are generated automatically by our tool from files deposited in EASY
● the CMDI metadata model is published as TSV files and uploaded to Dataverse
● the converter can extract and show the hierarchy of all fields:
17. CMDI metadata model in Dataverse
An obvious conflict: Dataverse limits the depth of the metadata hierarchy, while CMDI places no limit on nesting
Is it all about relationships? =>
19. Coming close to the implementation
1. Use Data Catalog Vocabulary (DCAT) mappings for CMDI metadata fields
2. Use the Simple Knowledge Organization System (SKOS) to model thesaurus-like resources with simple skos:broader, skos:narrower and skos:related properties
3. Load CMDI properties and attributes and build a Knowledge Graph out of all elements
4. Enrich the Knowledge Graph with concept URIs from various controlled vocabularies, such as hosted Skosmos instances or Wikidata
5. Use different data-serialization formats suitable for integration with different systems: for example, json-ld for Dataverse, turtle for Jena Fuseki, RDF for LoD frameworks
20. Introduction of Data Catalog Vocabulary (DCAT)
Source: W3C DCAT recommendation
DCAT defines three main classes:
● dcat:Catalog represents the catalog
● dcat:Dataset represents a dataset in a catalog
● dcat:Distribution represents an accessible form of a dataset
DCAT makes extensive use of terms from RDF, Dublin Core, SKOS, and other vocabs!
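A minimal Turtle fragment showing the three classes together (the resource names under `ex:` are made up for illustration):

```turtle
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix dct:  <http://purl.org/dc/terms/> .
@prefix ex:   <http://example.org/> .

ex:catalog a dcat:Catalog ;
    dct:title "DANS CMDI collection"@en ;
    dcat:dataset ex:dataset1 .

ex:dataset1 a dcat:Dataset ;
    dct:title "A CMDI-described corpus"@en ;
    dcat:distribution ex:dist1 .

ex:dist1 a dcat:Distribution ;
    dcat:downloadURL <http://example.org/files/corpus.cmdi.xml> ;
    dcat:mediaType "application/xml" .
```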
21. Simple Knowledge Organization System (SKOS)
SKOS models thesaurus-like resources:
- skos:Concepts with preferred labels and alternative labels (synonyms) attached to them (skos:prefLabel, skos:altLabel).
- a skos:Concept can be related to others with the skos:broader, skos:narrower and skos:related properties.
- terms and concepts can have more than one broader term or concept.
SKOS allows one to create a semantic layer on top of objects: a network of statements and relationships.
A major difference is that SKOS does not enforce logical “is-a” hierarchies: in thesauri, the hierarchical relation can represent anything from “is-a” to “part-of”.
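A minimal sketch of how such SKOS statements can be emitted as Turtle in plain Python (no rdflib needed for illustration; the URIs and labels are invented):

```python
def skos_concept(uri, pref_label, alt_labels=(), broader=None, lang="en"):
    """Emit a skos:Concept in Turtle (assumes the skos: prefix is declared elsewhere)."""
    lines = [f"<{uri}> a skos:Concept ;",
             f'    skos:prefLabel "{pref_label}"@{lang} ;']
    for alt in alt_labels:
        lines.append(f'    skos:altLabel "{alt}"@{lang} ;')
    if broader:
        lines.append(f"    skos:broader <{broader}> ;")
    # close the description: replace the trailing ';' with '.'
    lines[-1] = lines[-1][:-1] + "."
    return "\n".join(lines)

turtle = skos_concept("http://example.org/kw/linguistics", "linguistics",
                      alt_labels=("language science",),
                      broader="http://example.org/kw/humanities")
```

Because a concept may carry several skos:broader statements, the same helper can be called repeatedly to express poly-hierarchies, which plain “is-a” trees cannot.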
22. A complex CMDI fragment
Some conclusions:
● Top-level concepts can share the same sub-concepts (both are CMDI components)
● Relations between concepts define the metadata schema
● Disambiguation of concepts is complicated
● Multilingual components carry a language indication (for example, keywords in Dutch)
● Hierarchy is defined by semantics
24. Dataverse datasetfield API
curl http://localhost:8080/api/admin/datasetfield/title
To-do list for the Dataverse core:
● add a TermURI for metadata fields (DC)
● show external controlled vocabularies available for the specific field
● add multilingual support with a ‘lang’ parameter
30. Semantic Gateway lookup API
Scenario: when the user selects a vocabulary and searches for a term, the API fills in the values and returns the list of concepts in a standardized format:
GET /?lang=language&vocab=vocabulary&query=keyword
examples:
GET /?lang=en&vocab=unesco&query=fam
GET /?vocab=mesh&query=sars
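A client for this lookup API can be little more than a URL builder plus one HTTP call; the gateway base URL below is a placeholder:

```python
from urllib.parse import urlencode

GATEWAY = "http://localhost:8000/"  # placeholder: your Semantic Gateway instance

def lookup_url(vocab, query, lang=None):
    """Build a Semantic Gateway lookup URL following the pattern above."""
    params = {}
    if lang:
        params["lang"] = lang
    params["vocab"] = vocab
    params["query"] = query
    return GATEWAY + "?" + urlencode(params)

# fetch with:
# import json, urllib.request
# concepts = json.load(urllib.request.urlopen(lookup_url("mesh", "sars")))
```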
32. CMDI data model and namespaces
A default namespace was added in the Semantic Gateway for the CMDI schema to keep all relationships between top-level concepts (metadata fields) in the knowledge graph:
ns.dataverse.org/cmdi_component/cmdi_term
However, a component or element in CMDI only has a unique name among its siblings, so:
Source: M. Windhouwer, E. Indarto, D. Broeder. CMD2RDF: Building a Bridge from CLARIN to Linked Open Data
33. Adding component-specific URIs in SKOS
The CMDI Component Registry was created for registered components/profiles
Example path in CMDI:
/CMD/Components/corpusProfile/resourceCommonInfo/metadataInfo/metadataCreator/actorInfo/actorType
ns.dataverse.org/cmdi1/metadataCreator skos:broader ns.dataverse.org/cmdi1/actorInfo
or simply: cmdi1:metadataCreator skos:related cmdi1:corpusProfile
CMDI concepts can be linked to other SKOS concepts in the next step.
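The path-to-SKOS step can be sketched as follows. This toy version points each deeper element to its parent via skos:broader under the cmdi1 namespace shown above; the real mapping in the Semantic Gateway may orient or name the relations differently:

```python
NS = "ns.dataverse.org/cmdi1/"  # namespace as on this slide

def path_to_broader_triples(path):
    """Turn a CMDI component path into a chain of skos:broader triples."""
    parts = [p for p in path.strip("/").split("/") if p]
    if parts[:2] == ["CMD", "Components"]:  # drop the fixed envelope
        parts = parts[2:]
    return [f"<{NS}{child}> skos:broader <{NS}{parent}> ."
            for parent, child in zip(parts, parts[1:])]

triples = path_to_broader_triples(
    "/CMD/Components/corpusProfile/resourceCommonInfo/metadataInfo/"
    "metadataCreator/actorInfo/actorType")
```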
34. How can we link CMDI components in SKOS?
Source: CMDI Component Registry
35. Dataverse metadata schema ingested into Graph
Compound keyword field with SKOS
We use SKOS relationships to keep the hierarchy and relationships between metadata fields
Other Dataverse schemas: https://github.com/Dans-labs/semaf-client/tree/cmdi/schema
36. CMDI conversion to turtle format through Graphs
Basic workflow:
● using RDFLib to manage namespaces and keep the CMDI hierarchy
● crosswalks to map CMDI fields to the DCAT vocabulary
● serialization to the json-ld format allows us to deposit CMDI as a dataset in Dataverse
● converted CMDI records are also stored in the triple store
37. CMDI triples stored in Jena Fuseki triple store
Benefits:
● the CMDI Graph is stored in RDF with all records fully available as triples
● a SPARQL endpoint for complex queries
● Jena has full-text search via Lucene and Elasticsearch
● easy to load and retrieve data from a graph datastore
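Once the triples are in Fuseki, a question like "all (transitively) broader concepts of a term" becomes one SPARQL property path. The dataset name `cmdi` in the endpoint URL below is an assumption:

```python
def broader_query(concept_uri, limit=10):
    """SPARQL query for all (transitively) broader concepts of a term."""
    return (
        "PREFIX skos: <http://www.w3.org/2004/02/skos/core#>\n"
        f"SELECT ?broader WHERE {{ <{concept_uri}> skos:broader+ ?broader }} "
        f"LIMIT {limit}"
    )

# Run against a local Fuseki (dataset name 'cmdi' is an assumption):
# import json, urllib.request
# from urllib.parse import urlencode
# url = "http://localhost:3030/cmdi/sparql?" + urlencode(
#     {"query": broader_query("http://example.org/kw/linguistics")})
# req = urllib.request.Request(url, headers={"Accept": "application/sparql-results+json"})
# rows = json.load(urllib.request.urlopen(req))["results"]["bindings"]
```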
38. Challenge 3: Automated workflow to link CMDI concepts to URIs
● Can we find appropriate FAIR controlled vocabularies for CMDI fields?
● How to resolve disambiguation and link values to URIs?
● Can we link the same value to a few CVs?
● How to link geocodes for geospatial values?
46. Enriched metadata published in Dataverse
● We’re not using XSLT or any other transformations!
● All operations for metadata enrichment are done directly in the Knowledge Graph
● The Dataverse Semantic API allows us to import the json-ld representation of a dataset
47. Data analysis and CMDI visualizations with Superset
● the Apache Superset integration with Dataverse was created in the SSHOC project
● Superset can connect to various external data sources and databases (postgresql, mysql, SOLR, MongoDB, ...); every new connection is considered a separate dataset
● about 50 different visualizations are available out-of-the-box in Superset (charts, maps, wordclouds, network graphs, …)
● Dataverse with ingested CMDI metadata can be connected to Superset directly
48. Visualizing existing countries from CMDI on the map with Superset
We can visualize structured CMDI geospatial metadata enriched with ISO codes
Only existing countries are supported!
The map can be published as a widget or created on a dashboard
49. Geospatial overview of DANS CMDI datasets
A visual representation of all locations from the DANS CMDI collection
52. Challenge 4: CMDI extension with support of CVs for FAIR metadata
● We can predict appropriate vocabularies for every field after running the automated pipeline
● the CMDI metadata schema can be extended with support for new CVs in Dataverse
● one CMDI field can be linked to several CVs at the same time (for example, a geospatial field to Wikidata and Geonames); the user can select the appropriate CV in the deposit form while creating or editing metadata
● We’re adding a configurable process to add or remove CVs in Dataverse
53. Suggestions for the usage of FAIR CVs
● Dutch Digital Heritage Network https://netwerkdigitaalerfgoed.nl
● Skosmos instances, for example, https://bartoc-skosmos.unibas.ch/en/ and the Skosmos client to access vocabularies https://pypi.org/project/skosmos-client/
● the ORCID API to link CMDI records to identifiers of researchers https://info.orcid.org
● CESSDA CV Service https://vocabularies.cessda.eu
More are coming! https://github.com/CLARIAH/awesome-humanities-ontologies
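For Skosmos-hosted vocabularies, a global term search is a single REST call; `/rest/v1/search` is part of the public Skosmos REST API, though its parameters should be verified against the instance you target:

```python
from urllib.parse import urlencode

def skosmos_search_url(base, query, lang="en", vocab=None):
    """Build a Skosmos REST search URL (e.g. against the BARTOC instance above)."""
    params = {"query": query, "lang": lang}
    if vocab:
        params["vocab"] = vocab
    return base.rstrip("/") + "/rest/v1/search?" + urlencode(params)

# e.g. skosmos_search_url("https://bartoc-skosmos.unibas.ch", "thesaurus")
```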
54. Example of the CV configuration in Dataverse
Configuration in pluggable JavaScript:
● the field cvocDemo is connected to the “unesco” controlled vocabulary hosted by Skosmos
● 4 languages are available (en, fr, es, ru)
● js-url points to the JavaScript gateway that reads and transforms the output from the external API endpoint
● every Skosmos concept is cached internally in Dataverse to increase sustainability
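As an illustration only, a configuration entry of this kind might look like the JSON below; the key names and URLs are assumptions on my part, so consult the gdcc/dataverse-external-vocab-support repository linked earlier for the authoritative format:

```json
[
  {
    "field-name": "cvocDemo",
    "cvoc-url": "https://skosmos.example.org/unesco/",
    "js-url": "https://example.org/scripts/skosmos-gateway.js",
    "languages": "en, fr, es, ru"
  }
]
```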
55. Challenge 5: export from Dataverse metadata back to CMDI
Basic requirements:
The Dataverse metadata schema should include CMDI metadata that can be extended with custom components used by CLARIN centers in different countries.
The original relationships between fields and concepts should be kept, and custom components should be added to the SKOS schema.
Users should be able to download metadata in the original CMDI format without losing quality.
This work is in progress...
56. Dataverse2cmdi export tool
Tips:
● developed by KNAW HuC in collaboration with DANS
● intended for the conversion from Dataverse json-ld back to CMDI
● uses Semantic Mappings to restore fields from the Dataverse citation block
57. Questions?
Slava Tykhonov (DANS-KNAW)
Jerry de Vries (DANS-KNAW)
Andrea Scharnhorst (DANS-KNAW)
Eko Indarto (DANS-KNAW)
Femmy Admiraal (DANS-KNAW)
Semantic Gateway: https://github.com/Dans-labs/semantic-gateway
SEMAF client: https://github.com/Dans-labs/semaf-client