The document summarizes the experimental project of registering Digital Object Identifiers (DOIs) for research data at the Japan Link Center (JaLC). The project aims to establish workflows for registering DOIs for research data and test the registration of data DOIs. It involves 9 research projects and 14 organizations registering and integrating DOIs for their data through the JaLC system. The project addresses several issues in registering DOIs for dynamic research data, such as data lifecycles, granularity, persistence, and handling changes over time.
The document discusses the Japan Link Center's (JaLC) experiment in registering DOIs for research data. The experiment aims to establish workflows for registering DOIs for research data using JaLC's system, and involves 9 projects with 14 organizations testing DOI registration. The document outlines several issues in registering DOIs for data, including operations flow, persistent access, granularity, dynamics of data, and quantity of data. It also gives examples of how projects can involve multiple institutions and how data lifecycles differ from those of literature.
This document proposes an approach called SemTyper for assigning semantic labels from a domain ontology to data attributes in a source. SemTyper uses text similarity and statistical tests to holistically label textual and numeric data, respectively. It was evaluated on museum, city, weather, and flight data and showed improved accuracy over prior approaches while training 250x faster. SemTyper can also handle noisy data and works with any user-selected ontology.
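To make the two-pronged labeling idea concrete, here is a minimal illustrative sketch, not SemTyper's actual implementation: textual columns are matched to labeled training columns by bag-of-tokens cosine similarity, and numeric columns by a crude summary-statistic distance (SemTyper itself uses proper statistical hypothesis tests). All function and label names are hypothetical.

```python
from collections import Counter
from math import sqrt
from statistics import mean, stdev

def cosine_similarity(tokens_a, tokens_b):
    """Cosine similarity between two bags of tokens (for textual attributes)."""
    a, b = Counter(tokens_a), Counter(tokens_b)
    dot = sum(a[t] * b[t] for t in a)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def distribution_distance(values_a, values_b):
    """Crude numeric comparison via summary statistics; a stand-in for the
    statistical tests the paper describes."""
    return abs(mean(values_a) - mean(values_b)) + abs(stdev(values_a) - stdev(values_b))

def best_label(column_tokens, labeled_columns):
    """Assign the ontology label whose training column is most similar."""
    return max(labeled_columns,
               key=lambda lbl: cosine_similarity(column_tokens, labeled_columns[lbl]))
```

For example, `best_label(["oil", "canvas"], {"medium": ["oil", "canvas", "bronze"], "city": ["paris", "rome"]})` selects `"medium"`, since the token overlap with the training column for that label is highest.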
Linked Data for Libraries: Benefits of a Conceptual Shift from Library-Specif... (Getaneh Alemu)
This presentation (full text paper: http://conference.ifla.org/sites/default/files/files/papers/wlic2012/92-alemu-en.pdf ) provides recommendations for making a conceptual shift from current document-centric to data-centric metadata. The importance of adjusting current library models such as Resource Description and Access (RDA) and Functional Requirements for Bibliographic Records (FRBR) to models based on Linked Data principles is discussed. In relation to technical formats, the paper suggests the need to leapfrog from Machine Readable Cataloguing (MARC) to Resource Description Framework (RDF), without disrupting current library metadata operations.
A scalable architecture for extracting, aligning, linking, and visualizing mu... (Craig Knoblock)
The document proposes an architecture for extracting, aligning, linking, and visualizing multi-source intelligence data at scale. The architecture uses open source software like Apache Nutch, Karma, ElasticSearch, and Hadoop to extract structured and unstructured data, integrate the data using machine learning, compute similarities, resolve entities, construct a knowledge graph, and allow querying and visualization of the graph. An example scenario of analyzing a country's nuclear capabilities from open sources is provided to illustrate the system.
The document discusses open science and the role of identifiers like DOIs. It describes how research data sharing has become core to open science due to the Internet and digital archives. Researchers now publish their data in addition to papers. Well-managed metadata standards and identifier systems help integrate data across its life cycle from creation to archiving. The DOI system provides persistent links for digital objects and is increasingly used for research data through registration agencies like DataCite.
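The persistence the DOI system offers comes from a level of indirection: a DOI is resolved through the doi.org proxy rather than pointing at a mutable location. A minimal sketch of building such a resolution URL (the regex is a simplified approximation of DOI syntax, not the full standard):

```python
import re

# Simplified check: prefix "10.", a registrant code of 4-9 digits, "/", a suffix.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")

def doi_to_url(doi: str) -> str:
    """Build the persistent resolution URL for a DOI via the doi.org proxy."""
    if not DOI_PATTERN.match(doi):
        raise ValueError(f"not a well-formed DOI: {doi!r}")
    return f"https://doi.org/{doi}"
```

For instance, `doi_to_url("10.5281/zenodo.12345")` yields `"https://doi.org/10.5281/zenodo.12345"`; whoever holds the registration can later repoint that DOI at a new landing page without breaking citations.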
The document discusses major issues in data mining including mining methodology, user interaction, performance, and data types. Specifically, it outlines challenges of mining different types of knowledge, interactive mining at multiple levels of abstraction, incorporating background knowledge, visualization of results, handling noisy data, evaluating pattern interestingness, efficiency and scalability of algorithms, parallel and distributed mining, and handling relational and complex data types from heterogeneous databases.
Current metadata landscape in the library world (Getaneh Alemu)
The document summarizes the current metadata landscape in libraries. It discusses what metadata is, existing metadata challenges like growing collections and changing user expectations. It covers common metadata standards like MARC21, Dublin Core and frameworks like FRBR. The document emphasizes that metadata enables functions like search, discovery and organization. It discusses metadata enrichment through user tagging and linking metadata to controlled vocabularies. The future of metadata is seen as enriched, linked, open and filtered to meet changing needs.
Presented for managers & researchers at The Global One Health Initiative of the Ohio State University, Africa Regional Branch in Addis Ababa, Ethiopia (April 24th 2019)
Presentation by Dr. Getaneh Alemu (Solent University, United Kingdom) at the II Congreso de Información, Comunicación e Investigación (CICI 2018), "Metadatos y Organización de la Información", Facultad de Filosofía y Letras, Universidad Autónoma de Chihuahua, Mexico. The event was organised by the 'Estudios de la Información' academic body and the 'Información, Lenguaje, Comunicación y Desarrollo Sostenible' disciplinary group. 29 October 2018.
The EASTER project aims to test and evaluate a range of current automated subject metadata generation tools. It will do so using the Intute digital collection as a testbed, evaluating the tools' usefulness for both cataloguers and end-users. The project will develop a methodology for evaluating such tools and create an enhanced "gold standard" test collection. Initial candidate tools include Temis Categorizer, KEA, TextGarden, TerMine, and others. The tools will be evaluated on their ability to generate subject metadata for veterinary, visual arts and politics domains. Related projects on information extraction from archaeology reports will also inform EASTER's work.
This document discusses metadata and its importance for digital libraries and humanities. It defines metadata as "data about data" that describes resources to help users find, identify and select them. Metadata plays a crucial role in managing the huge amount of digital information and data available. The document advocates for an approach of enriching metadata by allowing both experts and users to contribute, and filtering it through customizable interfaces to meet diverse user needs.
Metadata enriching and discovery at Solent University Library (Getaneh Alemu)
This document discusses metadata enriching and discovery at Solent University. It begins with introductions and context about how enriched, linked, open and filtered metadata drives resource usage. It then discusses several principles of metadata including sufficiency, necessity, user convenience, representation and standardization. The document outlines how Solent University has enriched its metadata by importing subject headings and authorities. It discusses metadata linking, openness, filtering and usage. Overall it emphasizes the importance of enriching metadata and keeping interfaces simple while maximizing resource discovery and usage.
Sherif Metadata Talk, London, June 25th 2018 (Getaneh Alemu)
This document summarizes the existing challenges and opportunities in the cataloguing and metadata function of Southampton Solent University. It discusses how the university has shifted to primarily electronic resources and moved to enrich metadata through standards like RDA. It also touches on balancing metadata quality with completeness while avoiding duplication through techniques like WEMI and FRBRization. The future of metadata is discussed as being enriched, linked, open and filtered.
Metadata enriching and filtering for enhanced collection discoverability (Getaneh Alemu)
The return on investment for academic libraries is chiefly tied to access, usage and impact. Without accurate, consistent and quality metadata on the one hand, and an easy-to-use and effective discovery service on the other, these valuable resources may remain invisible and inaccessible to users. In this talk, Getaneh presents four overarching metadata principles, namely metadata enriching, linking, openness and filtering, and shows how these ideas help shape the metadata creation and discovery services at Solent University, focusing on the implementation of RDA and FRBR as well as the use of subject headings and authority controls.
The role of metadata for discovery: tips for content providers (Getaneh Alemu)
This presentation was made on 17th February 2022 at the NISO PLUS 2022 Conference. It offers an overview of IFLA’s LRM (FRBR) tasks, namely finding, identifying, selecting, obtaining, and exploring information resources. It points out that metadata is key for content distribution, visibility, discoverability, accessibility, sales and usage.
https://np22.niso.plus/Category/28a52f1d-a477-43e8-a7dc-abd009383a57
Semantic Metadata Interoperability in Digital Libraries (Getaneh Alemu)
This document describes a constructivist grounded theory approach to addressing semantic metadata interoperability issues in digital libraries. It discusses challenges like differing naming conventions, identification practices, and terminology used across systems. Bottom-up, qualitative methods are proposed over top-down standards to account for diverse cultural interpretations. Interviews with librarians, researchers and students revealed that controlled vocabularies often fail to represent local perspectives and that semantic interoperability requires a social constructivist approach.
From the principle of sufficiency and necessity to metadata enriching (Getaneh Alemu)
In contrast to the principle of metadata simplicity and sufficiency, the principle of metadata enriching represents a departure from traditional cataloguing approaches, where the focus was on keeping metadata simple. Metadata created and managed following the principle of enriching responds better to users' needs. Whilst enriching can produce an abundance of metadata, the complementary principle of filtering simplifies its presentation through user-centred design.
This document discusses text mining and provides an outline of the topic. It defines text mining as the analysis of natural language text data and explains why it is useful given the large amount of unstructured data. The document then describes the basic text mining process, which includes steps like filtering, segmentation, stemming, eliminating excessive words, and clustering. Several applications of text mining are mentioned like call centers, anti-spam, and market intelligence. Challenges of text mining like dealing with unstructured data and large collections of documents are also outlined.
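The preprocessing steps listed above (filtering, segmentation, stemming, stopword removal) can be sketched in a few lines. This is an illustrative toy pipeline under simplifying assumptions: an invented stopword list, and a deliberately naive suffix-stripping "stemmer" standing in for a real one such as Porter's.

```python
from collections import Counter
import re

STOPWORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "it"}  # illustrative list

def preprocess(text):
    """Turn one raw document into index terms: segment, filter, stem."""
    tokens = re.findall(r"[a-z]+", text.lower())               # segmentation / filtering
    tokens = [t for t in tokens if t not in STOPWORDS]          # eliminate excessive words
    tokens = [t[:-1] if t.endswith("s") else t for t in tokens] # naive stemming
    return tokens

def term_frequencies(docs):
    """Aggregate term counts across a collection, e.g. as clustering input."""
    counts = Counter()
    for doc in docs:
        counts.update(preprocess(doc))
    return counts
```

Here `preprocess("The cats and the dogs")` reduces to `["cat", "dog"]`; the resulting term-frequency vectors are the usual input to the clustering step the document mentions.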
Slides from a webinar presentation organised by ALCTS, a division of the American Library Association, February 19th 2020. http://www.ala.org/alcts/confevents/upcoming/webinar/021920
The return on investment for academic libraries is chiefly tied to access, usage, and impact. Without accurate, consistent, and quality metadata on the one hand, and an easy-to-use and effective discovery service on the other, these valuable resources may remain invisible and inaccessible to users. In this webinar, four overarching metadata principles, namely metadata enriching, linking, openness, and filtering, are presented. In addition, presenters will examine how these ideas help shape the metadata creation and discovery services at Solent University—focusing on the implementation of RDA and FRBR as well as the use of subject authority headings and authority controls.
This document discusses research data management (RDM). It defines research data and describes the RDM lifecycle. Key aspects of RDM include creating data management plans, documenting and organizing data, and ensuring long-term preservation and sharing of data. The document outlines best practices for RDM, such as using appropriate file formats and metadata standards. It also discusses challenges around sensitive data and guidelines for data sharing and citation. The roles libraries can play in supporting RDM are identified, such as developing RDM policies, training researchers, and setting up data repositories.
Georgios Meditskos and Stamatia Dasiopoulou | Question Answering over Pattern... (semanticsconference)
This document proposes a framework for question answering over pattern-based user models. It combines a frame-based representation of natural language questions with a context-aware, graph-based approach to knowledge extraction. The framework performs semantic analysis of questions, identifies relevant concepts in ontologies, constructs local contexts around those concepts, and ranks contexts to generate answers by matching to the question analysis. An example demonstrates applying the framework to answer the question "How often does Ann like to drink coffee?". The framework aims to support question answering over complex conceptual models and user profiles.
Current metadata landscape in the library world (Getaneh Alemu)
This workshop was presented at MTSR-2017 (Nov. 27, 2017) in Tallinn, Estonia http://www.mtsr-conf.org/index.php/programme The workshop aims to put the current metadata landscape in libraries into context, with particular emphasis on emerging theory/principles and best practices covering:
• The theory of enriching and filtering
• Metadata enriching through RDA (Hands on - The RDA Toolkit and implementation of RDA at Southampton Solent University)
• Metadata filtering through FRBR (practical issues that cataloguers face in FRBRising their catalogue)
• Metadata management (metadata quality, authority control and subject headings)
• Metadata systems, tools and applications (practical issues of e-books and database cataloguing)
The document discusses various approaches to information extraction from web documents, including knowledge engineering, machine learning, wrappers, and different IE systems. It analyzes IE systems based on their capabilities, such as their ability to extract from complex objects, different document types, resilience to changes, and degree of automation. The best system is the BYU ontology approach, which has capabilities such as supporting nested data, being resilient and adaptive, and working on semi-structured and unstructured text.
Semantic Web: Technologies and Applications for Real-World (Amit Sheth)
Amit Sheth and Susie Stephens, "Semantic Web: Technologies and Applications for Real-World," Tutorial at the 2007 World Wide Web Conference, Banff, Canada.
Tutorial discusses technologies and deployed real-world applications through 2007.
Tutorial description at: http://www2007.org/tutorial-T11.php
A Reuse-based Lightweight Method for Developing Linked Data Ontologies and Vo... (María Poveda Villalón)
The document proposes a lightweight methodology called LOT (Linked Open Terms) for developing Linked Data ontologies and vocabularies in a reusable way. The methodology is data-driven and focuses on ontology search, selection, integration, completion and evaluation activities. It provides guidelines for reusing existing terms and linking them according to Linked Data principles while keeping the processes lightweight. The methodology is intended to help domain experts create ontologies and vocabularies for publishing data on the semantic web in an interoperable way without requiring extensive knowledge engineering expertise. Future work involves providing more detailed guidelines, examples, and connecting existing tools to support each step of the methodology.
Amit Sheth with TK Prasad, "Semantic Technologies for Big Science and Astrophysics", Invited Plenary Presentation, at Earthcube Solar-Terrestrial End-User Workshop, NJIT, Newark, NJ, August 13, 2014.
Like many other fields of Big Science, Astrophysics and Solar Physics face the challenges of Big Data: Volume, Variety, Velocity, and Veracity. There is already significant work on handling volume-related challenges, including the use of high-performance computing. In this talk, we will mainly focus on the other challenges, from the perspective of collaborative sharing and reuse of the broad variety of data created by multiple stakeholders, large and small, along with tools that offer semantic variants of search, browsing, integration and discovery capabilities. We will borrow examples of tools and capabilities from state-of-the-art work supporting physicists (including astrophysicists) [1], the life sciences [2], and the material sciences [3], and describe the role of semantics and semantic technologies that make these capabilities possible or easier to realize. This applied, practice-oriented talk will complement more vision-oriented counterparts [4].
[1] Science Web-based Interactive Semantic Environment: http://sciencewise.info/
[2] NCBO Bioportal: http://bioportal.bioontology.org/ , Kno.e.sis’s work on Semantic Web for Healthcare and Life Sciences: http://knoesis.org/amit/hcls
[3] MaterialWays (a Materials Genome Initiative related project): http://wiki.knoesis.org/index.php/MaterialWays
[4] From Big Data to Smart Data: http://wiki.knoesis.org/index.php/Smart_Data
The document presents a framework for analyzing usage of domain ontologies on the semantic web. It proposes metrics to measure ontology usage, including concept richness, concept usage, and relationship and attribute values. The framework was implemented to analyze usage of ontologies in datasets from companies like Google and Yahoo. The analysis provided insights into ontology usage trends and patterns in the knowledge bases. Ontology usage analysis can help ontology engineers understand usage and evolve ontologies, as well as anticipate available knowledge when developing applications.
Semantic Interoperability Issues and Approaches in the IoT.est Project - iotest
P Barnaghi, Semantic Interoperability Issues and Approaches in the IoT.est Project, at the IERC AC4 Semantic interoperability Workshop (during the IoT-week 2012), Venice, Italy, 19 June 2012
This document discusses using ontologies to make biological and biomedical data more interoperable and FAIR (Findable, Accessible, Interoperable, Reusable). It describes several ontology services and tools provided by EMBL-EBI to help with tasks like annotating data, mapping data to ontologies, searching and accessing ontologies, and publishing structured data. It also uses the example of the BioSamples database to illustrate challenges in working with large, heterogeneous datasets and how ontologies can help address issues like normalizing descriptions and attributes to enable better searching and data integration.
SSONDE is a framework for calculating semantic similarity between ontology instances represented as linked data. It provides an asymmetric similarity score that emphasizes containment relationships between instances. SSONDE operates at the application layer and assumes integration steps like ontology alignment have already occurred. It has been applied to compare researchers based on publications and interests, and habitats based on hosted species. The framework supports configurable similarity contexts and caching to optimize performance on large linked datasets.
Research Objects: more than the sum of the parts - Carole Goble
Workshop on Managing Digital Research Objects in an Expanding Science Ecosystem, 15 Nov 2017, Bethesda, USA
https://www.rd-alliance.org/managing-digital-research-objects-expanding-science-ecosystem
Research output is more than just the rhetorical narrative. The experimental methods, computational codes, data, algorithms, workflows, Standard Operating Procedures, samples and so on are the objects of research that enable reuse and reproduction of scientific experiments, and they too need to be examined and exchanged as research knowledge.
A first step is to think of Digital Research Objects as a broadening out to embrace these artefacts or assets of research. The next is to recognise that investigations use multiple, interlinked, evolving artefacts. Multiple datasets and multiple models support a study; each model is associated with datasets for construction, validation and prediction; an analytic pipeline has multiple codes and may be made up of nested sub-pipelines, and so on. Research Objects (http://researchobject.org/) is a framework by which the many, nested and contributed components of research can be packaged together in a systematic way, and their context, provenance and relationships richly described.
The document outlines the plans for a PhD research project on enhancing semantic interoperability among spreadsheets. The research will build upon a previous master's degree which identified construction patterns in spreadsheets and linked labels to ontologies within a single domain. The PhD plans to address limitations of the previous work by considering multiple domains, developing a model to relate elements across spreadsheets, and linking spreadsheet structure to ontologies at the concept level. Key research questions involve defining when spreadsheets share the same purpose, canonical representations among similar spreadsheets, and using representations to predict spreadsheet purpose and domain. The goal is achieving semantic interoperability across spreadsheets.
Talk given by prof. T.K. Prasad at the workshop on Semantics in Geospatial Architectures: Applications and Implementation. The workshop was held from October 28-29, 2013 at Pyle Center (702 Langdon Street, Madison, WI), University of Wisconsin-Madison.
From Open Access to Open Standards, (Linked) Data and Collaborations - Simeon Warner
This document discusses moving from MARC to linked data formats like BIBFRAME. It notes that MARC has limitations, such as using text where data is needed and limited extensibility. Linked data formats use identifiers rather than names, connect to the web using URIs, and can be extended over time by the community. The LD4L project converted millions of MARC records to BIBFRAME at scale and developed a Blacklight search over combined linked data catalogs.
Semantic technologies for the Internet of Things - Payam Barnaghi
The document discusses semantic technologies for the Internet of Things. It describes how sensor data in the IoT is time-dependent, continuous, and variable quality. Semantic annotations and machine-interpretable formats like XML and RDF are needed to make the data interoperable. Ontologies provide formal definitions of concepts and relationships in a domain that enable machines to process IoT data and enable autonomous device interactions. The document outlines approaches to semantically describe sensor observations and measurements using XML, RDF graphs, and adding domain concepts and logical rules with ontologies.
Ontologies provide a shared understanding of a domain by formally defining concepts, properties, and relationships. An ontology introduces vocabulary relevant to a domain and specifies the meaning of terms. Ontologies are machine-readable and enable overcoming differences in terminology across complex, distributed applications. Examples include gene ontologies, pharmaceutical drug ontologies, and customer profile ontologies. Semantic technologies use ontologies to provide semantic search, integration, reasoning, and analysis capabilities.
Keynote presentation delivered at ELAG 2013 in Gent, Belgium, on May 29 2013. Discusses Research Objects and the relationship to work my team has been involved in during the past couple of years: OAI-ORE, Open Annotation, Memento.
Resource Description Framework Approach to Data Publication and Federation - Pistoia Alliance
Bob Stanley, CEO, IO Informatics, explains the utility of RDF as a standard way of defining and redefining data for managing life science information.
A Metadata Application Profile for KOS Vocabulary Registries (KOS-AP) - Marcia Zeng
Report on the outcomes of the DCMI-NKOS Task Group, which builds on the work done by the NKOS community during the last decade. While we discuss the KOS-AP in the context of KOS registries, the context of microdata should be considered equally important in all aspects.
The document discusses the role of ontologies in linked data. It notes that while semantic web ontologies have been widely applied, linked data has grown rapidly using lightweight or no ontologies. However, ontologies could still provide benefits to linked data by helping integrate and reason over heterogeneous linked data sources. Open issues remain around how to best reuse and modularize ontologies for different linked data applications and domains.
A Framework for Ontology Usage Analysis
1. A Framework for Ontology Usage Analysis
Jamshaid Ashraf
jamshaid.ashraf@gmail.com
Supervisor : Dr Omar Hussain
School of Information Systems, Curtin University, Perth, Western Australia
PhD symposium
ESWC 2012, Heraklion, Crete, Greece (27- 31 May 2012)
3. (Structured) Data focused
Ontologies
[2006 – to date] - LINKED DATA
Linked Data: data focused
•Linked Data principles
•Linked Open Data project
•LOD cloud
•RDFa
•RDF data analysis
4. Current state
Ontology
Linked data
… searching less and using more
6. Lack of visibility
- Indexes such as PingTheSemanticWeb do not provide a detailed view of ontology usage
- To make effective and efficient use of Semantic Web data, we need to know which concepts and relationships are being used, and how
- We need insight into the structure, the patterns available, and the actual use versus the intended use
7. Ontology life cycle
Ontology Dev. Lifecycle
•Think
•Design
•Develop & evaluate
•Deploy
•Evangelize
•Adoption!
•Measure and analyze
•Learn from it to influence future thinking and design
9. Benefits of Usage Analysis
(1) Provides a usage-based feedback loop to the ontology maintenance process for pragmatic conceptual-model updates
(2) Assists in building data-rich interfaces, exploratory search and exploratory data analysis
(3) Provides insight into the state of semantic structured data, based on prevalent knowledge patterns, for consuming applications
10. Ontology Usage Analysis Framework (OUSAF)
Identification (selection of ontologies)
- Domain ontology
- Identify candidate ontology(ies) from the dataset
Investigation (analysing the use of an ontology)
- Usage/population/instantiation
- Co-usability/schema-link graph
Representation (representing the usage analysis)
- Conceptual model to represent ontology usage
- Ontology Usage Catalogue
Utilization (making use of the usage analysis)
- Use-case implementation
- Publication of ontology usage analysis
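The four OUSAF phases above can be chained as a pipeline. The following is a toy sketch: all function bodies, IRIs, and data are illustrative assumptions; only the phase structure comes from the slide.

```python
def identify(triples):
    """Identification: candidate ontologies = namespace prefixes of typed objects."""
    return sorted({o.split(":")[0] for _, p, o in triples if p == "rdf:type"})

def investigate(triples, prefix):
    """Investigation: count instantiations of each concept in the ontology."""
    counts = {}
    for _, p, o in triples:
        if p == "rdf:type" and o.startswith(prefix + ":"):
            counts[o] = counts.get(o, 0) + 1
    return counts

def represent(usage):
    """Representation: one (toy) catalogue entry per analysed ontology."""
    return [{"ontology": ont, "usage": u} for ont, u in usage.items()]

def utilize(catalogue):
    """Utilization: e.g. publish the catalogue or feed a use-case implementation."""
    return "\n".join(str(entry) for entry in catalogue)

triples = [
    ("ex:a", "rdf:type", "gr:Offering"),
    ("ex:b", "rdf:type", "gr:Offering"),
    ("ex:c", "rdf:type", "foaf:Agent"),
]
catalogue = represent({ont: investigate(triples, ont) for ont in identify(triples)})
print(utilize(catalogue))
```

In a real deployment each phase would work over a crawled RDF dataset rather than an in-memory list, but the hand-off between phases stays the same.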
11. Metrics for measuring richness
>Concept Richness (CR): Describes a concept's relationships with other concepts and the number of attributes used to describe its instances
>Relationship Value (RV): Reflects the possible role of an object property in creating typed relationships between different concepts
>Attribute Value (AV): Reflects the number of concepts that have data properties used to provide values to instances
12. Metrics for measuring usage
>Concept Usage (CU): Measures the instantiation of the concept in the knowledge base
CU(C) = |{ t = (s, p, o) | p = rdf:type, o = C }|
CUH(C) = |{ t = (s, p, o) | p = rdf:type, o ∈ entail_rdfs9(C) }|
>Relationship Usage (RU): Counts the triples in a dataset in which an object property is used to create relationships between instances of different concepts
RU(P) = |{ t = (s, p, o) | p = P }|
>Attribute Usage (AU): Measures how much data description is available in the knowledge base for a concept's instances
AU(A) = |{ t = (s, p, o) | p = A, o ∈ L }|
(where entail_rdfs9(C) denotes the concepts entailed from C via RDFS rule rdfs9, i.e. its subclasses, and L is the set of literals)
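A minimal sketch of how these counts could be computed over an in-memory list of triples. The IRIs and the naive literal test are illustrative assumptions, not from the deck.

```python
RDF_TYPE = "rdf:type"

def concept_usage(triples, concept):
    """CU(C): number of rdf:type triples whose object is the concept."""
    return sum(1 for s, p, o in triples if p == RDF_TYPE and o == concept)

def relationship_usage(triples, prop):
    """RU(P): number of triples whose predicate is the object property."""
    return sum(1 for s, p, o in triples if p == prop)

def attribute_usage(triples, attr, is_literal=lambda o: o.startswith('"')):
    """AU(A): triples where the predicate is the data property and the
    object is a literal (here naively detected by a leading quote)."""
    return sum(1 for s, p, o in triples if p == attr and is_literal(o))

triples = [
    ("ex:p1", RDF_TYPE, "gr:Offering"),
    ("ex:p2", RDF_TYPE, "gr:Offering"),
    ("ex:p1", "gr:includes", "ex:prod1"),
    ("ex:p1", "gr:name", '"Blue widget"'),
]

print(concept_usage(triples, "gr:Offering"))      # 2
print(relationship_usage(triples, "gr:includes")) # 1
print(attribute_usage(triples, "gr:name"))        # 1
```

The hierarchical variant CUH would additionally count instances of subclasses, which requires the rdfs:subClassOf closure of the schema rather than the instance triples alone.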
13. Structural properties
Represent ontology usage as a bipartite network
- Exploit hidden properties of the ontology-usage network to identify cohesive groups and measure semanticity
- Study structural properties such as centrality, reciprocity, density and reachability
Capture the knowledge patterns
- Schema-level patterns
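As a toy illustration, the bipartite usage network can be held as a list of source-term edges, from which a degree-style centrality and the network density fall out directly. All node names here are invented.

```python
from collections import defaultdict

edges = [  # (data source, ontology term it instantiates)
    ("src:shopA", "gr:Offering"),
    ("src:shopB", "gr:Offering"),
    ("src:shopB", "gr:BusinessEntity"),
    ("src:shopC", "foaf:Agent"),
]

sources = {s for s, _ in edges}
terms = {t for _, t in edges}

# Degree of each term = number of sources using it (a simple centrality).
term_degree = defaultdict(int)
for _, t in edges:
    term_degree[t] += 1

# Bipartite density: observed edges over all possible source-term pairs.
density = len(edges) / (len(sources) * len(terms))

print(sorted(term_degree.items(), key=lambda kv: -kv[1]))
print(round(density, 3))  # 4 / (3 * 3) ≈ 0.444
```

A graph library would be the natural next step for the richer properties the slide lists (reciprocity, reachability, cohesive groups), but the underlying representation is just this edge list.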
15. Initial results – use case
Web schema construction based on ontology usage analysis
Domain: eCommerce
Dataset: 305 data sources (pay-level domains publishing eCommerce data)
Ranking the terms
16. U Ontology
Ontology Usage Ontology (U Ontology)
Goal: capture the details of ontologies and their usage
Use cases:
- Publish the ontology usage details on the web
- Generate prototypical SPARQL queries
Reusing existing ontologies:
- Ontology Metadata Vocabulary (OMV) [1]
- Ontology Application Framework (OAF) [2]
- FOAF, DC
[1] Hartmann, J., Palma, R., Sure, Y., Suárez-Figueroa, M.C., Haase P.: OMV– Ontology Metadata Vocabulary. In: The
Ontology Patterns for the Semantic Web (OPSW) Workshop at ISWC 2005, Galway, Ireland (2005)
[2] http://ontolog.cim3.net/file/work/OntologySummit2011/ApplicationFramework/OWL-Ontology/BenefitsAndTechniques-
WithDocumentation.pdf
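The "generate prototypical SPARQL queries" use case might look like the following sketch, which turns the properties most frequently observed with a concept into a template query. The GoodRelations IRIs are illustrative assumptions.

```python
def prototypical_query(concept, frequent_props):
    """Build a SPARQL query selecting the properties most frequently
    used with `concept`, as observed in the usage analysis."""
    vars_ = ["?" + p.split(":")[-1] for p in frequent_props]
    where = ["?s a {} .".format(concept)]
    where += ["?s {} {} .".format(p, v) for p, v in zip(frequent_props, vars_)]
    return "SELECT ?s {}\nWHERE {{\n  {}\n}}".format(
        " ".join(vars_), "\n  ".join(where))

print(prototypical_query("gr:Offering", ["gr:name", "gr:hasPriceSpecification"]))
```

A real generator driven by the U Ontology would read the concept-property frequencies from the published usage catalogue instead of taking them as arguments.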
17. Conclusion
What ontologies are being used on the web (the Linked Data cloud of structured Semantic Web data), and how?
Ontology Usage Catalogue
Attribution: Michael Uschold; http://richard.cyganiak.de/2007/10/lod; http://www.cs.vu.nl/~frankh/spool/ISWC2011Keynote/
18. Future work
• Build industry-specific datasets to understand ontology usage and data/knowledge patterns
• Automate the population of the U Ontology
• Publish the Ontology Usage Catalogue
• Make recommendations to publishers and vocabulary designers
Exploratory data analysis: an approach to analyzing data sets to summarize their main characteristics in an easy-to-understand form, often with visual graphs, without using a statistical model or having formulated a hypothesis.
What are we trying to achieve in this research? We have seen tremendous growth in Semantic Web data (the web of data). As a result, we now have structured data on the web in the form of RDF, enabling machines to automatically understand and process it. We have now reached the point where the availability of semantic data on the web makes it possible to conduct empirical analysis of the data and of the use of ontologies.