With the adoption of RDF across several domains come growing requirements pertaining to the completeness and quality of RDF datasets. Currently, this problem is most commonly addressed by manually devising ways to enrich an input dataset. The few tools that support this endeavour usually focus on the manual definition of enrichment pipelines. In this paper, we present a supervised learning approach based on a refinement operator for enriching RDF datasets. We show how exemplary descriptions of enriched resources can be used to generate accurate enrichment pipelines. We evaluate our approach against 8 manually defined enrichment pipelines and show that it can learn accurate pipelines even when provided with only a small number of training examples.
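To make the idea of learning a pipeline from an example more concrete, the following is a minimal sketch, assuming a greedy search that repeatedly appends whichever enrichment function most improves the triple-level F-measure against the example target. It is not DEER's refinement operator or its implementation; the names `learn_pipeline`, `add_label` and `drop_types`, the `rdfs:label` enrichment, and the greedy stopping rule are all hypothetical choices made for illustration.

```python
from typing import Callable, FrozenSet, List, Tuple

Triple = Tuple[str, str, str]
Graph = FrozenSet[Triple]
Enricher = Callable[[Graph], Graph]


def f_measure(produced: Graph, target: Graph) -> float:
    """Harmonic mean of triple-level precision and recall."""
    overlap = len(produced & target)
    if overlap == 0:
        return 0.0
    precision = overlap / len(produced)
    recall = overlap / len(target)
    return 2 * precision * recall / (precision + recall)


def run(pipeline: List[Enricher], graph: Graph) -> Graph:
    """Apply each enrichment function in order."""
    for step in pipeline:
        graph = step(graph)
    return graph


def learn_pipeline(source: Graph, target: Graph,
                   candidates: List[Enricher],
                   max_length: int = 5) -> List[Enricher]:
    """Greedy sketch (an assumption, not the paper's algorithm):
    extend the pipeline by one step while the F-measure improves."""
    pipeline: List[Enricher] = []
    best = f_measure(source, target)
    for _ in range(max_length):
        if not candidates:
            break
        scored = [(f_measure(run(pipeline + [c], source), target), i)
                  for i, c in enumerate(candidates)]
        score, index = max(scored)
        if score <= best:
            break  # no candidate improves the current pipeline
        pipeline.append(candidates[index])
        best = score
    return pipeline


# Hypothetical enrichment functions, for illustration only.
def add_label(graph: Graph) -> Graph:
    """Add an rdfs:label for every resource typed as :Drug."""
    labels = {(s, "rdfs:label", s.lstrip(":"))
              for s, p, o in graph if p == "a" and o == ":Drug"}
    return graph | frozenset(labels)


def drop_types(graph: Graph) -> Graph:
    """Remove all type statements (a deliberately unhelpful step)."""
    return frozenset(t for t in graph if t[1] != "a")


source: Graph = frozenset({(":Aspirin", "a", ":Drug")})
target: Graph = source | {(":Aspirin", "rdfs:label", "Aspirin")}
learned = learn_pipeline(source, target, [add_label, drop_types])
```

In this toy setting a single training pair is enough for the search to keep `add_label` and reject `drop_types`, because only the former moves the output closer to the example target.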
Forklift Classes Overview by Intella PartsIntella Parts
Discover the different forklift classes and their specific applications. Learn how to choose the right forklift for your needs to ensure safety, efficiency, and compliance in your operations.
For more technical information, visit our website https://intellaparts.com
6th International Conference on Machine Learning & Applications (CMLA 2024)ClaraZara1
6th International Conference on Machine Learning & Applications (CMLA 2024) will provide an excellent international forum for sharing knowledge and results in theory, methodology and applications of on Machine Learning & Applications.
Industrial Training at Shahjalal Fertilizer Company Limited (SFCL)MdTanvirMahtab2
This presentation is about the working procedure of Shahjalal Fertilizer Company Limited (SFCL). A Govt. owned Company of Bangladesh Chemical Industries Corporation under Ministry of Industries.
DEER - Automating RDF Dataset Transformation and Enrichment
1. Motivation Approach Evaluation Conclusion and Future Work
DEER
Automating RDF Dataset Transformation and Enrichment
Mohamed Ahmed Sherif, Axel-Cyrille Ngonga Ngomo and Jens Lehmann
June 3, 2015
Mohamed Ahmed Sherif, Axel-Cyrille Ngonga Ngomo and Jens Lehmann — DEER 1/26
2. Motivation Approach Evaluation Conclusion and Future Work
Outline
1 Motivation
2 Approach
3 Evaluation
4 Conclusion and Future Work
4. Motivation Approach Evaluation Conclusion and Future Work
Why RDF Transformation & Enrichment?
Dataset: DrugBank
Goal: Gather information about companies related to drugs for a market study
Figure (RDF graph):
:Aspirin a :Drug ; owl:sameAs db:Aspirin .
:Paracetamol a :Drug .
:Ibuprofen a :Drug ; owl:sameAs db:Ibuprofen .
:Quinine a :Drug .
db:Ibuprofen rdfs:comment "Ibuprofen was extracted by the research arm of Boots company during the 1960s ..." .
6. Motivation Approach Evaluation Conclusion and Future Work
RDF Transformation & Enrichment
Need for enriched datasets
Tourism
Question Answering
Enhanced Reality
...
RDF transformation and enrichment
Triples to be added to the original KB and/or
triples to be deleted from the original KB
Figure (RDF graph): :Aspirin, :Paracetamol, :Ibuprofen and :Quinine are each typed (a) as :Drug.
7. Motivation Approach Evaluation Conclusion and Future Work
Manual Knowledge Base Enrichment
Demands the specification of data enrichment pipelines
Pipelines describe how data is to be integrated (usually manually)
Manual customized enrichment pipelines
⊕ Lead to the expected results
⊖ Time consuming
⊖ Cannot be ported easily to other datasets
9. Motivation Approach Evaluation Conclusion and Future Work
Automatic Knowledge Base Enrichment
Enrichment pipeline $M : \mathcal{K} \to \mathcal{K}$ that maps a KB $K$ to an enriched KB $K'$ with $K' = M(K)$.
$M$ is an ordered list of atomic enrichment functions $m \in \mathcal{M}$:
$$M = \begin{cases} \phi & \text{if } K = K', \\ (m_1, \ldots, m_n), \text{ where } m_i \in \mathcal{M},\ 1 \le i \le n & \text{otherwise.} \end{cases}$$
Research questions
1 How to create self-configuring atomic enrichment functions $m \in \mathcal{M}$?
2 How to automatically generate an enrichment pipeline $M$?
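The definition of a pipeline as an ordered list of atomic functions can be illustrated with a minimal Python sketch. It assumes a knowledge base is modelled as a set of (subject, predicate, object) string triples; `apply_pipeline`, `add_label` and `drop_comments` are hypothetical names chosen for this illustration and are not part of DEER.

```python
from typing import Callable, FrozenSet, Sequence, Tuple

Triple = Tuple[str, str, str]
KB = FrozenSet[Triple]
EnrichmentFunction = Callable[[KB], KB]


def apply_pipeline(pipeline: Sequence[EnrichmentFunction], kb: KB) -> KB:
    """Apply the atomic functions in order; an empty pipeline leaves kb unchanged."""
    for m in pipeline:
        kb = m(kb)
    return kb


def add_label(kb: KB) -> KB:
    # Hypothetical atomic function that adds one triple.
    return kb | {(":Ibuprofen", "rdfs:label", "Ibuprofen")}


def drop_comments(kb: KB) -> KB:
    # Hypothetical atomic function that deletes all rdfs:comment triples.
    return frozenset(t for t in kb if t[1] != "rdfs:comment")


source: KB = frozenset({
    (":Ibuprofen", "a", ":Drug"),
    (":Ibuprofen", "rdfs:comment", "..."),
})
enriched = apply_pipeline([add_label, drop_comments], source)
```

The two toy functions mirror the two kinds of change named earlier in the deck: triples added to, and triples deleted from, the original KB.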
12. Motivation Approach Evaluation Conclusion and Future Work
Atomic Enrichment Functions
I. Dereferencing atomic enrichment function
Datasets are linked (e.g., using owl:sameAs)
Dereferences a pre-specified set of predicates
Adds the found predicates to the source dataset
Figure (RDF graph, final build):
:Aspirin a :Drug ; owl:sameAs db:Aspirin .
:Paracetamol a :Drug .
:Ibuprofen a :Drug ; owl:sameAs db:Ibuprofen .
:Quinine a :Drug .
db:Ibuprofen rdfs:comment "Ibuprofen was extracted by the research arm of Boots company during the 1960s ..." .
The rdfs:comment of db:Ibuprofen is copied onto :Ibuprofen.
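The dereferencing step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: triples are plain string tuples, and the linked dataset is an in-memory set (`linked`) standing in for what a real implementation would fetch by dereferencing the linked URIs. The function name `dereference` is hypothetical.

```python
OWL_SAME_AS = "owl:sameAs"


def dereference(source, linked, predicates):
    """Follow owl:sameAs links and copy the requested predicates back.

    For every (s, owl:sameAs, o) in source, each triple (o, p, v) in the
    linked dataset with p in predicates is added to the result as (s, p, v).
    """
    source = set(source)
    added = set()
    for s, p, o in source:
        if p != OWL_SAME_AS:
            continue
        for ls, lp, lo in linked:
            if ls == o and lp in predicates:
                added.add((s, lp, lo))
    return source | added


COMMENT = ("Ibuprofen was extracted by the research arm "
           "of Boots company during the 1960s ...")
source = {
    (":Aspirin", "a", ":Drug"),
    (":Paracetamol", "a", ":Drug"),
    (":Ibuprofen", "a", ":Drug"),
    (":Quinine", "a", ":Drug"),
    (":Aspirin", OWL_SAME_AS, "db:Aspirin"),
    (":Ibuprofen", OWL_SAME_AS, "db:Ibuprofen"),
}
# In-memory stand-in for the remote, linked dataset (an assumption of this sketch).
linked = {("db:Ibuprofen", "rdfs:comment", COMMENT)}

enriched = dereference(source, linked, {":relatedCompany", "rdfs:comment"})
```

Asking for both :relatedCompany and rdfs:comment yields only the rdfs:comment triple, because the linked resource carries no :relatedCompany statement.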
16. Motivation Approach Evaluation Conclusion and Future Work
Self-Configuration
I. Dereferencing Enrichment Functions
Finds the set of predicates Dp from the enriched CBDs (concise bounded descriptions) that are missing from the source CBDs
Non-enriched CBD of Ibuprofen
:Ibuprofen a :Drug ; owl:sameAs db:Ibuprofen .
Enriched CBD of Ibuprofen
:Ibuprofen a :Drug ; owl:sameAs db:Ibuprofen ; :relatedCompany :BootsCompany ; rdfs:comment "Ibuprofen was extracted by the research arm of Boots company during the 1960s ..." .
Dp = {:relatedCompany, rdfs:comment}
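Computing Dp amounts to a set difference over predicates. The following Python sketch assumes each CBD is given as a set of string triples and that source and enriched CBDs are paired by position; `missing_predicates` is a hypothetical helper name.

```python
def predicates_of(cbd):
    """Return the set of predicates used in a CBD."""
    return {p for _, p, _ in cbd}


def missing_predicates(source_cbds, enriched_cbds):
    """Collect predicates that occur in an enriched CBD but not in its source CBD."""
    dp = set()
    for source_cbd, enriched_cbd in zip(source_cbds, enriched_cbds):
        dp |= predicates_of(enriched_cbd) - predicates_of(source_cbd)
    return dp


COMMENT = ("Ibuprofen was extracted by the research arm "
           "of Boots company during the 1960s ...")
non_enriched = {
    (":Ibuprofen", "a", ":Drug"),
    (":Ibuprofen", "owl:sameAs", "db:Ibuprofen"),
}
enriched = non_enriched | {
    (":Ibuprofen", "rdfs:comment", COMMENT),
    (":Ibuprofen", ":relatedCompany", ":BootsCompany"),
}

dp = missing_predicates([non_enriched], [enriched])
```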
20. Motivation Approach Evaluation Conclusion and Future Work
Self-Configuration
I. Dereferencing Enrichment Functions
Dereferences Dp = {:relatedCompany, rdfs:comment}
CBD of Ibuprofen
:Aspirin a :Drug ; owl:sameAs db:Aspirin .
:Paracetamol a :Drug .
:Ibuprofen a :Drug ; owl:sameAs db:Ibuprofen .
:Quinine a :Drug .
db:Ibuprofen rdfs:comment "Ibuprofen was extracted by the research arm of Boots company during the 1960s ..." .
Finds only rdfs:comment, adds it to the source dataset
Dereferencing enriched CBD of Ibuprofen
:Ibuprofen a :Drug ; owl:sameAs db:Ibuprofen ; rdfs:comment "Ibuprofen was extracted by the research arm of Boots company during the 1960s ..." .
24. Motivation Approach Evaluation Conclusion and Future Work
Atomic Enrichment Functions
II. NLP atomic enrichment function
Datatype objects contain unstructured information
Uses Named Entity Recognition to extract implicit data
Adds extracted entities to the source dataset
Figure (RDF graph, final build):
:Ibuprofen a :Drug ; owl:sameAs db:Ibuprofen ; rdfs:comment "Ibuprofen was extracted by the research arm of Boots company during the 1960s ..." ; fox:relatedTo :BootsCompany .
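The effect of the NLP function can be imitated with a toy example. The sketch below does not call any real named entity recognition tool: a small hand-written gazetteer (`GAZETTEER`, hypothetical) stands in for the recognizer, and the link predicate fox:relatedTo is taken from the figure above.

```python
# Hypothetical stand-in for a named entity recognizer: surface form -> resource.
GAZETTEER = {"Boots company": ":BootsCompany"}


def nlp_enrich(kb, text_predicate="rdfs:comment", link_predicate="fox:relatedTo"):
    """Scan literal objects of text_predicate and link the subject to each entity found."""
    kb = set(kb)
    added = set()
    for s, p, o in kb:
        if p != text_predicate:
            continue
        for surface_form, entity in GAZETTEER.items():
            if surface_form.lower() in o.lower():
                added.add((s, link_predicate, entity))
    return kb | added


COMMENT = ("Ibuprofen was extracted by the research arm "
           "of Boots company during the 1960s ...")
cbd = {
    (":Ibuprofen", "a", ":Drug"),
    (":Ibuprofen", "owl:sameAs", "db:Ibuprofen"),
    (":Ibuprofen", "rdfs:comment", COMMENT),
}
nlp_enriched = nlp_enrich(cbd)
```

A real recognizer would also return entity types and disambiguated URIs; the substring lookup here only shows where the extracted triples end up.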
28. Motivation Approach Evaluation Conclusion and Future Work
Self-Configuration
II. NLP Enrichment Function
Extracts all possible named entity types
Adds extracted entities to the source dataset
NLP enriched CBD of Ibuprofen
:Ibuprofen a :Drug ; owl:sameAs db:Ibuprofen ; rdfs:comment "Ibuprofen was extracted by the research arm of Boots company during the 1960s ..." ; fox:relatedTo :BootsCompany .
31. Motivation Approach Evaluation Conclusion and Future Work
Atomic Enrichment Functions
III. Predicate conformation atomic enrichment function
Enriched datasets may contain diverse ontologies
Predicate conformation maps a set of pre-specified predicates to a target ontology
Figure (RDF graph, final build):
:Ibuprofen a :Drug ; owl:sameAs db:Ibuprofen ; rdfs:comment "Ibuprofen was extracted by the research arm of Boots company during the 1960s ..." ; :relatedCompany :BootsCompany .
The predicate fox:relatedTo has been replaced by :relatedCompany.
34. Motivation Approach Evaluation Conclusion and Future Work
Self-Configuration
III. Predicate conformation Enrichment Function
Finds lists of predicates Ps and Pt from the source and target datasets, respectively, that share the same subject and object
Replaces each Ps with its respective Pt
NLP enriched CBD of Ibuprofen
:Ibuprofen a :Drug ; owl:sameAs db:Ibuprofen ; rdfs:comment "Ibuprofen was extracted by the research arm of Boots company during the 1960s ..." ; fox:relatedTo :BootsCompany .
Enriched CBD of Ibuprofen (positive example target)
:Ibuprofen a :Drug ; owl:sameAs db:Ibuprofen ; rdfs:comment "Ibuprofen was extracted by the research arm of Boots company during the 1960s ..." ; :relatedCompany :BootsCompany .
After conformation, fox:relatedTo in the NLP enriched CBD is replaced by :relatedCompany.
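Both halves of predicate conformation, learning the mapping from an example and applying it, can be sketched in Python. The sketch assumes CBDs are sets of string triples and pairs a source predicate with a target predicate whenever the two triples share subject and object but exist only on one side each. The names `learn_predicate_mapping` and `conform` are hypothetical.

```python
def learn_predicate_mapping(source_cbd, target_cbd):
    """Map each source-only predicate to the target-only predicate that
    connects the same subject and object."""
    source_only = set(source_cbd) - set(target_cbd)
    target_only = set(target_cbd) - set(source_cbd)
    mapping = {}
    for s, ps, o in source_only:
        for ts, pt, to in target_only:
            if (ts, to) == (s, o):
                mapping[ps] = pt
    return mapping


def conform(kb, mapping):
    """Rewrite every predicate that has an entry in mapping; keep the rest."""
    return {(s, mapping.get(p, p), o) for s, p, o in kb}


COMMENT = ("Ibuprofen was extracted by the research arm "
           "of Boots company during the 1960s ...")
shared = {
    (":Ibuprofen", "a", ":Drug"),
    (":Ibuprofen", "owl:sameAs", "db:Ibuprofen"),
    (":Ibuprofen", "rdfs:comment", COMMENT),
}
nlp_enriched = shared | {(":Ibuprofen", "fox:relatedTo", ":BootsCompany")}
target = shared | {(":Ibuprofen", ":relatedCompany", ":BootsCompany")}

mapping = learn_predicate_mapping(nlp_enriched, target)
conformed = conform(nlp_enriched, mapping)
```

If several target predicates connect the same subject and object, this sketch keeps whichever it sees last; resolving such ties is left open here.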
37. Motivation Approach Evaluation Conclusion and Future Work
KB Enrichment Refinement Operator
Input
Set of atomic enrichment functions $\mathcal{M}$
Set of positive examples $E$
Refinement Operator
$$\rho(M) = \bigcup_{\forall m \in \mathcal{M}} M \mathbin{+\!\!+} m \qquad (\mathbin{+\!\!+} \text{ is the list append operator})$$
Output
Enrichment pipeline $M$
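Read literally, the operator returns one refinement per atomic function, each obtained by appending that function to the current pipeline. A minimal Python sketch, assuming pipelines are tuples and atomic functions are represented by their names (`refine` is a hypothetical name):

```python
def refine(pipeline, atomic_functions):
    """Return all one-step refinements: the pipeline with each atomic function appended."""
    return [pipeline + (m,) for m in atomic_functions]


atomic_functions = ["m1", "m2", "m3"]
children_of_root = refine((), atomic_functions)
children_of_m1 = refine(("m1",), atomic_functions)
```

This follows the formula as written and therefore also produces ("m1", "m1"); the refinement tree drawn on the Learning Algorithm slide shows no such repeated function, so a fuller implementation may filter those out.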
40. Motivation Approach Evaluation Conclusion and Future Work
Positive Example
Non-enriched CBD of Ibuprofen
:Ibuprofen a :Drug ; owl:sameAs db:Ibuprofen .
Enriched CBD of Ibuprofen
:Ibuprofen a :Drug ; owl:sameAs db:Ibuprofen ; rdfs:comment "Ibuprofen was extracted by the research arm of Boots company during the 1960s ..." ; :relatedCompany :BootsCompany .
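A positive example makes it possible to score a candidate pipeline by comparing its output with the enriched CBD. The slides do not spell out how this comparison is made; the sketch below assumes a triple-level precision, recall and F-measure, which is one plausible reading and not necessarily the authors' definition. `f_measure` is a hypothetical helper.

```python
def f_measure(produced, expected):
    """Triple-level F1 between a produced CBD and the expected enriched CBD."""
    produced, expected = set(produced), set(expected)
    true_positives = len(produced & expected)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(produced)
    recall = true_positives / len(expected)
    return 2 * precision * recall / (precision + recall)


COMMENT = ("Ibuprofen was extracted by the research arm "
           "of Boots company during the 1960s ...")
non_enriched = {
    (":Ibuprofen", "a", ":Drug"),
    (":Ibuprofen", "owl:sameAs", "db:Ibuprofen"),
}
enriched = non_enriched | {
    (":Ibuprofen", "rdfs:comment", COMMENT),
    (":Ibuprofen", ":relatedCompany", ":BootsCompany"),
}

score_of_empty_pipeline = f_measure(non_enriched, enriched)
score_of_perfect_pipeline = f_measure(enriched, enriched)
```

Under this assumption the empty pipeline already scores above zero, because the non-enriched triples are part of the target; only the two missing triples lower its recall.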
41. Motivation Approach Evaluation Conclusion and Future Work
Learning Algorithm
1 Start with an empty enrichment pipeline $M = \bot$
2 Self-configure all $m_i \in \mathcal{M}$ and add each as a child of $\bot$
3 Select the most promising node
4 Expand the most promising node
Refinement tree (final build, one level per line):
⊥
(m1) (m2) (m3)
(m1, m2) (m1, m3) (m3, m1) (m3, m2)
(m3, m2, m1)
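The four steps can be sketched as a best-first search over pipelines. This is an illustration only: pipelines are tuples of function names, `f_measure` and `fitness` are caller-supplied scoring functions, self-configuration is omitted, and children never repeat a function already in their parent (an assumption taken from the tree drawn above, not from the refinement operator's formula). The iteration limit and the stop at an F-measure of 1 follow the termination criteria listed under Experimental Setup.

```python
def learn_pipeline(atomic_functions, f_measure, fitness, max_iterations=10):
    """Best-first search: repeatedly expand the leaf with the highest fitness."""
    # Steps 1 and 2: the root is the empty pipeline; its children hold one function each.
    leaves = [(m,) for m in atomic_functions]
    best = ()
    for _ in range(max_iterations):
        if not leaves:
            break
        node = max(leaves, key=fitness)          # Step 3: most promising leaf
        if f_measure(node) > f_measure(best):
            best = node
        if f_measure(best) >= 1.0:               # optimal pipeline found
            break
        leaves.remove(node)                      # Step 4: expand the node
        leaves.extend(node + (m,) for m in atomic_functions if m not in node)
    return best


# Toy scoring for demonstration: the share of the target's prefix that is matched.
TARGET = ("m3", "m2", "m1")


def prefix_score(pipeline):
    matched = 0
    for got, wanted in zip(pipeline, TARGET):
        if got != wanted:
            break
        matched += 1
    return matched / len(TARGET)


learned = learn_pipeline(["m1", "m2", "m3"], prefix_score, prefix_score)
```

Using the same function for both scores corresponds to a purely greedy search; passing a fitness that penalizes complexity changes which leaf is expanded first.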
49. Motivation Approach Evaluation Conclusion and Future Work
Most Promising Node Selection
Node complexity c(n)
Linear combination of the node's children count and level
Node fitness f(n)
Difference between the F-measure of the node's enrichment pipeline and its weighted complexity, f(n) = F(n) − ω · c(n)
ω controls the tradeoff between
Greedy search (ω = 0)
Search strategies closer to breadth-first search (ω > 0)
Most promising node
The leaf node with the maximum fitness in the whole refinement tree
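The fitness formula is easy to try out numerically. In the sketch below the two coefficients of the linear combination (`child_weight`, `level_weight`) are hypothetical values, since the slides say only that c(n) is a linear combination of children count and level; the `Node` class is likewise an illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    f_measure: float                  # F(n) of the node's pipeline
    level: int                        # depth in the refinement tree
    children: List["Node"] = field(default_factory=list)


def complexity(node, child_weight=0.5, level_weight=0.5):
    """c(n): linear combination of children count and level (weights assumed)."""
    return child_weight * len(node.children) + level_weight * node.level


def fitness(node, omega):
    """f(n) = F(n) - omega * c(n)."""
    return node.f_measure - omega * complexity(node)


shallow = Node(f_measure=0.8, level=1)
deep = Node(f_measure=0.9, level=4)

greedy_choice = max([shallow, deep], key=lambda n: fitness(n, omega=0.0))
penalized_choice = max([shallow, deep], key=lambda n: fitness(n, omega=0.75))
```

With ω = 0 the deeper node wins on F-measure alone; with ω = 0.75 its level penalty outweighs the small gain, so the shallower node is selected, which is the pull toward breadth-first behaviour described above.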
53. Motivation Approach Evaluation Conclusion and Future Work
Experimental Setup
Datasets
1 manual experimental enrichment pipeline for Jamendo
2 manual experimental enrichment pipelines for DrugBank
5 manual experimental enrichment pipelines for DBpedia (AdministrativeRegion)
Learning Algorithm
6 atomic enrichment functions
Termination criteria:
Maximum of 10 iterations reached
Optimal enrichment pipeline found (F-score = 1)
55. Motivation Approach Evaluation Conclusion and Future Work
Configuration of the Search Strategy
Node fitness
f(n) = F(n) − ω · c(n)
ω controls the tradeoff between
Greedy search (ω = 0)
Search strategies closer to breadth-first search (ω > 0)
Result: ω = 0.75 leads to the best results
(P = precision, R = recall, F = F-measure)
ω     P    R     F
0     1.0  0.99  0.99
0.25  1.0  0.99  0.99
0.50  1.0  0.99  0.99
0.75  1.0  1.0   1.0
1.0   1.0  0.99  0.99
58. Motivation Approach Evaluation Conclusion and Future Work
Conclusion and Future Work
Conclusion
Presented self-configuring atomic enrichment functions
Presented an approach for learning enrichment pipelines based on a refinement operator
Showed that our approach can easily reconstruct manually created enrichment pipelines
Future Work
Parallelize the algorithm across several CPUs and add load balancing
Support directed acyclic graphs as enrichment specifications by allowing datasets to be split and merged
Pro-active enrichment strategies and active learning
60. Motivation Approach Evaluation Conclusion and Future Work
Thank You!
Questions?
Mohamed Sherif
Augustusplatz 10
D-04109 Leipzig
sherif@informatik.uni-leipzig.de
http://aksw.org/MohamedSherif
http://aksw.org/Projects/DEER
#akswgroup