Tutorial presented at 2012 ACM SIGHIT International Health Informatics Symposium (IHI 2012), January 28-30, 2012. http://sites.google.com/site/web2011ihi/participants/tutorials
This tutorial weaves together three themes and the associated topics:
[1] The role of biomedical ontologies
[2] Key Semantic Web technologies with focus on Semantic provenance and integration
[3] In-practice tools and real world use cases built to serve the needs of sleep medicine researchers, cardiologists involved in clinical practice, and work on vaccine development for human pathogens.
The Biggest Barriers to Healthcare InteroperabilityHealth Catalyst
Improving healthcare interoperability is a top priority for health systems today. Fundamental problems around improving interoperability include standardization of terminology and normalization of data to those standards. And, the volume of data healthcare IT systems produce exacerbates these problems.
While interoperability regulations focus on trying to make it easy to find and exchange patient data across multiple organizations and HIEs, the legislation’s lack of fine print and aggressive implementation timelines nearly ensures the proliferation of existing interoperability problems. This article discusses the biggest barriers to interoperability, possible solutions to interoperability problems, and why it matters.
HL7
Health level 7
What is HL7?
What does it stand for
HL7 Mission
HL7 contains message standards
HL7 in HealthcareManagement System
Standards
Limitations of HL7
The Biggest Barriers to Healthcare InteroperabilityHealth Catalyst
Improving healthcare interoperability is a top priority for health systems today. Fundamental problems around improving interoperability include standardization of terminology and normalization of data to those standards. And, the volume of data healthcare IT systems produce exacerbates these problems.
While interoperability regulations focus on trying to make it easy to find and exchange patient data across multiple organizations and HIEs, the legislation’s lack of fine print and aggressive implementation timelines nearly ensures the proliferation of existing interoperability problems. This article discusses the biggest barriers to interoperability, possible solutions to interoperability problems, and why it matters.
HL7
Health level 7
What is HL7?
What does it stand for
HL7 Mission
HL7 contains message standards
HL7 in HealthcareManagement System
Standards
Limitations of HL7
The healthcare ecosystem is witnessing a huge transformation lately; propelled by improved care and patient outcomes as the critical drivers. Briefly put, organizations (providers, hospitals and all) are leveraging the potential of Internet of Things, to empower their people, patients to take control of their own health. In a subtle way, redefining the way people, sensors, apps, devices and wearables can interact with each other in a secure environment, and take the healthcare experience to the next level.
A recent survey by Forrester Consulting suggests 90% of the Healthcare IT departments are ready to adapt IoT based solutions. And, 52% of the surveyed respondents are already incorporating IoT technology.
With IoT as a powerful enabler, innovative apps and wearables are taking strong roots in the healthcare ecosystem; health bands, fitness devices, calorie meters, heart rate monitors, to name a few. Such healthcare devices are used by physicians to record patient’s biometric information as they deliver exceptional patient monitoring and management results on-the-go.
Pushing the edge of the contemporary cognitive (CHC) theory: New directions ...Kevin McGrew
This is the current version of my previous "Beyond CHC theory" module. It presents my current thinking [based on extensive exploratory and confirmatory analysis of multiple data sets (esp. the WJ III norm data and WJ III joint cross-battery data sets) plus the integration of contemporary cognitive, neurocognitive, intelligence and neuropsychological research] re: potential future evolutions of the Cattell-Horn-Carroll (CHC) model of human cognitive abilities. This current presentation was last presented at the CNN (neuropsych) conference the first week of October, 2010, in Fremantle Australia
Big Data and Artificial Intelligence in Critical Care
Anesthesia and Intensive Care
San Raffaele Hospital, Milan, Italy
Vita-Salute San Raffaele University, Milan, Italy
Follow us on Twitter, Facebook and Instagram @SRAnesthesiaICU
The Health Catalyst Data Operating System (DOS™): Lessons Learned and Plans ...Health Catalyst
Just over three years ago, Health Catalyst publicly announced the development of the Data Operating System (DOSTM). Conceptually, DOS goes back more than 20 years as a single platform that could support what Dale Sanders calls the “Three Missions of Data”—analytics, data-first application development, and interoperability.
“Data platforms are the next evolution of the technology stack,” Sanders says. While the Cloud made infrastructure an easy and scalable platform, modern operating systems and programming languages made software platforms scalable and easy to build. He cautions, however, “Data wrangling, especially in healthcare, is still a giant challenge.” Sanders explains that DOS is therefore an essential strategy for Health Catalyst, as well as an important new concept in the world of platforms.
“DOS and its concept is a data platform that makes analytics, app development, and interoperability easy and scalable,” Sanders says.
In this webinar, Sanders and Bryan Hinton will review the concept of a data operating system and the vision behind it. Hinton, who leads the DOS team for Health Catalyst, will reflect on lessons learned over the past three years and what he has planned for the future.
How are we Faring with FAIR? (and what FAIR is not)Carole Goble
Keynote presented at the workshop FAIRe Data Infrastructures, 15 October 2020
https://www.gmds.de/aktivitaeten/medizinische-informatik/projektgruppenseiten/faire-dateninfrastrukturen-fuer-die-biomedizinische-informatik/workshop-2020/
Remarkably it was only in 2016 that the ‘FAIR Guiding Principles for scientific data management and stewardship’ appeared in Scientific Data. The paper was intended to launch a dialogue within the research and policy communities: to start a journey to wider accessibility and reusability of data and prepare for automation-readiness by supporting findability, accessibility, interoperability and reusability for machines. Many of the authors (including myself) came from biomedical and associated communities. The paper succeeded in its aim, at least at the policy, enterprise and professional data infrastructure level. Whether FAIR has impacted the researcher at the bench or bedside is open to doubt. It certainly inspired a great deal of activity, many projects, a lot of positioning of interests and raised awareness. COVID has injected impetus and urgency to the FAIR cause (good) and also highlighted its politicisation (not so good).
In this talk I’ll make some personal reflections on how we are faring with FAIR: as one of the original principles authors; as a participant in many current FAIR initiatives (particularly in the biomedical sector and for research objects other than data) and as a veteran of FAIR before we had the principles.
Location-based Services (LBSs) are IT services for providing information that has been created, compiled, selected, or filtered taking into consideration the current locations of the users or those of other persons or mobile objects.
Global Internet of Medical Things (IoMT) Market is estimated to grow from USD 56.23 Billion in 2019 to reach USD 415.35 Billion by 2027, at a CAGR of 28.5 % during the forecast period from 2020-2027.
Ontology has its roots as a field of philosophical study that is focused on the nature of existence. However, today's ontology (aka knowledge graph) can incorporate computable descriptions that can bring insight in a wide set of compelling applications including more precise knowledge capture, semantic data integration, sophisticated query answering, and powerful association mining - thereby delivering key value for health care and the life sciences. In this webinar, I will introduce the idea of computable ontologies and describe how they can be used with automated reasoners to perform classification, to reveal inconsistencies, and to precisely answer questions. Participants will learn about the tools of the trade to design, find, and reuse ontologies. Finally, I will discuss applications of ontologies in the fields of diagnosis and drug discovery.
Bio:
Dr. Michel Dumontier is an Associate Professor of Medicine (Biomedical Informatics) at Stanford University. His research focuses on the development of methods to integrate, mine, and make sense of large, complex, and heterogeneous biological and biomedical data. His current research interests include (1) using genetic, proteomic, and phenotypic data to find new uses for existing drugs, (2) elucidating the mechanism of single and multi-drug side effects, and (3) finding and optimizing combination drug therapies. Dr. Dumontier is the Stanford University Advisory Committee Representative for the World Wide Web Consortium, the co-Chair for the W3C Semantic Web for Health Care and the Life Sciences Interest Group, scientific advisor for the EBI-EMBL Chemistry Services Division, and the Scientific Director for Bio2RDF, an open source project to create Linked Data for the Life Sciences. He is also the founder and Editor-in-Chief for a Data Science, a new IOS Press journal featuring open access, open review, and semantic publishing.
ERP software for Doctor. Complete management software for hospital.Jyotindra Zaveri
View this slideshow to learn more about this affordable ERP software. Includes complete accounting module. Case paper management. Multi user, multiple location system. ERP is a fully comprehensive software for Doctors, Hospitals, and Clinics including prescription writing and storing, Electronic case papers records and medical practice management. ERP for Dr., focuses on helping the doctor and automating tasks. It is very simple to use, customizable to your workflow and suitable for both Specialists and General Practitioners.
Meenakshi Nagarajan,Amit Sheth,Selvam Velmurugan, "Citizen Sensor Data Mining, Social Media Analytics and Development Centric Web Applications," Tutorial at WWW2011, Hyderabad, India, March 28, 2011.
More info at:
http://knoesis.org/library/resource.php?id=1030
http://www2011india.com/tutorialstr27.html
The healthcare ecosystem is witnessing a huge transformation lately; propelled by improved care and patient outcomes as the critical drivers. Briefly put, organizations (providers, hospitals and all) are leveraging the potential of Internet of Things, to empower their people, patients to take control of their own health. In a subtle way, redefining the way people, sensors, apps, devices and wearables can interact with each other in a secure environment, and take the healthcare experience to the next level.
A recent survey by Forrester Consulting suggests 90% of the Healthcare IT departments are ready to adapt IoT based solutions. And, 52% of the surveyed respondents are already incorporating IoT technology.
With IoT as a powerful enabler, innovative apps and wearables are taking strong roots in the healthcare ecosystem; health bands, fitness devices, calorie meters, heart rate monitors, to name a few. Such healthcare devices are used by physicians to record patient’s biometric information as they deliver exceptional patient monitoring and management results on-the-go.
Pushing the edge of the contemporary cognitive (CHC) theory: New directions ...Kevin McGrew
This is the current version of my previous "Beyond CHC theory" module. It presents my current thinking [based on extensive exploratory and confirmatory analysis of multiple data sets (esp. the WJ III norm data and WJ III joint cross-battery data sets) plus the integration of contemporary cognitive, neurocognitive, intelligence and neuropsychological research] re: potential future evolutions of the Cattell-Horn-Carroll (CHC) model of human cognitive abilities. This current presentation was last presented at the CNN (neuropsych) conference the first week of October, 2010, in Fremantle Australia
Big Data and Artificial Intelligence in Critical Care
Anesthesia and Intensive Care
San Raffaele Hospital, Milan, Italy
Vita-Salute San Raffaele University, Milan, Italy
Follow us on Twitter, Facebook and Instagram @SRAnesthesiaICU
The Health Catalyst Data Operating System (DOS™): Lessons Learned and Plans ...Health Catalyst
Just over three years ago, Health Catalyst publicly announced the development of the Data Operating System (DOSTM). Conceptually, DOS goes back more than 20 years as a single platform that could support what Dale Sanders calls the “Three Missions of Data”—analytics, data-first application development, and interoperability.
“Data platforms are the next evolution of the technology stack,” Sanders says. While the Cloud made infrastructure an easy and scalable platform, modern operating systems and programming languages made software platforms scalable and easy to build. He cautions, however, “Data wrangling, especially in healthcare, is still a giant challenge.” Sanders explains that DOS is therefore an essential strategy for Health Catalyst, as well as an important new concept in the world of platforms.
“DOS and its concept is a data platform that makes analytics, app development, and interoperability easy and scalable,” Sanders says.
In this webinar, Sanders and Bryan Hinton will review the concept of a data operating system and the vision behind it. Hinton, who leads the DOS team for Health Catalyst, will reflect on lessons learned over the past three years and what he has planned for the future.
How are we Faring with FAIR? (and what FAIR is not)Carole Goble
Keynote presented at the workshop FAIRe Data Infrastructures, 15 October 2020
https://www.gmds.de/aktivitaeten/medizinische-informatik/projektgruppenseiten/faire-dateninfrastrukturen-fuer-die-biomedizinische-informatik/workshop-2020/
Remarkably it was only in 2016 that the ‘FAIR Guiding Principles for scientific data management and stewardship’ appeared in Scientific Data. The paper was intended to launch a dialogue within the research and policy communities: to start a journey to wider accessibility and reusability of data and prepare for automation-readiness by supporting findability, accessibility, interoperability and reusability for machines. Many of the authors (including myself) came from biomedical and associated communities. The paper succeeded in its aim, at least at the policy, enterprise and professional data infrastructure level. Whether FAIR has impacted the researcher at the bench or bedside is open to doubt. It certainly inspired a great deal of activity, many projects, a lot of positioning of interests and raised awareness. COVID has injected impetus and urgency to the FAIR cause (good) and also highlighted its politicisation (not so good).
In this talk I’ll make some personal reflections on how we are faring with FAIR: as one of the original principles authors; as a participant in many current FAIR initiatives (particularly in the biomedical sector and for research objects other than data) and as a veteran of FAIR before we had the principles.
Location-based Services (LBSs) are IT services for providing information that has been created, compiled, selected, or filtered taking into consideration the current locations of the users or those of other persons or mobile objects.
Global Internet of Medical Things (IoMT) Market is estimated to grow from USD 56.23 Billion in 2019 to reach USD 415.35 Billion by 2027, at a CAGR of 28.5 % during the forecast period from 2020-2027.
Ontology has its roots as a field of philosophical study that is focused on the nature of existence. However, today's ontology (aka knowledge graph) can incorporate computable descriptions that can bring insight in a wide set of compelling applications including more precise knowledge capture, semantic data integration, sophisticated query answering, and powerful association mining - thereby delivering key value for health care and the life sciences. In this webinar, I will introduce the idea of computable ontologies and describe how they can be used with automated reasoners to perform classification, to reveal inconsistencies, and to precisely answer questions. Participants will learn about the tools of the trade to design, find, and reuse ontologies. Finally, I will discuss applications of ontologies in the fields of diagnosis and drug discovery.
Bio:
Dr. Michel Dumontier is an Associate Professor of Medicine (Biomedical Informatics) at Stanford University. His research focuses on the development of methods to integrate, mine, and make sense of large, complex, and heterogeneous biological and biomedical data. His current research interests include (1) using genetic, proteomic, and phenotypic data to find new uses for existing drugs, (2) elucidating the mechanism of single and multi-drug side effects, and (3) finding and optimizing combination drug therapies. Dr. Dumontier is the Stanford University Advisory Committee Representative for the World Wide Web Consortium, the co-Chair for the W3C Semantic Web for Health Care and the Life Sciences Interest Group, scientific advisor for the EBI-EMBL Chemistry Services Division, and the Scientific Director for Bio2RDF, an open source project to create Linked Data for the Life Sciences. He is also the founder and Editor-in-Chief for a Data Science, a new IOS Press journal featuring open access, open review, and semantic publishing.
ERP software for Doctor. Complete management software for hospital.Jyotindra Zaveri
View this slideshow to learn more about this affordable ERP software. Includes complete accounting module. Case paper management. Multi user, multiple location system. ERP is a fully comprehensive software for Doctors, Hospitals, and Clinics including prescription writing and storing, Electronic case papers records and medical practice management. ERP for Dr., focuses on helping the doctor and automating tasks. It is very simple to use, customizable to your workflow and suitable for both Specialists and General Practitioners.
Meenakshi Nagarajan,Amit Sheth,Selvam Velmurugan, "Citizen Sensor Data Mining, Social Media Analytics and Development Centric Web Applications," Tutorial at WWW2011, Hyderabad, India, March 28, 2011.
More info at:
http://knoesis.org/library/resource.php?id=1030
http://www2011india.com/tutorialstr27.html
The talk titled "Federated Architecture with Provenance and Access Control to realize Open Digital Data for MGI" given by prof. Amit Sheth at the ICMSE-MGI Digital Data Workshop held at Kno.e.sis Center from November 13-14 2013. The talk emphasized important issues that material scientists encounter in publishing data - Provenance and Access Control.
workshop page: http://wiki.knoesis.org/index.php/ICMSE-MGI_Digital_Data_Workshop
The talk titled "Realizing Semantic Web - Light Weight semantics and beyond" given by prof. T.K. Prasad at the ICMSE-MGI Digital Data Workshop held at Kno.e.sis Center from November 13-14 2013. The talk emphasized on annotation and search framework.
workshop page: http://wiki.knoesis.org/index.php/ICMSE-MGI_Digital_Data_Workshop
Cory Henson and Amit Sheth, Active Perception Over Machine and Citizen Sensing, SemTech 2011, June 2011.
http://semtech2011.semanticweb.com/sessionPop.cfm?confid=62&proposalid=3825
http://semantic-sensor-web.com
Amit Sheth with TK Prasad, "Semantic Technologies for Big Science and Astrophysics", Invited Plenary Presentation, at Earthcube Solar-Terrestrial End-User Workshop, NJIT, Newark, NJ, August 13, 2014.
Like many other fields of Big Science, Astrophysics and Solar Physics deal with the challenges of Big Data, including Volume, Variety, Velocity, and Veracity. There is already significant work on handling volume related challenges, including the use of high performance computing. In this talk, we will mainly focus on other challenges from the perspective of collaborative sharing and reuse of broad variety of data created by multiple stakeholders, large and small, along with tools that offer semantic variants of search, browsing, integration and discovery capabilities. We will borrow examples of tools and capabilities from state of the art work in supporting physicists (including astrophysicists) [1], life sciences [2], material sciences [3], and describe the role of semantics and semantic technologies that make these capabilities possible or easier to realize. This applied and practice oriented talk will complement more vision oriented counterparts [4].
[1] Science Web-based Interactive Semantic Environment: http://sciencewise.info/
[2] NCBO Bioportal: http://bioportal.bioontology.org/ , Kno.e.sis’s work on Semantic Web for Healthcare and Life Sciences: http://knoesis.org/amit/hcls
[3] MaterialWays (a Materials Genome Initiative related project): http://wiki.knoesis.org/index.php/MaterialWays
[4] From Big Data to Smart Data: http://wiki.knoesis.org/index.php/Smart_Data
The talk given by prof. Amit Sheth at the ICMSE-MGI Digital Data Workshop held at Kno.e.sis Center from November 13-14 2013. The talk showcased demos of successful applications that use semantic web technologies in several research problems.
workshop page: http://wiki.knoesis.org/index.php/ICMSE-MGI_Digital_Data_Workshop
A talk given in 2000 at IBM when it identified Taalee (then merged with Voquette), the Semantic Web company I founded, as one of the five exciting start ups.
"Computing for Human Experience: Semantics empowered Cyber-Physical, Social and Ubiquitous Computing beyond the Web" Keynote at On the Move Federated Conferences, Crete, Greece, October 18, 2011.
http://www.onthemove-conferences.org/
Details: http://wiki.knoesis.org/index.php/Computi
This is a version of series of talks given at NCSA-UIUC's director seminar, IBM Almaden, HP Labs, DERI-Galway, City Univ of Dublin, and KMI-Open University during Aug-Oct 2010 (replaces earlier keynote version). It deals with couple of items of the vision outlined at http://bit.ly/4ynB7A
A video of this presentation: http://www.ncsa.illinois.edu/News/Video/2010/sheth.html
Link to this talk as http://bit.ly/CHE-talk
Ajith Ranabahu, Priti Parikh, Maryam Panahiazar, Amit Sheth and Flora Logan-Klumpler: Kino : Making Semantic Annotations Easier, Presented at 5th Intl Conf on Semantic Computing (ICSC2011), Palo Alto, CA, September 2011.
Ignite talk at ICCM-2013 at United Nations (UN) Nairobi by NSF SoCS project researcher, Hemant Purohit - 'How to Leverage Social Media Communities for Crisis Response Coordination' using Human+Machine computing
Key-message: We need to extract smart actionable data out of big crisis data to assist response coordination, by focusing on demand-supply centric technology.
More at Kno.e.sis' SOCS project page: http://knoesis.org/research/semsoc/projects/socs
Also, Crisis Informatics at Kno.e.sis: http://j.mp/CrisisRes
Description - Ajith defended his thesis on application and data portability in cloud
computing. More details on Ajith's research and publications can be
found at http://knoesis.wright.edu/researchers/ajith/
Video can be found at : http://www.youtube.com/watch?v=oDBeBIIFmHc&list=UUORqXk1ZV44MOwpCorAROyQ&index=1&feature=plpp_video
This tutorial presents tools and techniques for effectively utilizing the Internet of Things (IoT) for building advanced applications, including the Physical-Cyber-Social (PCS) systems. The issues and challenges related to IoT, semantic data modelling, annotation, knowledge representation (e.g. modelling for constrained environments, complexity issues and time/location dependency of data), integration, analy- sis, and reasoning will be discussed. The tutorial will de- scribe recent developments on creating annotation models and semantic description frameworks for IoT data (e.g. such as W3C Semantic Sensor Network ontology). A review of enabling technologies and common scenarios for IoT applications from the data and knowledge engineering point of view will be discussed. Information processing, reasoning, and knowledge extraction, along with existing solutions re- lated to these topics will be presented. The tutorial summarizes state-of-the-art research and developments on PCS systems, IoT related ontology development, linked data, do- main knowledge integration and management, querying large- scale IoT data, and AI applications for automated knowledge extraction from real world data.
Related: Semantic Sensor Web: http://knoesis.org/projects/ssw
Physical-Cyber-Social Computing: http://wiki.knoesis.org/index.php/PCS
A lecture/conversation focusing on the first 12 years of Semantic Web - delivered on February 21, 2012.
See http://j.mp/SWIntro for more details. More detailed course material is at http://knoesis.org/courses/web3/
Research Objects: more than the sum of the partsCarole Goble
Workshop on Managing Digital Research Objects in an Expanding Science Ecosystem, 15 Nov 2017, Bethesda, USA
https://www.rd-alliance.org/managing-digital-research-objects-expanding-science-ecosystem
Research output is more than just the rhetorical narrative. The experimental methods, computational codes, data, algorithms, workflows, Standard Operating Procedures, samples and so on are the objects of research that enable reuse and reproduction of scientific experiments, and they too need to be examined and exchanged as research knowledge.
A first step is to think of Digital Research Objects as a broadening out to embrace these artefacts or assets of research. The next is to recognise that investigations use multiple, interlinked, evolving artefacts. Multiple datasets and multiple models support a study; each model is associated with datasets for construction, validation and prediction; an analytic pipeline has multiple codes and may be made up of nested sub-pipelines, and so on. Research Objects (http://researchobject.org/) is a framework by which the many, nested and contributed components of research can be packaged together in a systematic way, and their context, provenance and relationships richly described.
Mastering the variety dimension of Big Data with semantic technologies: high level intro to standards. Focus on variety/interoperability dimension. Prof Amit Sheth
NSF Workshop Data and Software Citation, 6-7 June 2016, Boston USA, Software Panel
FIndable, Accessible, Interoperable, Reusable Software and Data Citation: Europe, Research Objects, and BioSchemas.org
These slides were presented as part of a W3C tutorial at the CSHALS 2010 conference (http://www.iscb.org/cshals2010). The slides are adapted from a longer introduction to the Semantic Web available at http://www.slideshare.net/LeeFeigenbaum/semantic-web-landscape-2009 .
A PDF version of the slides is available at http://thefigtrees.net/lee/sw/cshals/cshals-w3c-semantic-web-tutorial.pdf .
Data Citation Implementation Guidelines By Tim Clarkdatascienceiqss
This talk presents a set of detailed technical recommendations for operationalizing the Joint Declaration of Data Citation Principles (JDDCP) - the most widely agreed set of principle-based recommendations for direct scholarly data citation.
We will provide initial recommendations on identifier schemes, identifier resolution behavior, required metadata elements, and best practices for realizing programmatic machine actionability of cited data.
We hope that these recommendations along with the new NISO JATS document schema revision, developed in parallel, will help accelerate the wide adoption of data citation in scholarly literature. We believe their adoption will enable open data transparency for validation, reuse and extension of scientific results; and will significantly counteract the problem of false positives in the literature.
Functional and Architectural Requirements for Metadata: Supporting Discovery...Jian Qin
The tremendous growth in digital data has led to an increase in metadata initiatives for different types of scientific data, as evident in Ball’s survey (2009). Although individual communities have specific needs, there are shared goals that need to be recognized if systems are to effectively support data sharing within and across all domains. This paper considers this need, and explores systems requirements that are essential for metadata supporting the discovery and management of scientific data. The paper begins with an introduction and a review of selected research specific to metadata modeling in the sciences. Next, the paper’s goals are stated, followed by the presentation of valuable systems requirements. The results include a base-model with three chief principles: principle of least effort, infrastructure service, and portability. The principles are intended to support “data user” tasks. Results also include a set of defined user tasks and functions, and applications scenarios.
Semantics in Financial Services -David NewmanPeter Berger
David Newman serves as a Senior Architect in the Enterprise Architecture group at Wells Fargo Bank. He has been following semantic technology for the last 3 years; and has developed several business ontologies. He has been instrumental in thought leadership at Wells Fargo on the application of Semantic Technology and is a representative of the Financial Services Technology Consortium (FSTC)on the W3C SPARQL Working Group.
Project Website: http://www.researchobject.org/
researchobjects.org is a community project that has developed an approach to describe and package up all resources used as part of an investigation as Research Objects (RO’s).
RO’s - provide two main features; a manifest - a consistent way to provide a well-typed, structured description of the resources used in an investigation; and a ‘bundle’ - a mechanism for packaging up manifests with resources as a single, publishable unit.
RO’s therefore carry the research context of an experiment - data, software, standard operating procedures (SOPs), models etc - and gather together the components of an experiment so that they are findable, accessible, interoperable and reproducible (FAIR). RO’s combine software and data into an aggregative data structure consisting of well described reconstructable parts.
RO’s have the potential to address a number of challenges pertinent to open research including: a) supporting interoperability between infrastructures by using ROs as a primary mechanism for exchange and publication b) supporting the evolution of research objects as a living collection, enabling provenance tracking c) providing the ability to pivot research object components (data, software, models) that are not restricted to the traditional publication.
Here we present work towards the development and adoption of ROs:
(i) A series of specifications and conventions, using community standards, for the RO manifest and RO bundles.
(ii) Implementations of Java, Python and Ruby APIs and tooling against those specifications;
(iii) Examples of representations of the RO models in various languages (e.g. JSON-LD, RDF, HTML).
Applied semantic technology and linked dataWilliam Smith
Mapping a human brain generates petabytes of gene listings and the corresponding locations of these genes throughout the human brain. Due to the large dataset a prototype Semantic Web application was created with the unique ability to link new datasets from similar fields of research, and present these new models to an online community. The resulting application presents a large set of gene to location mappings and provides new information about diseases, drugs, and side effects in relation to the genes and areas of the human brain.
In this presentation we will discuss the normalization processes and tools for adding new datasets, the user experience throughout the publishing process, the underlying technologies behind the application, and demonstrate the preliminary use cases of the project.
Talk about Exploring the Semantic Web, and particularly Linked Data, and the Rhizomer approach. Presented August 14th 2012 at the SRI AIC Seminar Series, Menlo Park, CA
Presented for managers & researchers at The Global One Health Initiative of the Ohio State University, Africa Regional Branch in Addis Ababa, Ethiopia (April 24th 2019)
Similar to Role of Semantic Web in Health Informatics (20)
Operation “Blue Star” is the only event in the history of Independent India where the state went into war with its own people. Even after about 40 years it is not clear if it was culmination of states anger over people of the region, a political game of power or start of dictatorial chapter in the democratic setup.
The people of Punjab felt alienated from main stream due to denial of their just demands during a long democratic struggle since independence. As it happen all over the word, it led to militant struggle with great loss of lives of military, police and civilian personnel. Killing of Indira Gandhi and massacre of innocent Sikhs in Delhi and other India cities was also associated with this movement.
Unit 8 - Information and Communication Technology (Paper I).pdfThiyagu K
This slides describes the basic concepts of ICT, basics of Email, Emerging Technology and Digital Initiatives in Education. This presentations aligns with the UGC Paper I syllabus.
How to Make a Field invisible in Odoo 17Celine George
It is possible to hide or invisible some fields in odoo. Commonly using “invisible” attribute in the field definition to invisible the fields. This slide will show how to make a field invisible in odoo 17.
Palestine last event orientationfvgnh .pptxRaedMohamed3
An EFL lesson about the current events in Palestine. It is intended to be for intermediate students who wish to increase their listening skills through a short lesson in power point.
How to Create Map Views in the Odoo 17 ERPCeline George
The map views are useful for providing a geographical representation of data. They allow users to visualize and analyze the data in a more intuitive manner.
Students, digital devices and success - Andreas Schleicher - 27 May 2024..pptxEduSkills OECD
Andreas Schleicher presents at the OECD webinar ‘Digital devices in schools: detrimental distraction or secret to success?’ on 27 May 2024. The presentation was based on findings from PISA 2022 results and the webinar helped launch the PISA in Focus ‘Managing screen time: How to protect and equip students against distraction’ https://www.oecd-ilibrary.org/education/managing-screen-time_7c225af4-en and the OECD Education Policy Perspective ‘Students, digital devices and success’ can be found here - https://oe.cd/il/5yV
The Art Pastor's Guide to Sabbath | Steve ThomasonSteve Thomason
What is the purpose of the Sabbath Law in the Torah. It is interesting to compare how the context of the law shifts from Exodus to Deuteronomy. Who gets to rest, and why?
2024.06.01 Introducing a competency framework for languag learning materials ...Sandy Millin
http://sandymillin.wordpress.com/iateflwebinar2024
Published classroom materials form the basis of syllabuses, drive teacher professional development, and have a potentially huge influence on learners, teachers and education systems. All teachers also create their own materials, whether a few sentences on a blackboard, a highly-structured fully-realised online course, or anything in between. Despite this, the knowledge and skills needed to create effective language learning materials are rarely part of teacher training, and are mostly learnt by trial and error.
Knowledge and skills frameworks, generally called competency frameworks, for ELT teachers, trainers and managers have existed for a few years now. However, until I created one for my MA dissertation, there wasn’t one drawing together what we need to know and do to be able to effectively produce language learning materials.
This webinar will introduce you to my framework, highlighting the key competencies I identified from my research. It will also show how anybody involved in language teaching (any language, not just English!), teacher training, managing schools or developing language learning materials can benefit from using the framework.
Synthetic Fiber Construction in lab .pptxPavel ( NSTU)
Synthetic fiber production is a fascinating and complex field that blends chemistry, engineering, and environmental science. By understanding these aspects, students can gain a comprehensive view of synthetic fiber production, its impact on society and the environment, and the potential for future innovations. Synthetic fibers play a crucial role in modern society, impacting various aspects of daily life, industry, and the environment. ynthetic fibers are integral to modern life, offering a range of benefits from cost-effectiveness and versatility to innovative applications and performance characteristics. While they pose environmental challenges, ongoing research and development aim to create more sustainable and eco-friendly alternatives. Understanding the importance of synthetic fibers helps in appreciating their role in the economy, industry, and daily life, while also emphasizing the need for sustainable practices and innovation.
1. Role of Semantic Web
in Health Informatics
Tutorial at 2012 ACM SIGHIT International Health Informatics
Symposium (IHI 2012), January 28-30, 2012
Satya S. Sahoo, GQ Zhang AmitSheth
Division of Medical Informatics Kno.e.sis Center
Case Western Reserve University Wright State University
2. Outline
• Semantic Web
o Introductory Overview
• Clinical Research
o Physio-MIMI
• Bench Research and Provenance
o Semantic Problem Solving Environment for T.cruzi
• Clinical Practice
o Active Semantic Electronic Medical Record
4. Landscape of Health Informatics
Patient Care
Personalized Medicine
Drug Development
Clinical Research Bench Research
Privacy
Cost
Clinical Practice
* Images from case.edu
5. Challenges
• Information Integration: Reconcile heterogeneity
o Syntactic Heterogeneity: DOB vs. Date of Birth
o Structural Heterogeneity: Street + Apt + City vs.
Address
o Semantic Heterogeneity: Age vs. Age at time of surgery
vs. Age at time of admission
• Humans can (often) accurately interpret, but
extremely difficult for machine
o Role for Metadata/Contextual Information/Semantics
6. Semantic Web
• Web of Linked Data
• Introduced by Berners
Lee et. al as next step for
Web of Documents
• Allow “machine
understanding” of data,
• Create “common”
models of domains using
formal language -
Semantic Web Layer Cake
ontologies
Layer cake image source: http://www.w3.org
7. Resource Description Framework
Location
Company Armonk, New York,
United States
IBM
Zurich, Switzerland
• Resource Description Framework – Recommended by
W3C for metadata modeling [RDF]
• A standard common modeling framework – usable by
humans and machine understandable
8. RDF: Triple Structure, IRI, Namespace
Headquarters located in Armonk, New York,
IBM
United States
• RDF Triple
o Subject: The resource that the triple is about
o Predicate: The property of the subject that is described by the triple
o Object:The value of the property
• Web Addressable Resource:Uniform Resource Locator (URL), Uniform
Resource Identifier(URI), Internationalized Resource Identifier (IRI)
• Qualified Namespace:http://www.w3.org/2001/XMLSchema#
asxsd:
o xsd: string instead of http://www.w3.org/2001/XMLSchema#string
9. RDF Representation
• Two types of property values in a triple
o Web resource Headquarters located in
IBM Armonk, New York,
o Typed literal United States
Has total employees
IBM “430,000” ^^xsd:integer
• The graph model of RDF:node-arc-node is the primary
representation model
• Secondary notations: Triple notation
o companyExample:IBM companyExample:has-Total-
Employee “430,000”^^xsd:integer .
10. RDF Schema
Headquarters located in Armonk, New
IBM
York, United States
Headquarters located in Redwood Shores,
Oracle
California, United States
Headquarters located in
Company Geographical Location
• RDF Schema: Vocabulary for describing groups of
resources [RDFS]
11. RDF Schema
• Propertydomain(rdfs:domain) and range(rdfs:range)
Domain Headquarters located in Range
Company Geographical Location
• Class Hierarchy/Taxonomy:rdfs:subClassOf
SubClass rdfs:subClassOf (Parent) Class
Computer Technology Company
Company
Banking Company
Insurance Company
12. Ontology: A Working Definition
• Ontologies are shared conceptualizations of a
domain represented in a formal language*
• Ontologies in health informatics:
o Common representation model - facilitate
interoperability, integration across different projects,
and enforce consistent use of terminology
o Closely reflect domain-specific details (domain
semantics) essential to answer end user
o Support reasoning to discover implicit knowledge
* Paraphrased from Gruber, 1993
13. OWL2 Web Ontology Language
• A language for modeling ontologies [OWL]
• OWL2 is declarative
• An OWL2 ontology (schema) consists of:
o Entities:Company, Person
o Axioms:Company employs Person
o Expressions:A Person Employed by a Company =
CompanyEmployee
• Reasoning: Draw a conclusion given certain
constraints are satisfied
o RDF(S) Entailment
o OWL2 Entailment
14. OWL2 Constructs
• Class Disjointness: Instance of class A cannot be
instance of class B
• Complex Classes: Combining multiple classes with
set theory operators:
o Union:Parent =ObjectUnionOf(:Mother :Father)
o Logical negation:UnemployedPerson =
ObjectIntersectionOf(:EmployedPerson)
o Intersection:Mother =ObjectIntersectionOf(:Parent
:Woman)
15. OWL2 Constructs
• Property restrictions: defined over property
• Existential Quantification:
o Parent =ObjectSomeValuesFrom(:hasChild :Person)
o To capture incomplete knowledge
• Universal Quantification:
o US President = objectAllValuesFrom(:hasBirthPlace
United States)
• Cardinality Restriction
16. SPARQL: Querying Semantic Web Data
• A SPARQL query pattern composed of triples
• Triples correspond to RDF triple structure, but
have variable at:
o Subject: ?companyex:hasHeadquaterLocationex:NewYork.
o Predicate: ex:IBM?whatislocatedinex:NewYork.
o Object: ex:IBMex:hasHeadquaterLocation?location.
• Result of SPARQL query is list of values –
valuescan replace variable in query pattern
17. SPARQL: Query Patterns
• An example query pattern
PREFIX ex:<http://www.eecs600.case.edu/>
SELECT?company ?location WHERE
{?company ex:hasHeadquaterLocation?location.}
• Query Result
company location Multiple
Matches
IBM NewYork
Oracle RedwoodCity
MicorosoftCorporation Bellevue
18. SPARQL: Query Forms
• SELECT: Returns the values bound to the variables
• CONSTRUCT: Returns an RDF graph
• DESCRIBE: Returns a description (RDF graph) of
a resource (e.g. IBM)
o The contents of RDF graph is determined by SPARQL
query processor
• ASK: Returns a Boolean
o True
o False
20. Physio-MIMI Overview
• Physio-MIMI: Multi-Modality, Multi-Resource Environment for
Physiological and Clinical Research
• NCRR-funded, multi-CTSA-site project (RFP 08-001) for
providing informatics tools to clinical investigators and clinical
research teams at and across CTSA institutions to enhance the
collection, management and sharing of data
• Collaboration among Case Western, U Michigan, Marshfield
Clinic and U Wisconsin Madison
• Use Sleep Medicine as an exemplar, but also generalizable
• Two year duration: Dec 2008 – Dec 2010
21. Features of Physio-MIMI
• Federated data integration environment
– Linking existing data resources without a centralized data
repository
• Query interface directly usable by clinical researchers
– Minimize the role of the data-access middleman
• Secure and policy-compliant data access
– Fine-grained access control, dual SSL, auditing
• Tools for curatingPSGs
Data Integration Framework
Physio-MIMI
SHHS Portal
23. Measure not by the size of the database, but the
number of secondary studies it supported
24. Query Interface – driven by access
• Visual Aggregator and Explorer (VISAGE)
• Federated, Web-based
• Driven by Domain Ontology (SDO)
• PhysioMap to connect autonomous data sources
Clinical Clinical
Investigator Investigator
1 3 1 • GQ Zhang et al.
VISAGE: A Query Interface for Clinical
Data Analyst
Data Manager Database 3 Research, Proceedings of the 2010 AMIA
Clinical Research Informatics
2
2 Summit, San Francisco, March 12-13, pp.
Data Analyst
76-80, 2010
Database Data Manager
25. Physio-MIMI Components
Sleep Researcher Domain Expert Informatician
META SERVER
Query Builder
VISAGE
Query Manager Query Explorer
DB-Ontology Mapper
DATA SERVER
Institutional Databases Institutional Databases Institutional Databases
Institutional Firewall Institutional Firewall Institutional Firewall
28. Case Control Study Design
•Case-control is a common study design
• Used for epidemiological studies involving two cohorts,
one representing the cases
and the second representing the controls
• Adjusting matching ratio to improve statistical power
29. Example (CFS)
• Suppose we are interested in the question of whether
sleep parameters (EEG) differ by obesity in age and race
matched males
• Case: adult 55-75, male, BMI 35-50 (obese)
• Control: adult 55-75, male, BMI 20-30 (non-obese)
• Matching 1:2 on race (minimize race as a factor initially)
35. 1:5 Matching – CFS+SHHS
Modify Control to Include
TWO data sources
36. Sleep Domain Ontology (SDO)
• Standardize terminology and semantics (define variations) [RO]
• Facilitate definition of data elements
• Valuable for data collection, data curation
• Data integration
• Data sharing and access
• Take advantage of progress in related areas (e.g. Gene Ontology)
• Improving data quality – provenance, reproducibility
43. Provenance in Scientific Experiments
Gene
Name
Sequence
Extraction
Drug
3‘ & 5’
Resistant
Region
Plasmid Gene Name
Plasmid
Construction
Knockout
T.Cruzi
Construct Plasmid
sample
?
Transfection
Transfected
Sample
Drug Cloned Sample
Selection
Selected
Sample
Cell
Cloning
Cloned
Sample
44. Provenance in Scientific Experiments
Gene
Name
• Provenance from the French word
“provenir” describes the lineage or
Sequence
Extraction
Drug
Resistant
Plasmid
3‘ & 5’
Region history of a data entity
Plasmid
Construction
• For Verification and Validation of
T.Cruzi
Knockout
Construct Plasmid
Data Integrity, Process Quality, and
Trust
sample
Transfection
Transfected
• Semantic Provenance Framework
Sample
addresses three aspects [Prov]
o Provenance Modeling
Drug
Selection
Selected
Sample o Provenance Query Infrastructure
Cell
Cloning o Scalable Provenance System
Cloned
Sample
46. Provenance Query Classification
Classified Provenance Queries into Three Categories
• Type 1: Querying for Provenance Metadata
o Example: Which gene was used create the cloned sample with ID = 66?
• Type 2: Querying for Specific Data Set
o Example: Find all knockout construct plasmids created by researcher
Michelle using “Hygromycin” drug resistant plasmid between April 25, 2008
and August 15, 2008
• Type 3: Operations on Provenance Metadata
o Example: Were the two cloned samples 65 and 46 prepared under
similar conditions – compare the associated provenance
information
47. Provenance Query Operators
Four Query Operators – based on Query
Classification
• provenance () – Closure operation, returns the
complete set of provenance metadata for input data
entity
• provenance_context() - Given set of constraints
defined on provenance, retrieves datasets that
satisfy constraints
• provenance_compare () - adapt the RDF graph
equivalence definition
• provenance_merge () - Two sets of provenance
information are combined using the RDF graph
merge
49. Implementation: Provenance Query Engine
QUERY
OPTIMIZER
• Three modules:
o Query Composer
o Transitive closure
o Query Optimizer
• Deployable over a
RDF store with
support for
reasoning TRANSITIVE CLOSURE
50. Application in T.cruzi SPSE Project
• Provenance tracking
for gene knockout,
strain creation,
proteomics, microarray
experiments
• Part of the Parasite
Knowledge Repository
[BKR]
51. W3C Provenance Working Group
• Define a “provenance interchange language for
publishing and accessing provenance”
• Three working drafts:
o PROV-Data Model: A conceptual model for
provenance representation
o PROV-Ontology: An OWL ontology for provenance
representation
o PROV-Access and Query: A framework to query
and retrieve provenance on the Web
53. Semantic Web application in use
In daily use at Athens Heart Center
– 28 person staff
• Interventional Cardiologists
• Electrophysiology Cardiologists
– Deployed since January 2006
– 40-60 patients seen daily
– 3000+ active patients
– Serves a population of 250,000 people
54. Information Overload in Clinical
Practice
• New drugs added to market
– Adds interactions with current drugs
– Changes possible procedures to treat an illness
• Insurance Coverage's Change
– Insurance may pay for drug X but not drug Y even
though drug X and Y are equivalent
– Patient may need a certain diagnosis before some
expensive test are run
• Physicians need a system to keep track of ever
changing landscape
59. Active Semantic Document (ASD)
A document (typically in XML) with the following features:
• Semantic annotations
– Linking entities found in a document to ontology
– Linking terms to a specialized lexicon [TR]
• Actionable information
– Rules over semantic annotations
– Violated rules can modify the appearance of the document (Show an
alert)
60. Active Semantic Patient Record
• An application of ASD
• Three Ontologies
– Practice
Information about practice such as patient/physician data
– Drug
Information about drugs, interaction, formularies, etc.
– ICD/CPT
Describes the relationships between CPT and ICD codes
• Medical Records in XML created from database
61. Practice Ontology Hierarchy
(showing is-a relationships)
facility
insurance_
ancillary owl:thing carrier
ambularory insurance
_episode
insurance_
encounter
plan
person
event insurance_
patient policy
practitioner
65. Semantic Technologies in Use
• Semantic Web: OWL, RDF/RDQL, Jena
– OWL (constraints useful for data consistency), RDF
– Rules are expressed as RDQL
– REST Based Web Services: from server side
• Web 2.0: client makes AJAX calls to ontology, also auto
complete
Problem:
• Jena main memory- large memory footprint, future scalability
challenge
• Using Jena’s persistent model (MySQL) noticeably slower
67. Benefits: Athens Heart Center Practice
Growth
1400
1300
1200
1100
Appointments
1000 2003
900
2004
800
2005
700
2006
600
500
400
v
b
g
r
c
n
n
l
p
ar
t
ay
ju
ap
no
oc
fe
ja
ju
au
de
se
m
m
Month
68. Chart Completion before the preliminary
deployment of the ASMER
600
500
400
Charts
Same Day
300
Back Log
200
100
0
Se 4
5
04
05
04
05
04
05
04
04
l0
l0
n
n
ay
ay
pt
ar
ar
ov
Ju
Ju
Ja
Ja
M
M
M
M
N
Month/Year
69. Chart Completion after the preliminary
deployment of the ASMER
700
600
500
Charts
400 Same Day
300 Back Log
200
100
0
Sept Nov 05 Jan 06 Mar 06
05
Month/Year
70. Benefits of current system
• Error prevention (drug interactions, allergy)
– Patient care
– insurance
• Decision Support (formulary, billing)
– Patient satisfaction
– Reimbursement
• Efficiency/time
– Real-time chart completion
– “semantic” and automated linking with billing
71. Demo
On-line demo of Active Semantic Electronic Medical Record
deployed and in use at Athens Heart Center
71
73. Conclusions
Benefits of SW in Health Informatics:
• RDF a “universal” data model; Application-
purpose agnostic (clinical care vs research)
• Integration “ready,” supporting distributed query
out of box
• Semantic interoperability addressed at root level
• Better support of user interfaces for data capture,
data query, data integration
• Scalability demonstrated
74. Challenges and Future Directions
• Design and implementation of health information systems
with RDF as primary data store from ground up
• User-friendly graphical query interface on top of SPARQL
• Managing Protected Health Information (PHI) e.g. data
encryption “at rest” for RDF store
• From retrospective annotation of data (with ontology) to
prospective annotation of data: ontology-driven data capture
with annotation happening at the point of primary source
(eliminating the need to annotate data retrospectively)
• Let ontology drive “everything”
75. References
• [RDF] Manola F, Miller, E.(Eds.). RDF Primer. 2004; Available from:
http://www.w3.org/TR/rdf-primer/
• [RDFS] Brickley D, Guha, R.V. RDF Schema. 2004; Available from:
http://www.w3.org/TR/rdf-schema/
• [OWL] Hitzler P, Krötzsch, M., Parsia, B., Patel-Schneider, P.F., Rudolph, S. OWL 2
Web Ontology Language Primer: W3C; 2009
• [Physio-MIMI]: http://physiomimi.case.edu
• [ASEMR] A. P. Sheth, Agrawal, S., Lathem, J., Oldham, N., Wingate, H., Yadav, P.,
Gallagher, K., "Active Semantic Electronic Medical Record," in 5th International
Semantic Web Conference, Athens, GA, USA, 2006.
• [BioRDF] BioRDF subgroup: Health Care and Life Sciences interest group Available:
http://esw.w3.org/topic/HCLSIG_BioRDF_Subgroup
• [TR] A. Ruttenberg, et al., "Advancing translational research with the Semantic Web,"
BMC Bioinformatics vol. in Press, 2007.
76. References 2
• [Visage] GQ Zhang et al. VISAGE: A Query Interface for Clinical Research,
Proceedings of the 2010 AMIA Clinical Research Informatics Summit, San Francisco,
March 12-13, pp. 76-80, 2010
• [Prov] S.S. Sahoo, V. Nguyen, O. Bodenreider, P. Parikh, T. Minning, A.P. Sheth, “A
unified framework for managing provenance information in translational research.”
BMC Bioinformatics 2011, 12:461
• [RO] Smith B, Ceusters W, Klagges B, Kohler J, Kumar A, Lomax J, Mungall C,
Neuhaus F, Rector AL, Rosse C: Relations in biomedical ontologies. Genome Biol
2005, 6(5):R46.
• [BKR] Bodenreider O, Rindflesch, T.C.: Advanced library services: Developing a
biomedical knowledge repository to support advanced information management
applications. In. Bethesda, Maryland: Lister Hill National Center for Biomedical
Communications, National Library of Medicine; 2006.
• T.cruzi project web site: http://wiki.knoesis.org/index.php/Trykipedia
77. Acknowledgements
• Collaborators:
o Susan Redline, Remo Mueller, and other members of
Physio-MIMI team
o Rick Tarleton, Todd Manning, Priti Parikh and other
members of the T.cruzi SPSE team
o Dr. S. Agrawal and other members at the Athens Heart
Center, GA
• NIH Support: UL1-RR024989, UL1-RR024989-05S,
NCRR-94681DBS78, NS076965, and 1R01HL087795
Editor's Notes
RDF: Triple structure
Review types of heterogeneity. Why we need to reconcile data heterogeneityUniform Resource Locator: A network location and used as an identifier for resources on the Web. URL is a specific type of URI. URI can be used to refer to anythingIRI: In addition to ASCII character set, contains Universal Character Set (from RFC 3987)
RDF uses XML Schema datatypes
Allows creation of an abstract representation of domain
Allows creation of an abstract representation of domain
Review types of heterogeneity. Why we need to reconcile data heterogeneity
Review types of heterogeneity. Why we need to reconcile data heterogeneity
Review types of heterogeneity. Why we need to reconcile data heterogeneity
Better yet, for those who are not originally involved in developing the data resource
Web based, anywhere anytime, not requiring knowledge of data structure, data elements, how they are stored
Walking through an example to illustrate the VisAgE’s Case Control features. Cleveland family study
Green means matched; visually inspect PieChart
Try to provide 1:5 matching; not enough controls; COLOER indicator
Controls coming from both Cleveland Family Study and Sleep Heart Health Study gives sufficient number of matched controls.This also illustrates the POWER of data federation in PhysioMIMI: combine different data sources
Can accomondate differences, but prevent the same term to change meaning from one paragraph to the next in the same paper.
Lets take another example from the NIH-funded collaborative project led by our lab along with the CTEGD at UGA to identify vaccine targets for the human pathogen T.cruzi that causes… specifically, we consider the experiment protocol used to create new strains of the parasites by knocking out specific genes in the parasite – to identify function of the gene among other functionalities.Proposal writing
In terms of modeling: A formal representation is needed to support consistent interpretation (also machine processable for large volumes of data) and expressive to closely reflect the domain-specific details i.e. domain semanticsProvenance queries have many characteristics that I will cover later that need to be supported by the query infrastructure.Provenance queries over scientific data are characterized by high expression and data complexity (I will discuss these aspects in detail)Lets discuss the issue of provenance modeling first.
In terms of modeling: A formal representation is needed to support consistent interpretation (also machine processable for large volumes of data) and expressive to closely reflect the domain-specific details i.e. domain semanticsProvenance queries have many characteristics that I will cover later that need to be supported by the query infrastructure.Provenance queries over scientific data are characterized by high expression and data complexity (I will discuss these aspects in detail)Lets discuss the issue of provenance modeling first.
We can extend the provenir ontology to model domain-specific provenance. For example, we extended the provenir ontology to create the peo that models the gene knockout and strain creation protocols. We have also extended provenir to model the trident ontology representing oceanography specific provenance information – will discuss later in evaluation section.PEO has a expressivity of ALCHQ(D) is because of use qualifiers on the properties, for example cell cloning has input value drug selected sample and output value of cloned sample, datatype property for researcher notes
First types of queries are “standard” provenance queryThe second type of queries is a complete reverse view: define constraints on provenance information to retrieve datasets that satisfy those constraints (provenance context)
Formal definitions and implemented in prolog (to validate functional semantics) and mapped to SPARQL – the RDF query language
Query composer: maps the functional semantics of the query operator to SPARQL syntax in other words, creates a SPARQL query pattern that conforms to the required behavior of the query operator. One of the challenges we faced in implementing this query composer is the use highly nested OPTIONAL function to account for the fact that all application do not collect the comprehensive provenance information that the query operators have been defined to operate on – hence is the query operators are translated to a basic graph pattern in SPARQL then it will return no results even though partial information is present. The OPTIONAL function allows us to sidestep this, but we had to map the nesting of the OPTIONAL functions to reflect the structure of the Provenir ontology schema, for example given a process, the link to an agent requires use of top-level OPTIONAL function, but the OPTIONAL function to retrieve the spatial parameters associated with the agent need to be nested within the top-level OPTIONAL functionI discussed the transitive closure function earlier, this is implemented using existing SPARQL function ASK – I will discuss the experiment results to justify the use of ASKThe query optimizer uses materialized view based approach to significantly improve query performance. The entities in the materialized view are indexed in B+ tree and in response to a query, the query optimizer looks up this index to see if a query can be answered using the materialized view or needs to sent to the DB. The query optimizer also includes a module to do “view selection” that is to decide whether to materialize a query result or not – I will expand on this in a few slides