The document presents a method for automatically mapping terms from clinical encounter forms to concepts in SNOMED CT. It exploits the semantic structure of forms by analyzing the context and relationships between terms. A naive Bayes classifier is trained on semantic attributes derived from the form structure to determine the appropriate SNOMED CT semantic category for each term. Evaluation on 26 forms showed the hybrid approach of combining linguistic techniques and semantic structure outperformed a baseline, achieving a precision of 0.89 and recall of 0.76.
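The classification step described above can be illustrated with a minimal naive Bayes sketch. The feature names, training rows, and SNOMED CT categories below are hypothetical stand-ins, not the paper's actual attributes or data:

```python
import math
from collections import Counter, defaultdict

# Hypothetical training rows: each form term is described by structural
# features (section header, widget type) and labeled with a SNOMED CT
# semantic category. These rows are illustrative only.
train = [
    ({"section": "vitals", "widget": "numeric"}, "observable entity"),
    ({"section": "vitals", "widget": "numeric"}, "observable entity"),
    ({"section": "history", "widget": "checkbox"}, "disorder"),
    ({"section": "history", "widget": "checkbox"}, "disorder"),
    ({"section": "meds", "widget": "text"}, "product"),
]

def train_nb(rows):
    """Count class priors and per-class feature-value frequencies."""
    priors = Counter(label for _, label in rows)
    likelihoods = defaultdict(Counter)  # (label, feature) -> value counts
    for feats, label in rows:
        for f, v in feats.items():
            likelihoods[(label, f)][v] += 1
    return priors, likelihoods

def predict(priors, likelihoods, feats, alpha=1.0):
    """Return the category maximizing log P(c) + sum log P(f=v | c),
    with Laplace smoothing for unseen feature values."""
    total = sum(priors.values())
    best, best_score = None, float("-inf")
    for label, n in priors.items():
        score = math.log(n / total)
        for f, v in feats.items():
            counts = likelihoods[(label, f)]
            score += math.log((counts[v] + alpha) /
                              (sum(counts.values()) + alpha * (len(counts) + 1)))
        if score > best_score:
            best, best_score = label, score
    return best

priors, likelihoods = train_nb(train)
print(predict(priors, likelihoods, {"section": "history", "widget": "checkbox"}))
# prints "disorder"
```

The paper's actual classifier would be trained on its semantic attributes derived from form structure; this sketch only shows the naive Bayes mechanics.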
This document proposes a framework to map clinician-specified form terms to standardized SNOMED CT concepts by leveraging the semantic structure of clinical forms. It presents a hybrid approach that uses both linguistic matching and structural context to address challenges from term diversity and context. An empirical study on 26 real-world forms shows the hybrid method improves mapping precision by up to 18% and recall by up to 30% compared to baselines. The work demonstrates how semantic form structures can help address context and improve mapping between clinical terms and standardized concepts.
Modeling XCS in class imbalances: Population sizing and parameter settings (kknsastry)
This paper analyzes the scalability of the population size required in XCS to maintain niches that are infrequently activated. Facetwise models have been developed to predict the effect of the imbalance ratio—the ratio between the number of instances of the majority class and the minority class that are presented to XCS—on population initialization and on the creation and deletion of classifiers of the minority class. While theoretical models show that, ideally, XCS scales linearly with the imbalance ratio, XCS with the standard configuration scales exponentially.
The causes potentially responsible for this deviation from the ideal scalability are also investigated. Specifically, the inheritance procedure of classifiers' parameters, mutation, and subsumption are analyzed, and improvements to XCS's mechanisms are proposed to effectively and efficiently handle imbalanced problems. Once the recommendations are incorporated into XCS, empirical results show that the population size in XCS indeed scales linearly with the imbalance ratio.
Substructural surrogates for learning decomposable classification problems: i... (kknsastry)
This paper presents a learning methodology based on a substructural classification model to solve decomposable classification problems. The proposed method consists of three important components: (1) a structural model that represents salient interactions between attributes for given data, (2) a surrogate model that provides a functional approximation of the output as a function of the attributes, and (3) a classification model that predicts the class for new inputs. The structural model is used to infer the functional form of the surrogate, and its coefficients are estimated using linear regression methods. The classification model uses a maximally accurate, least complex surrogate to predict the output for given inputs. The structural model that yields an optimal classification model is found using an iterative greedy search heuristic. Results show that the proposed method successfully detects the interacting variables in hierarchical problems, groups them into linkage groups, and builds maximally accurate classification models. The initial results on non-trivial hierarchical test problems indicate that the proposed method holds promise and have also shed light on several improvements to enhance its capabilities.
The document describes a research informatics software system called Labmatrix. It allows users to track clinical and research data in a hierarchical and relational manner. The system facilitates secure collaborative data entry and easy ad-hoc data retrieval. It can capture detailed patient and specimen data, generate barcodes, track specimen storage locations and lineage. Users can update multiple records simultaneously and generate sample records using predefined workflows. Data can be imported from various sources and normalized. The system includes graphical tools for exploring and querying the data without requiring SQL or programming skills.
This document summarizes research on developing a computational model to identify pathological mutations in Fabry disease. The researchers built a neural network model using 7 properties of mutations including sequence, structure, and multiple sequence alignment data. They trained and tested the model on a dataset of 313 pathological and 59 neutral mutations, achieving a sensitivity of 0.85 and specificity of 0.92. The model outperformed general tools like PolyPhen-2 in predicting mutations for Fabry disease. Future work will focus on expanding the approach to additional genes and disease phenotypes.
This document describes a study aimed at creating a gold standard dataset of drug indications extracted from FDA drug labels. The researchers developed a semi-automatic method using natural language processing and expert annotation to identify drug-disease treatment relationships from 100 randomly selected drug labels. Two expert annotators worked independently to accept or reject automatically identified candidate indications, with their common judgments considered the gold standard. Results showed the experts achieved near-perfect joint precision and an average F1-measure of 0.95. Through iterative error analysis and guideline updates, agreement between the annotators improved from 76.2% initially to 93.9%, demonstrating the viability of the semi-automatic method for creating a structured, specific gold standard of drug indications from DailyMed drug labels.
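For reference, the F1-measure reported above is the harmonic mean of precision and recall; a quick sketch with made-up counts (not the study's actual tallies):

```python
def precision_recall_f1(tp, fp, fn):
    """Precision, recall, and F1 from true-positive, false-positive,
    and false-negative counts."""
    p = tp / (tp + fp)
    r = tp / (tp + fn)
    return p, r, 2 * p * r / (p + r)

# Hypothetical counts for accepted candidate indications.
p, r, f1 = precision_recall_f1(tp=90, fp=5, fn=10)
print(round(p, 3), round(r, 3), round(f1, 3))
```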
Crowdsourcing via Amazon Mechanical Turk was used to collect annotations for 5 natural language tasks: affect recognition, word similarity, recognizing textual entailment, event temporal ordering, and word sense disambiguation. The study found that while individual non-experts were less reliable than experts, aggregating responses from multiple non-experts could produce annotations comparable to expert ones. Specifically, annotations from 4 non-experts on average produced reliability similar to that of 1 expert for affect recognition, and classifiers trained on crowdsourced data performed comparably to those trained on expert annotations.
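The aggregation idea can be sketched as a simple majority vote over several non-expert labels per item. The items and labels below are illustrative, not the study's data:

```python
from collections import Counter

def aggregate(annotations):
    """Majority vote over per-item label lists; ties resolve to the
    label encountered first in the list."""
    return {item: Counter(labels).most_common(1)[0][0]
            for item, labels in annotations.items()}

# Illustrative: four non-expert labels per headline for an
# affect-recognition task.
votes = {
    "headline_1": ["joy", "joy", "surprise", "joy"],
    "headline_2": ["anger", "fear", "anger", "anger"],
}
print(aggregate(votes))
# prints {'headline_1': 'joy', 'headline_2': 'anger'}
```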
This document describes a study on matching conceptual models by comparing entities across different models. The study represented 20 conceptual models as structured tables with information on each entity's name, attributes, and relationships. It then generated a dataset comparing each pair of entities across models based on name, attribute, and relationship similarity metrics. Binary logistic regression was used to analyze how well each metric predicted if entities actually matched, finding that only name similarity was a significant predictor. The study aims to improve the similarity functions and classification approach to better match conceptual models.
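A name-similarity metric of the kind used as a feature above can be sketched as token-set Jaccard similarity. The entity names below are made up, and the study's exact similarity function may differ:

```python
def name_similarity(a, b):
    """Jaccard similarity between the lowercased token sets of two
    entity names: 1.0 means identical token sets, 0.0 means disjoint."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta and not tb:
        return 1.0
    return len(ta & tb) / len(ta | tb)

print(name_similarity("Customer Order", "Order"))           # 0.5
print(name_similarity("Customer Order", "Purchase Order"))  # 1/3
```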
The document introduces 3 database research projects at the Center for Women's Health Research:
1. Clinical Form Encoding seeks to standardize clinical terms on forms using clinical coding schemes like SNOMED CT.
2. EMR Error Detection aims to improve electronic medical record systems by more robustly integrating clinical guidelines to reduce data entry errors.
3. A Query and Data Extraction Tool is being developed to help researchers and providers more easily access and analyze the large amounts of patient data stored in databases.
This document outlines a marketing plan to promote the use of MSG in small amounts for street food vendors in Indonesia called "Abang Nasgor". It proposes using free stools labeled "ask for less msg" and encouraging customers to recommend vendors using less MSG. New MSG packaging will suggest appropriate amounts per portion. Celebrities will visit the most recommended vendors using less MSG to increase awareness. The plan aims to change negative perceptions of MSG and show that smaller amounts do not cause health issues.
The document discusses search interface understanding (SIU), which involves representing, parsing, segmenting, and evaluating search interfaces on the deep web. SIU is challenging because search interfaces are designed autonomously without standard structures. The document outlines the SIU process and key challenges, such as interfaces having no defined boundaries for segmenting semantically related components. Techniques for SIU include rules, heuristics, and machine learning.
This document summarizes a study on using Hidden Markov Models (HMMs) for search interface segmentation. The researchers applied a two-layered HMM approach, with the first layer tagging interface components with semantic labels and the second layer segmenting the interface. Their experiments showed domain-specific HMMs performed best on interfaces from the same domain, while cross-domain HMMs captured patterns across domains. The study contributed an effective probabilistic approach to interface segmentation and found appropriate training data is key to accurate segmentation across domains.
The document summarizes the results of a survey conducted by DHP Research about blogs and how to improve their blog. Key findings include:
- Peer reviewed papers, white papers, and case studies are important for readers to learn about issues affecting their day-to-day work.
- Blogs should be informative, interesting, accessible, and innovative. Readers are less interested in controversial content or in how many blogs are posted per week.
- Over 85% would read a 250 word blog, but only 14% would definitely read a 1000 word blog.
- Readers want blogs to discuss regulatory use of PROs, empowering patients, research methodologies, and quantitative data analysis methods.
- Based on the results,
8 things you should not do when selecting a PREM (Keith Meadows)
This document provides 8 tips for what not to do when selecting a patient-reported experience measure (PREM). It advises against using a PREM just because a colleague used it, or because it was developed by a colleague, without evidence of its reliability and validity. It also warns against using PREMs developed without patient input, for different patient groups, or without understanding how the information will be used. The document stresses the importance of properly developing and validating PREMs to ensure the right information is collected.
The Diabetes Health Profile - Development and applications (Keith Meadows)
This document summarizes information about diabetes in the UK and an assessment tool called the Diabetes Health Profile (DHP-18). It notes that as of 2011 there were 2.9 million diagnosed cases of diabetes in the UK, with 90% being type 2 diabetes. The DHP-18 is an 18-item questionnaire that measures psychological distress, barriers to activity, and disinhibited eating in people with diabetes. It has been psychometrically validated. The document discusses interpreting DHP-18 scores and the minimum important differences for each domain.
White paper: 5 things you need to know about patient reported outcome (PRO) ... (Keith Meadows)
1) The document discusses key factors to consider when selecting a patient-reported outcome measure (PROM) for use in a study.
2) It is important to select a PROM that is reliable, valid, and measures the specific outcomes of interest related to the disease and treatment.
3) The document provides guidance on the differences between generic and disease-specific PROMs and when each type may be most appropriate. It also discusses important psychometric properties like reliability and validity that should be considered.
The document discusses the benefits of exercise for mental health. Regular physical activity can help reduce anxiety and depression and improve mood and cognitive function. Exercise causes chemical changes in the brain that may help protect against mental illness and improve symptoms.
Oper Semangat: a campaign to gain Indonesian football supporters' optimist sp... (Faldi Dwi Wahyudi)
The document discusses a campaign to boost support and morale for Indonesia's national football team, which has struggled in recent years. Fans uploaded short videos expressing their enthusiastic support for individual players to address negativity from poor performance. Over 100 videos were compiled into a single video to show widespread continued support and be played at an upcoming match. The football association praised the effort and its goal of demonstrating that many fans still care about and never give up on the national team.
Our story of understanding of what it's like living with diabetes (Keith Meadows)
The Diabetes Health Profile (DHP) is a standardized measure of the psychological and behavioral impact of diabetes. It was initially developed in 1986 through patient interviews and literature reviews to create the 32-item DHP-1 conceptual framework. Between 1996-2000, this was refined into the 18-item DHP-18 framework through analysis of over 2000 patients. The DHP has since been used widely to measure diabetes outcomes and identify groups most at risk of psychological distress, with ongoing development and licensing through DHP Research and Isis Outcomes.
This document provides tips for selecting a patient reported outcome measure (PROM) for research studies. It recommends:
1. Formulating a clear hypothesis about what concept is being measured to identify the appropriate PROM.
2. Ensuring the PROM's content and items are relevant to the target patient population and condition.
3. Considering the PROM's acceptability, including length, time to complete, and easy-to-understand language and format.
4. Choosing a PROM that has been scientifically developed and validated to reliably measure the intended concept.
5. Understanding how to correctly analyze and interpret the PROM data and scores.
This document discusses using hidden Markov models to automatically discover the structure of clinical forms and annotate them with medical terminology. It presents a two-layer hidden Markov model approach to first assign tags like category and field to form elements, and then group related elements to identify form segments. The method was tested on 52 clinical forms and achieved over 95% accuracy in extracting the underlying structure of the forms in the form of trees. The ability to automatically understand form structure and annotate forms could enable more flexible design of electronic health records.
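The first-layer tagging step resembles standard Viterbi decoding over an HMM. A minimal sketch follows; the two-tag state set, observation alphabet, and probabilities are invented for illustration, not the paper's trained parameters:

```python
import math

# Hypothetical two-tag HMM over form elements: each element is either a
# section "category" or a data-entry "field".
states = ["category", "field"]
start = {"category": 0.7, "field": 0.3}
trans = {"category": {"category": 0.2, "field": 0.8},
         "field": {"category": 0.3, "field": 0.7}}
emit = {"category": {"header_text": 0.8, "input_box": 0.2},
        "field": {"header_text": 0.1, "input_box": 0.9}}

def viterbi(obs):
    """Most likely tag sequence for a sequence of observed element types."""
    V = [{s: math.log(start[s]) + math.log(emit[s][obs[0]]) for s in states}]
    back = []
    for o in obs[1:]:
        row, ptr = {}, {}
        for s in states:
            prev, score = max(
                ((p, V[-1][p] + math.log(trans[p][s])) for p in states),
                key=lambda x: x[1])
            row[s] = score + math.log(emit[s][o])
            ptr[s] = prev
        V.append(row)
        back.append(ptr)
    last = max(V[-1], key=V[-1].get)
    path = [last]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    return list(reversed(path))

print(viterbi(["header_text", "input_box", "input_box"]))
# prints ['category', 'field', 'field']
```

The paper's second layer would then group contiguous tagged elements into segments; this sketch covers only the tagging layer.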
This document outlines a challenge to increase appreciation of local Indonesian products. It proposes convincing people that supporting local products shows nationalism. Currently, local products have an image of being cheap and low quality. The solution is to give people a unique way to appreciate products by determining their own price. Paying the chosen price would help fund better materials and design research to improve quality. Higher prices would contribute more to quality improvements. Links on price tags would direct people to videos of local craftspeople. Products would be displayed at supermarket cashiers.
Mike Thelwall is a professor known for his research in the field of webometrics. He received his PhD in mathematics and leads the Statistical Cybermetrics Research Group. Webometrics involves the quantitative analysis of web phenomena such as link analysis, search engine evaluation, and web citation analysis. Thelwall's research has explored using webometrics to study the dissemination of scholarly research and evaluate universities. He has emphasized the need for conceptual frameworks and methodologies to interpret webometrics results and address challenges like the size and changing nature of the web.
Clinicians rely on health information technologies (HITs) for clinical data collection, but current HITs are inflexible and inconsistent with clinicians' needs. The researchers propose a flexible electronic health record (fEHR) system to allow clinicians to easily modify the system based on their changing data collection needs. The fEHR uses a form-based interface for clinicians to design forms, generates a corresponding form tree structure, and designs a high-quality database from the tree. A user study with 5 nurses found they could effectively replicate needs in the system and their efficiency and understanding improved over two rounds of tasks of increasing complexity. The researchers conclude the fEHR has potential to reduce HIT problems and that the database design
This document summarizes a study on the remote mentoring program called MAGIC (Get More Active Girls in Computing). MAGIC aims to increase female participation in STEM fields through one-on-one remote mentoring matches between young girls and women professionals in technology careers. The study analyzed data from MAGIC's first 5 years, finding that remote mentoring increased STEM skills, self-confidence, and career awareness for many mentees. However, challenges included maintaining mentor and mentee commitment over time. The study concludes that remote mentoring shows promise for improving gender diversity in STEM, but more data is needed to better understand impacts and how to address challenges.
A brief introduction to SNOMED CT - the ontology-based medical terminology. It covers the basic definitions, the differences between SNOMED CT and ICD-9, post-coordination use cases, and some general information.
This is not an extensive guide for SNOMED CT adoption in a system
This document discusses the origins and development of the LOINC Clinical Document Ontology (CDO), which provides a standardized terminology for clinical document names. It describes how the CDO was created based on empirical analysis of over 2000 local document names. The CDO uses a multi-axial model with domains like subject matter, role, setting, type of service, and kind of document. Iterative evaluations found the expanded CDO better mapped local names than the original. Ongoing work involves adding new content and harmonizing with other clinical terminologies.
This document summarizes a study on the remote mentoring program called MAGIC (Get More Active Girls in Computing). MAGIC aims to increase female participation in STEM fields through one-on-one remote mentoring matches between young girls and women professionals in technology careers. The study analyzed data from MAGIC's first 5 years, finding that remote mentoring increased STEM skills, self-confidence, and career awareness for many mentees. However, challenges included maintaining mentor and mentee commitment over time. The study concludes that remote mentoring shows promise for improving gender diversity in STEM, but more data is needed to better understand impacts and how to address challenges.
A brief introduction to SNOMED CT - the ontology based medical terminology. This covers the basic definitions, the difference between SNOMED CT and ICD9, Post co-ordination use-cases and some general information.
This is not an extensive guide for SNOMED CT adoption in a system
This document discusses the origins and development of the LOINC Clinical Document Ontology (CDO), which provides a standardized terminology for clinical document names. It describes how the CDO was created based on empirical analysis of over 2000 local document names. The CDO uses a multi-axial model with domains like subject matter, role, setting, type of service, and kind of document. Iterative evaluations found the expanded CDO better mapped local names than the original. Ongoing work involves adding new content and harmonizing with other clinical terminologies.
Pasi Leino :: Using XML standards for system integration (george.james)
The document discusses HL7, an international standard for exchanging healthcare information. Some key points:
- HL7 is a non-profit organization that has been developing standards since 1987 to enable interoperability between healthcare systems.
- The standards cover different levels of interoperability from process to semantic to operational data exchange.
- Common HL7 standards include messages (HL7 v2.x, v3), clinical documents (CDA), terminology (vocabulary bindings), and application programming interfaces.
- HL7 is widely adopted, with a 2000 study finding 80% of large US hospitals using it. Finland has also adopted HL7 standards nationally.
Anne Casey RN MSc FRCN
Editor, Paediatric Nursing
Royal College of Nursing Adviser on Information Standards
Clinical Domain Lead, NHS Information Standards Board for Health and Social Care
(15/10/08, SNOMED Workshop)
An overview of the i2b2 clinical research platform, and the implications of connecting Indivo to i2b2 as a source of patient-reported outcomes. Presented at the 2012 Indivo X Users' Conference.
By Shawn Murphy MD, Ph.D., Partners Healthcare.
SNOMED CT is a clinical terminology used for coding, retrieving, and analyzing health care data. It consists of codes, terms, and relationships that can precisely record and represent clinical information across health care. SNOMED CT concepts are organized into hierarchies and linked through relationships. It aims to enable automated clinical decision support and research by structuring information in a semantically meaningful way.
The document discusses artificial intelligence and pattern recognition. It introduces various pattern recognition concepts including defining a pattern, examples of patterns in different domains, and approaches to pattern recognition. It also provides an example of using discriminative methods to classify fish into salmon and sea bass using optical sensing and extracted features.
This study used representational similarity analysis (RSA) to assess categorical representations in the right fusiform face area (FFA) and parahippocampal place area (PPA) using fMRI. Participants completed a multi-category localizer task and conceptual classification task while brain images were collected. RSA showed that the right FFA robustly differentiated faces from other categories and also distinguished animals. The right PPA robustly differentiated scenes from other categories and also distinguished landmarks. These results demonstrate that RSA can reveal categorical representations in visual cortex.
Visual Analytics for Healthcare - Panel at AMIA 2012 in Chicago (Adam Perer)
AMIA 2012 Panel on Visual Analytics for Healthcare
Organizer:
Adam Perer, PhD
Research Scientist
IBM T.J. Watson Research Center, Hawthorne, NY
Panelists:
Ben Shneiderman, PhD
Professor, Computer Science
University of Maryland, College Park, MD
Yuval Shahar, PhD
Professor, Head of the Medical Informatics Research Center
Ben Gurion University, Beer Sheva, Israel
Jeffrey Heer, PhD
Assistant Professor, Computer Science
Stanford University, Stanford, CA
David Gotz, PhD
Research Scientist
IBM T.J. Watson Research Center, Hawthorne, NY
Abstract
With the proliferation of medical information technology, users at all levels of the healthcare system have access to more data than ever before [6]. This data can be of tremendous value but is often difficult to access and interpret. For example, clinicians are often faced with the challenging task of analyzing large amounts of unstructured, multi-modal, and longitudinal data to effectively diagnose and monitor the progression of a patient's disease [4,5]. Similarly, patients are confronted with the difficult task of understanding the trends and correlations within data related to their own health. At the institutional level, healthcare organizations are faced with the desire to use data to improve overall operational efficiency and performance, while continuing to maintain the quality of patient care and safety.
Recent advances in visualization and visual analytics have the potential to help each of the user groups listed above do more with the often overwhelming amount of data available to them [1,3,7,8]. However, to be successful, visualization designers and clinicians must work together closely to ensure that the right technologies are used to help address the meaningful problems. Unfortunately, despite the continuous use of scientific visualization and visual analytics in medical applications, the lack of communication between engineers and physicians has meant that only basic visualization and analytics techniques are currently employed in clinical practice [2,9].
The goal of this panel is to present state-of-the-art visualization applications for healthcare and engage the leading physicians and clinical researchers at AMIA to discuss the areas in healthcare where additional visualization techniques are most needed.
This document summarizes a presentation on the clinical document ontology (CDO) developed by LOINC. It describes the origins and development of having a standardized vocabulary for clinical document names, including empirical analysis of local document names. The presentation reviews the multi-axial model used by LOINC for document names, provides examples, and discusses ongoing evaluation and expansion efforts through collaboration. Future directions include further harmonization of CDO terms and analyzing document content.
A crucial aspect of search applications is the possibility to identify named entities in free-form text and provide functionality for entity-based, complex queries towards the indexed data. By enriching each entity with semantically relevant information acquired from outside the text, one can create the foundation for an advanced search application. Thus, given a document about Denmark, where neither of the words Copenhagen, country, nor capital are mentioned, it should be possible to retrieve the document by querying for Copenhagen or European country.
In this paper, we report how we have tackled this problem. We will, however, concentrate only on the two tasks which are central to the solution, namely named entity recognition (NER) and enrichment of the discovered entities by relying on linked data from knowledge bases such as YAGO2 and DBpedia. We remain agnostic to all other details of the search application, which can be implemented in a relatively straightforward way by using, e.g., Apache Solr.
The work deals only with Swedish and is restricted to two domains: news articles and medical texts. As a byproduct, our method achieves state-of-the-art results for Swedish NER and to our knowledge there are no previously published works on employing linked data for Swedish for the two domains at hand.
This document discusses ontologies, including what they are, why they are used, and how to create them. Ontologies allow concepts and terms to be standardized and mapped to each other, which facilitates tasks like search, coding, and information retrieval in domains like healthcare. Large medical ontologies like SNOMED-CT help resolve ambiguities and support activities like clinical decision support. Well-designed ontologies are important for representing real-world knowledge in computer systems and managing concepts in health informatics applications.
The document discusses the origins and ongoing development of a document ontology within LOINC and HL7. It describes how the Clinical Document Ontology (CDO) provides consistent semantics for clinical document names to enable interoperability. The CDO uses a multi-axial model with domains like subject matter, role, setting, type of service, and kind of document. Iterative evaluations have helped expand and refine the CDO. Future work includes further harmonization and expanding the model to new document types.
This document discusses ontologies and user needs in publishing. It begins with introductions and definitions of key terms like vocabularies, taxonomies, and ontologies. It then covers semantic markup in publishing and ties these concepts to user needs by focusing on use cases. The document also briefly discusses the semantic web and implications for search by enabling more precise searching through semantic tagging and metadata. Overall, it emphasizes the importance of use cases in driving strategies around ontologies and semantic enrichment of content in publishing.
Clinical Clarity versus Terminological Order - The Readiness of SNOMED CT Concept Descriptors for Primary Care (henryhezhe2003)
"Clinical Clarity versus Terminological Order - The Readiness of SNOMED CT Concept Descriptors for Primary Care" presented in CIKM 2012 Workshop MIXHS 12 (the Second International Workshop on Managing Interoperability and Complexity in Health Systems)
SLE 2012 Keynote: Cognitive and Social Challenges of Ontology Use in the Biom... (Margaret-Anne Storey)
ABSTRACT: Ontologies can provide a conceptualization of a domain leading to a common vocabulary for communities of researchers and important standards to facilitate computation, software interoperability and data reuse. Most successful ontologies, especially those that have been developed by diverse communities over long periods of time, are typically large and complex. To address this complexity, ontology authoring and browsing tools must provide cognitive support to improve comprehension of the many concepts and relationships in ontologies. Also, ontology tools must support collaboration as the heart of ontology design and use is centered on community consensus.
In this talk, I will describe how standardized ontologies are developed and used in the biomedical and clinical domains to aid in scientific and medical discoveries. Specifically, I will present how the US National Center for Biomedical Ontology has designed the BioPortal ontology library (and associated technologies) to promote the use of standardized ontologies and tools. I will review how BioPortal and other ontology tools use established and novel visualization and collaboration approaches to improve ontology authoring and data curation activities. I will also discuss an ambitious project by the World Health Organization that leverages the use of social media to broaden participation in the development of the next version of the International Classification of Diseases. To conclude, I will discuss the challenges and opportunities that arise from using ontologies to bridge communities that manage and curate important information resources.
The document proposes a platform that matches patients from online health communities to relevant medical research projects, by developing rich semantic profiles of both patients and projects. It analyzes patient conversations to extract medical conditions, medications, and demographics to create patient profiles. It also analyzes research project descriptions to create profiles. These profiles are then matched using semantic similarity algorithms to find relevant patients for projects. The platform was prototyped and shown to accurately match patients to projects with similar medical conditions.
This presentation discusses molecular similarity searching methods for drug discovery. It begins with an introduction to cheminformatics and the principle that structurally similar molecules tend to have similar biological properties. The document then covers molecular representations, methods for calculating similarity coefficients between molecules, and a probabilistic model for similarity searching. It proposes a contribution called the Molecular Dynamic Clustering method that uses molecular dynamics simulations and classification algorithms to better assess molecular similarity.
Exploiting Semantic Structure for Mapping User-specified Form Terms to SNOMED CT Concepts
1. Exploiting Semantic Structure for Mapping User-specified Form Terms to SNOMED CT Concepts
Ritu Khare (1,2), Yuan An (1), Jiexun Li (1), Il-Yeol Song (1), Xiaohua Hu (1)
(1) The iSchool at Drexel, (2) College of Medicine
Drexel University, Philadelphia, PA, USA
2. Presentation Order
1. Motivation
2. Problems
3. Solutions
4. Evaluation
5. Final Remarks
3. General Motivation
Database Integration and Interoperability
Semantic heterogeneity across clinical data sources (Halevy, 2005; Henry et al., 1993; Hernandez et al., 2005; Wright et al., 1999)
[Figure: the same concepts named differently across sources, e.g. MRN / Med Rec # / Medical Record Number; Diastolic and Systolic Blood Pressure / BP; Physical Status / Constitutional / Vital Signs]
Recommendation: Controlled medical vocabularies should be involved in the design artifacts of healthcare systems. (Jean et al., 2007; Sugumaran and Storey, 2002)
4. Specific Motivation
Clinical Encounter Form -> Electronic Health Record (EHR)
The terms on clinical forms are mapped to, or annotated by, a standard terminology.
Domain experts may perform the annotation manually, but this is costly and tedious.
Research Objective: Design an automatic tool for mapping form terms to standard terminologies.
5. 1. Motivation
2. Problem
3. Solutions
4. Evaluation
5. Final Remarks
6. The Mapping Problem
Clinical Encounter Form -> SNOMED CT
SNOMED CT: the Systematized Nomenclature of Medicine - Clinical Terms (International Health Terminology Standards Development Organisation).
The most comprehensive clinical vocabulary (SNOMED CT User Guide, 2009), with >360,000 logically defined clinical concepts (Hina et al., 2010; Stenzhorn et al., 2009).

Form Term | SNOMED CT Concept
Patient | 11615400: Patient (person)
MRN | 398225001: Medical record number (observable entity)
8. Existing Mapping Services
SNOMED CT browsers (Rogers and Bodenreider, 2008):
General Mapping
Category-Specific Mapping
9. Challenges: Mapping Form Terms to SNOMED CT Concepts
Diversity Challenge: different clinicians use different terms (e.g., MRN / Med. Rec. #; Vital Signs / Constitutional / Physical Status).
Context Challenge: the same form term can map to different concepts.
10. 1. Motivation
2. Problem
3. Solution
4. Evaluation
5. Final Remarks
11. Premises
1. The key is to identify the SNOMED CT semantic category appropriate for a given term.
2. The first result (i.e., the most string-similar one) retrieved by the category-specific mapping is usually the desired concept.
Question: How can the SNOMED CT semantic category appropriate for a given form term be determined automatically?
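The second premise, that the most string-similar retrieved concept is usually the desired one, can be sketched as a simple ranking step. This is an illustrative sketch using Python's difflib, not the actual SNOMED CT API matcher, and the candidate concept names are hypothetical examples.

```python
import difflib

def rank_candidates(form_term, candidate_names):
    """Rank candidate concept names by string similarity to the form term."""
    scored = [(difflib.SequenceMatcher(None, form_term.lower(), name.lower()).ratio(), name)
              for name in candidate_names]
    return [name for score, name in sorted(scored, reverse=True)]

# Hypothetical candidates returned by a category-specific lookup:
candidates = ["Medical record number (observable entity)",
              "Medication record (record artifact)",
              "Patient (person)"]
print(rank_candidates("MRN medical record number", candidates)[0])
# -> Medical record number (observable entity)
```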
12. Premise 1: The term context can be derived from the SEMANTIC STRUCTURE of the form.
The FORM TREE accurately captures the semantic intentions of the designer.
Inspired by hierarchical modeling of forms (Dragut et al., 2009; Wu et al., 2009).
13. Premise 2: The implicit relationship between the term context (i.e., the semantic structure) and the desired semantic category can be formally captured in a STATISTICAL MODEL.
Model: a Naïve Bayes Classifier, based on the Bayes theorem (Han and Kamber, 2006).
Class labels: SNOMED CT semantic categories (attribute, body structure, disorder, ...).
Data attributes (local structure): node type, parent node type, child node type, parent semantic category, grandparent semantic category.
[Figure: example form tree (e.g., root -> Patient, Examination; Patient -> Name, Gender; Gender -> M, F; Examination -> Respiratory) with nodes annotated by SNOMED CT categories such as Person, Procedure, Observable Entity, Finding, and Qualifier Value.]
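The classification idea can be illustrated with a minimal hand-rolled Naïve Bayes over categorical structural attributes with Laplace smoothing. The attribute values and training examples below are invented for illustration; they are not the paper's dataset.

```python
import math
from collections import Counter, defaultdict

# Each training example: (structural attributes, SNOMED CT semantic category).
# Attribute values are illustrative stand-ins for the paper's form-tree features.
train = [
    ({"node": "field", "parent": "category", "parent_cat": "person"}, "observable entity"),
    ({"node": "field", "parent": "category", "parent_cat": "procedure"}, "observable entity"),
    ({"node": "value", "parent": "field", "parent_cat": "observable entity"}, "qualifier value"),
    ({"node": "value", "parent": "field", "parent_cat": "observable entity"}, "qualifier value"),
    ({"node": "category", "parent": "root", "parent_cat": "none"}, "person"),
]

def fit(examples):
    class_counts = Counter(cls for _, cls in examples)
    attr_counts = defaultdict(Counter)   # (class, attr) -> Counter of values
    values = defaultdict(set)            # attr -> set of observed values
    for attrs, cls in examples:
        for a, v in attrs.items():
            attr_counts[(cls, a)][v] += 1
            values[a].add(v)
    return class_counts, attr_counts, values

def predict(model, attrs):
    class_counts, attr_counts, values = model
    total = sum(class_counts.values())
    best, best_lp = None, -math.inf
    for cls, n in class_counts.items():
        lp = math.log(n / total)  # log prior
        for a, v in attrs.items():
            # Laplace-smoothed conditional probability of this attribute value
            lp += math.log((attr_counts[(cls, a)][v] + 1) / (n + len(values[a]) + 1))
        if lp > best_lp:
            best, best_lp = cls, lp
    return best

model = fit(train)
print(predict(model, {"node": "value", "parent": "field", "parent_cat": "observable entity"}))
# -> qualifier value
```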
14. Overall Mapping Approach
Pipeline: Form Term -> Structure Analyzer (builds the Form Tree and extracts node attributes) -> Classification Model (trained on training data) -> category membership probabilities -> Semantic Category Picker -> SNOMED CT Category-Specific Mapping -> SNOMED CT Concept.
[Figure: example form tree with nodes such as root, Patient, Examination, Name, Gender, and Respiratory, annotated with categories like Person and Observable Entity.]
Novelty: a Hybrid Approach that leverages the semantic structure as well as the term linguistics.
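The overall flow can be sketched as a chain of components. Every component here is a stub standing in for the paper's modules (structure analyzer, trained classifier, SNOMED CT category-specific lookup); only the MRN concept code is taken from the slides' own example, and all other names are hypothetical.

```python
# A minimal sketch of the hybrid pipeline; all components are stubs.

def analyze_structure(term, form_tree):
    """Derive structural attributes for a term from its form-tree node (stub)."""
    node = form_tree[term]
    return {"node": node["type"], "parent": node["parent_type"]}

def category_probabilities(attrs):
    """Stand-in for the trained classifier's category membership probabilities."""
    if attrs["parent"] == "root":
        return {"person": 0.7, "observable entity": 0.3}
    return {"observable entity": 0.8, "qualifier value": 0.2}

def category_specific_mapping(term, category, index):
    """Look up the term in a (hypothetical) category-specific concept index."""
    return index.get((term.lower(), category))

form_tree = {"MRN": {"type": "field", "parent_type": "category"}}
index = {("mrn", "observable entity"): "398225001: Medical record number (observable entity)"}

def map_term(term):
    attrs = analyze_structure(term, form_tree)
    probs = category_probabilities(attrs)
    # Pick the most probable category, then map within that category only.
    category = max(probs, key=probs.get)
    return category_specific_mapping(term, category, index)

print(map_term("MRN"))  # -> 398225001: Medical record number (observable entity)
```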
15. 1. Motivation
2. Problem
3. Solution
4. Evaluation
5. Final Remarks
16. Data
26 forms with 1501 terms in total; 954 terms (63.55%) have manual (gold) annotations.

Dataset | Forms | Total Terms
1. Walk-in clinic encounter forms | 3 | 161
2. Nursing patient admission forms | 6 | 261
3. Labor & delivery DB data-entry forms | 7 | 294
4. Adult visit encounter forms | 5 | 388
5. Child visit encounter forms | 5 | 397

Example gold annotations:
Term | Concept ID
Patient | 11615400: Patient (person)
MRN | 398225001: Medical record number (observable entity)

Some unmapped terms: no scleral icterus; chronic back pain; Follow up with PCP; Sent to ER.
17. Implementation (Java) and Settings
Form design interface: API provided by Dataline Software Limited.
[Figure: the mapping pipeline of slide 14 (Form Tree -> Form Structure Analyzer -> Node Category Attributes -> Classification Model -> Category Membership Probabilities -> Semantic Category Picker -> SNOMED CT Category -> Specific Concept Mapping), with Gold Annotations supplying the Training Data.]
Cross validation: leave-one-out, for each dataset.
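The leave-one-out setting can be sketched generically: for each labeled term, a model is trained on the remaining examples of the dataset and evaluated on the held-out one. The trainer interface and the toy exact-match model in the test are hypothetical, not the authors' code.

```java
import java.util.*;
import java.util.function.BiFunction;
import java.util.function.Function;

/**
 * Generic leave-one-out evaluation: train on all-but-one example, test on
 * the held-out one, and report the fraction predicted correctly.
 */
public class LeaveOneOut {
    public static <X, Y> double accuracy(List<X> xs, List<Y> ys,
            BiFunction<List<X>, List<Y>, Function<X, Y>> trainer) {
        int correct = 0;
        for (int i = 0; i < xs.size(); i++) {
            List<X> trainX = new ArrayList<>(xs);
            List<Y> trainY = new ArrayList<>(ys);
            trainX.remove(i);          // hold out example i
            trainY.remove(i);
            Function<X, Y> model = trainer.apply(trainX, trainY);
            if (model.apply(xs.get(i)).equals(ys.get(i))) correct++;
        }
        return (double) correct / xs.size();
    }
}
```

With only 26 forms, leave-one-out makes the most of the limited gold annotations, at the cost of training one model per example.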
18. Experiment Design
Goal: to study whether semantic structure can improve mapping performance.
Measures:
Precision = # correct annotations / # annotations
Recall = # correct annotations / # gold annotations
Baseline (linguistics only): Form Term -> SNOMED CT General Mapping -> SNOMED CT Concept.
Hybrid (linguistics + semantic structure): Form Structure Analyzer -> Node Category Attributes -> Classification Model -> Category Membership Probabilities -> Semantic Category Picker -> SNOMED CT Category -> Specific Concept Mapping.
Hybrid++: the Hybrid pipeline with candidate-set expansion in the Semantic Category Picker.
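The two measures defined above can be computed directly from the produced and gold annotation sets. The map-based representation (term to concept ID) is an assumption for illustration; the concept IDs in the test are made up.

```java
import java.util.*;

/**
 * Evaluation measures from the experiment design:
 *   precision = # correct annotations / # annotations produced
 *   recall    = # correct annotations / # gold annotations
 */
public class MappingMeasures {
    public static double[] precisionRecall(Map<String, String> predicted,
                                           Map<String, String> gold) {
        int correct = 0;
        for (Map.Entry<String, String> e : predicted.entrySet())
            if (e.getValue().equals(gold.get(e.getKey()))) correct++;
        double precision = predicted.isEmpty() ? 0.0 : (double) correct / predicted.size();
        double recall = gold.isEmpty() ? 0.0 : (double) correct / gold.size();
        return new double[]{precision, recall};
    }
}
```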
19. Results
Mapping duration: 1-11 s per form.
Baseline: Precision 0.63, Recall 0.45. Recall is low because the SNOMED CT API uses exact string matching and could not handle the variation of terms, i.e., the diversity challenge.
Baseline to Hybrid: Precision improves by 18%.
Hybrid to Hybrid++: Precision improves by 16%, Recall by 23%.
Hybrid++: Precision 0.86, Recall 0.55.
20. More Results
Term processing component:
removes special characters (-, #, /, etc.)
acronym expansion via dictionary: T (Temperature), BTL (Bilateral Tubal Ligation), VTE (Venous Thromboembolism)
Precision only slightly improved (3-5%); Recall improved majorly (25%).
Final: Precision = 0.89, Recall = 0.76.
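The term-processing component above can be sketched as a small normalizer. The dictionary entries come from the slide's examples (with "Litigation" corrected to "Ligation"); the exact cleaning rules are an assumption, not the authors' implementation.

```java
import java.util.*;

/**
 * Sketch of term preprocessing before SNOMED CT matching:
 * strip special characters (-, #, /, ...) and expand acronyms
 * via a dictionary lookup on each token.
 */
public class TermProcessor {
    // Illustrative dictionary built from the slide's examples.
    private static final Map<String, String> ACRONYMS = Map.of(
        "T", "Temperature",
        "BTL", "Bilateral Tubal Ligation",
        "VTE", "Venous Thromboembolism");

    public static String normalize(String term) {
        // Replace non-alphanumeric characters with spaces, collapse whitespace.
        String cleaned = term.replaceAll("[^A-Za-z0-9 ]", " ")
                             .trim()
                             .replaceAll("\\s+", " ");
        StringBuilder out = new StringBuilder();
        for (String tok : cleaned.split(" ")) {
            if (out.length() > 0) out.append(' ');
            out.append(ACRONYMS.getOrDefault(tok, tok));
        }
        return out.toString();
    }
}
```

For example, `normalize("VTE")` yields "Venous Thromboembolism", which exact-string matching against SNOMED CT can then find; this is why recall improves much more than precision.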
21. Implications
Impact of semantic structure: improves overall mapping performance; more correct predictions (context challenge).
Impact of linguistics: mainly on recall; reaches more relevant terms (diversity challenge).
Overall: promising performance, even with limited training data. Recall is low because of the simplicity of the linguistic techniques and can be further improved using more sophisticated techniques.
22. 1. Motivation
2. Problem
3. Solution
4. Evaluation
5. Final Remarks
23. Contributions
PROBLEM: NEW problem of standardizing the terms on clinical encounter forms using SNOMED CT.
Existing works (Henry et al. 1993, Barrows Jr. et al. 1994, Patrick et al. 2007) target standardization of clinical notes: diagnoses, medication information, patient complaints, etc.
SOLUTION: context-based method that leverages the SEMANTIC STRUCTURE of forms along with term linguistics.
Existing works rely on linguistic techniques alone (synonyms, morphemes, lexical variants).
24. Contributions
EVALUATION: 26 healthcare forms containing 950+ mappable terms specified by multiple clinicians.
Improvement over existing services: 23% in precision, 38% in recall.
Promising performance: precision 0.89, recall 0.76.
FINDINGS:
Linguistics helps overcome the diversity challenge and improves recall.
Semantic structure helps overcome the context challenge and improves precision and recall.
Designing synergistic hybrid approaches addresses all mapping challenges and achieves superior performance.
25. Limitations
TECHNIQUE: post-coordinated mapping; handling missing and inapplicable values in training data.
TECHNICAL EVALUATION: compare with other models: Bayesian networks, k-NN, neural networks, classification association rules.
STUDY: domain-expert annotators; test the validity of assumptions (class-conditional independence, correctness of the best linguistic-matching concept, choice of classification attributes); compare/combine with other UMLS terminologies.
26. Future Directions
Fully explore SNOMED CT: defining relationships.
Customize for form categories: encounter, regular visit, ...
Larger knowledge base for training datasets.
In larger frameworks, does annotation help improve: data/database integration? data quality? patient diagnosis? user interventions?
Work in progress: integrate with a flexible Electronic Health Record system (IHI 2010); integration of new forms in EHR; improve the database integration process.
(In other words, we could say that existing systems are certainly not designed with future integration in mind.)
Who designed the forms? Why not other domains, and which other domains? Possible; we have some ideas. Mark the concepts: post-coordinated or partial mapping.
Why does recall decrease? The number of correct predictions can decrease on applying the hybrid method; sometimes the linguistic approach returns the more accurate result. Larger improvements in recall and precision mean the forms contained terms with multiple senses in SNOMED CT.
Our experience of tagging 52 data-entry forms suggests that the training samples can be constructed quickly and easily, as compared to the construction of an exhaustive set of rules or heuristics. To further test the performance of the mapping framework in a heterogeneous environment, ...