This is my presentation at the DMAH workshop, held in conjunction with VLDB'17. It describes the work I did during my stay at Emory BMI.
More information: https://kkpradeeban.blogspot.com/2017/08/on-demand-service-based-big-data.html
On-Demand Service-Based Big Data Integration: Optimized for Research Collaboration
1. 1/23
Pradeeban Kathiravelu¹,², Yiru Chen³, Ashish Sharma⁴, Helena Galhardas¹, Peter Van Roy², Luís Veiga¹
On-Demand Service-Based Big Data Integration: Optimized for Research Collaboration
The 3rd International Workshop on Data Management and Analytics for Medicine and Healthcare (DMAH), in conjunction with the 43rd International Conference on Very Large Data Bases.
Munich, Germany. September 1, 2017.
¹ INESC-ID / Instituto Superior Técnico, Universidade de Lisboa, Lisboa, Portugal
² Université catholique de Louvain, Louvain-la-Neuve, Belgium
³ Peking University, Beijing, China
⁴ Department of Biomedical Informatics, Emory University, Atlanta, USA
2. 2/23
Introduction
● Scale and diversity of big data are rising.
– Geographically distributed data of exabytes.
– Structured, semi-structured, unstructured, or ill-formed data.
● Integration of data is crucial for data science.
● Sharing of integrated data and results.
– Mandatory for reproducible research.
3. 3/23
Challenges in Medical Research for Big Data Integration
● Multiple types of data.
– Imaging, clinical, and genomic.
● Numerous data sources.
– No shared messaging protocol.
● Do we really need to integrate all the data?
4. 4/23
A Story of Medical Data Researchers...
5. 5/23
● Jim is interested in the effects of a medicine to treat brain tumors in patients of certain age groups.
6. 6/23
Observation - 1
● Various sources.
– Service-based data access through APIs.
● Thanks to specifications such as HL7 FHIR.
● The researchers possess domain knowledge.
● Integrate On-Demand.
– Avoid eager loading of binary data or its textual metadata.
– Use the researcher query as an input in loading data.
● Scalable storage in-house.
– Potential to load, integrate, index, and query unstructured data.
10. 10/23
Observation - 3
● Do not duplicate data!
– We "own" our interest; not the data.
● Point to the data in the data sources.
– Pointers to data like Dropbox Shared Links work well.
● Avoids outdated duplicate data.
● Easy to maintain.
● APIs – Access the list of research data sets.
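The pointer idea on this slide can be illustrated with a minimal Python sketch. This is not the Óbidos implementation: the class names `DataPointer` and `InterestSet`, and the `data.example.org` URLs, are hypothetical and chosen only for illustration. The point is that a researcher's data set of interest holds references back to the data source, so nothing is copied and nothing can go stale.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class DataPointer:
    """Hypothetical pointer: records where the data lives, not the data."""
    source: str   # base URL of the data source (illustrative value)
    item_id: str  # identifier of the item inside that source

    def url(self) -> str:
        return f"{self.source.rstrip('/')}/{self.item_id}"


@dataclass
class InterestSet:
    """A researcher's data set of interest, kept as pointers only."""
    name: str
    pointers: set = field(default_factory=set)

    def add(self, pointer: DataPointer) -> None:
        # Set semantics: adding the same pointer twice stores it once.
        self.pointers.add(pointer)

    def urls(self) -> list:
        return sorted(p.url() for p in self.pointers)


interest = InterestSet("brain-tumor-study")
interest.add(DataPointer("https://data.example.org/series", "series-001"))
interest.add(DataPointer("https://data.example.org/series", "series-001"))
interest.add(DataPointer("https://data.example.org/series", "series-002"))
print(interest.urls())
```

Sharing `interest.urls()` hands a collaborator the list of data sets without moving any image or clinical data.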
11. 11/23
Problems
● How to...
– Load data from several service-based big data sources.
● Avoid duplicate downloads and near duplicate data.
– Integrate disparate data and persist for future accesses.
– Share pointers to data internally and externally.
12. 12/23
Óbidos
On-demand Big Data Integration, Distribution, and Orchestration System
● Researcher query → Narrow down the search space.
● Define subsets of data that are of interest.
– Exploiting the well-defined hierarchical structure of medical data.
● Medical Images (DICOM)
● Clinical data
● ...
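As a rough illustration of how a hierarchy narrows the search space, the Python sketch below walks a toy catalogue laid out along the DICOM levels patient → study → series. The catalogue contents, the age filter, and the function name `select_series` are assumptions made for this example; they are not taken from Óbidos.

```python
# Toy metadata catalogue following the DICOM hierarchy
# (patient -> study -> series). All identifiers and ages are made up.
CATALOGUE = {
    "patient-A": {"age": 54, "studies": {"study-1": ["series-1", "series-2"]}},
    "patient-B": {"age": 31, "studies": {"study-2": ["series-3"]}},
    "patient-C": {"age": 60, "studies": {"study-3": ["series-4"],
                                         "study-4": ["series-5"]}},
}


def select_series(catalogue, min_age, max_age):
    """Return (patient, study, series) triples for patients in an age range."""
    selected = []
    for patient, record in sorted(catalogue.items()):
        if not (min_age <= record["age"] <= max_age):
            continue  # prune the whole subtree: its studies are never visited
        for study, series_list in sorted(record["studies"].items()):
            for series in series_list:
                selected.append((patient, study, series))
    return selected


subset = select_series(CATALOGUE, 50, 65)
for triple in subset:
    print(triple)
```

Because the filter is applied at the patient level, every study and series below a non-matching patient is skipped without being inspected.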
13. 13/23
Óbidos Approach
● Hybrid of virtual and materialized data integration approaches.
– Lazy load of metadata: Load the matching subset of metadata.
– Store integrated data and query results → scalable storage.
● Track already loaded data.
– Near duplicate detection.
– Download only updates (changesets).
● Efficient SQL queries on NoSQL storage.
● Share pointers to the datasets rather than the datasets themselves.
● Generic design; implementation for medical research data.
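The "track already loaded data" and "download only updates" points can be sketched in Python as follows. This is an assumption-laden illustration rather than the Óbidos code: the `LoadTracker` class, the integer version numbers, and the `fetch` callback are hypothetical, and the sketch only compares exact versions; it does not attempt near duplicate detection.

```python
class LoadTracker:
    """Remembers which items were loaded, and at which version."""

    def __init__(self):
        self.loaded = {}  # item id -> version last loaded

    def changeset(self, remote):
        """Items in `remote` (id -> version) that are new or updated."""
        return {
            item: version
            for item, version in remote.items()
            if self.loaded.get(item) != version
        }

    def load(self, remote, fetch):
        """Fetch only the changeset and record what was loaded."""
        pending = self.changeset(remote)
        for item, version in pending.items():
            fetch(item)
            self.loaded[item] = version
        return sorted(pending)


fetched = []  # stands in for real downloads
tracker = LoadTracker()
first = tracker.load({"series-1": 1, "series-2": 1}, fetched.append)
second = tracker.load(
    {"series-1": 1, "series-2": 2, "series-3": 1}, fetched.append
)
print(first, second)
```

On the second call, `series-1` is unchanged and is not fetched again; only the updated `series-2` and the new `series-3` are downloaded.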
15. 15/23
Evaluation
● Evaluation Data:
– Clinical data and DICOM imaging collections of TCIA.
● Benchmark Óbidos against eager and lazy ETL.
– Performance of loading and querying data.
● Óbidos (inter- and intra-organization) against binary data sharing.
– Space/bandwidth efficiency of data sharing.
17. 17/23
Data load time
Change in total data volume (Same query and same interest)
● Observation:
– Load time increases for eager and lazy ETL with total volume.
– Load time for Óbidos remains constant.
● Total volume of data is irrelevant for Óbidos.
18. 18/23
Data load time
Change in studies of interest (Same query and constant total data volume)
● Observation:
– Load time for eager and lazy ETL remains constant.
– Load time increases for Óbidos with the interest.
● Converges to the load time of lazy ETL.
19. 19/23
Query completion time for the integrated data repository
● Observation:
– We assume the corresponding data is already loaded.
● Thus, lazy and eager ETL perform similarly.
– Indexed scalable NoSQL architecture of Óbidos → Better performance.
20. 20/23
Efficiency in Sharing Medical Research Data
● Observation:
– Intra-organization, a constant-size UID is sufficient.
– Inter-organization, Óbidos pointers grow with the number of series.
– Traditional binary data sharing: shared data size = volume of the image series.
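The three sharing modes compared on this slide can be put side by side with a small Python calculation. The byte sizes for a UID and a pointer, the one-pointer-per-series assumption, and the series volumes are all illustrative assumptions, not measurements from the evaluation.

```python
def shared_bytes(series_sizes, mode, uid_bytes=36, pointer_bytes=64):
    """Bytes sent when sharing a set of image series.

    series_sizes: size in bytes of each image series (made-up values).
    uid_bytes / pointer_bytes: assumed sizes, not measured values.
    """
    if mode == "intra":   # one constant-size UID names the whole data set
        return uid_bytes
    if mode == "inter":   # assumed: one pointer per series
        return pointer_bytes * len(series_sizes)
    if mode == "binary":  # the image data itself is transferred
        return sum(series_sizes)
    raise ValueError(f"unknown mode: {mode}")


sizes = [200_000_000, 350_000_000, 120_000_000]  # three made-up series
for mode in ("intra", "inter", "binary"):
    print(mode, shared_bytes(sizes, mode))
```

Under these assumptions, the intra-organization cost stays fixed, the inter-organization cost depends only on how many series are shared, and only binary sharing scales with the image volume.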
21. 21/23
Conclusion
● Óbidos offers on-demand service-based big data integration.
– Fast and resource-efficient data analysis.
– SQL queries over NoSQL data store for the integrated data.
– Efficient data sharing without duplicating actual data.
● Future Work
– Consume data from repositories of domains beyond medical data.
● EUDAT
– Óbidos distributed virtual data warehouses.
● Leverage the proximity of the organizations in data integration and sharing.
22. 22/23
Acknowledgements
● NCI QIN grant (1U01CA187013, Resources for Development and Validation Of Radiomic Analyses and Adaptive Therapy).
● Google Summer of Code (2014, 2015, and 2016).
● The Cancer Imaging Archive (TCIA).
● Tyk and API Umbrella Teams.
23. 23/23
Thank you!
Questions?