A technical seminar on
Submitted in fulfillment of the 1st semester examination of the M.Tech degree in
Bioinformatics
at
R.V. College of Engineering
Bangalore
Under the guidance of
Dr. Vidya Niranjan
By,
Amshumala S
1st year M.Tech Bioinformatics
R.V.C.E
16 December 2016 HEALTHCARE INFORMATICS, RVCE
1. INTRODUCTION
2. ADVANTAGES AND DISCIPLINES INVOLVED
3. BIG DATA IN HEALTHCARE
4. CHALLENGES
5. SOURCES OF DATA
6. BIG DATA ARCHITECTURE
7. TOOLS AND SOFTWARE IN BIG DATA
8. CASE STUDY
• HEALTHCARE INFORMATICS is a multidisciplinary field that uses health information
technology (HIT) to improve health care via any combination of higher quality, higher
efficiency (spurring lower cost and thus greater availability), and new opportunities.
• Health information technology (HIT) is information technology applied
to health and health care. It supports health information management across
computerized systems and the secure exchange of health information
between consumers, providers, payers, and quality monitors.
• It is the interdisciplinary study of the design, development, adoption, and application
of IT-based innovations in healthcare services delivery, management, and planning.
• Informatics refers to the science of information, the practice of information
processing, and the engineering of information systems. Informatics underlies the
academic investigation and practitioner application of computing and
communications technology to healthcare, health education, and biomedical
research.
Main advantages of HIT:
• Improve health care quality or effectiveness
• Increase health care productivity or efficiency
• Prevent medical errors and increase health care accuracy and procedural correctness
• Reduce health care costs
• Increase administrative efficiency and streamline healthcare work processes
• Decrease paperwork and unproductive or idle work time
• Extend real-time communication of health information among health care professionals
• Expand access to affordable care
Public health benefits:
• Early detection of infectious disease outbreaks around the country;
• Improved tracking of chronic disease management; and
• Evaluation of health care based on value enabled by the collection of de-identified price and
quality information that can be compared.
Disciplines involved
• Information Technology
• Behavioral Science
• Social Science
• Computer Science
• Biology / Life Sciences
• Management Science
• Medical Sciences
• Clinical BT / Pharmaceuticals
HEALTHCARE INFORMATICS (diagram labels): CLINICAL, PATHOLOGICAL, PUBLIC HEALTH,
PHARMACEUTICAL, COMMUNITY HEALTH, NURSING
Brief Introduction to Big Data Analytics
Reasons for growing complexity in healthcare data
• Standard medical practice is moving from relatively ad hoc and subjective decision making to
evidence-based healthcare.
• More incentives for professionals/hospitals to use EHR technology.
Additional data sources
• Development of new technologies such as capturing devices, sensors, and mobile applications.
• Collection of genomic information has become cheaper.
• Patient social communications in digital form are increasing.
• More medical knowledge/discoveries are being accumulated.
Big data challenges
• Inferring knowledge from complex, heterogeneous patient sources and leveraging the
patient/data correlations in longitudinal records.
• Understanding unstructured clinical notes in the right context.
• Efficiently handling large volumes of medical imaging data and extracting potentially
useful information and biomarkers.
• Analyzing genomic data, which is computationally intensive; combining it with
standard clinical data adds further layers of complexity.
• Capturing the patient’s behavioral data through several sensors, along with their various
social interactions and communications.
[Figure not reproduced. Slide label: Sun Pharma*]
Conceptual architecture of big data analytics (figure not reproduced)
Source: Raghupathi, Wullianallur, and Viju Raghupathi. “Big Data Analytics in
Healthcare: Promise and Potential.” Health Information Science and
Systems 2 (2014): 3. PMC. Web. 23 Apr. 2016.
Sources and data types include:
1. Web and social media data: clickstream and interaction data from Facebook, Twitter,
LinkedIn, blogs, and the like. It can also include health plan websites, smartphone apps, etc.
2. Machine-to-machine data: readings from remote sensors, meters, and other vital sign
devices.
3. Big transaction data: health care claims and other billing records, increasingly available in
semi-structured and unstructured formats.
4. Biometric data: fingerprints, genetics, handwriting, retinal scans, X-ray and other medical
images, blood pressure, pulse and pulse-oximetry readings, and other similar types of data.
5. Human-generated data: unstructured and semi-structured data such as EMRs, physicians’
notes, email, and paper documents.
Platforms & tools for big data analytics in healthcare
Hadoop Distributed File System (HDFS): HDFS provides the underlying
storage for the Hadoop cluster. It divides the data into smaller parts and
distributes them across the various servers/nodes.
MapReduce: MapReduce provides the interface for distributing sub-tasks
and gathering their outputs. While tasks are executing, MapReduce tracks the
processing on each server/node.
Pig and Pig Latin: The Pig programming language is designed to handle all types
of data (structured, unstructured, etc.). It comprises two key modules: the
language itself, called Pig Latin, and the runtime environment in which Pig Latin
code is executed.
Hive: Hive is a runtime Hadoop support architecture that brings Structured
Query Language (SQL) to the Hadoop platform. It lets SQL programmers
write Hive Query Language (HQL) statements akin to typical SQL statements.
Jaql: Jaql is a functional, declarative query language designed to process large
data sets. To facilitate parallel processing, Jaql converts “high-level” queries into
“low-level” queries consisting of MapReduce tasks.
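The HDFS and MapReduce descriptions above can be illustrated with a small, single-machine sketch in plain Python. It is not the Hadoop API: the function names, the block size, and the sample records are all hypothetical and chosen only for illustration. The input is split into parts (as HDFS would do across nodes), a map step emits key/value pairs per part, and a reduce step gathers the outputs.

```python
from collections import defaultdict
from itertools import chain


def split_into_blocks(records, block_size):
    """Mimic HDFS-style splitting: cut the input into fixed-size parts."""
    return [records[i:i + block_size] for i in range(0, len(records), block_size)]


def map_phase(block):
    """Map step: emit a (diagnosis, 1) pair for every record in one block."""
    return [(record["diagnosis"], 1) for record in block]


def shuffle(mapped_pairs):
    """Group emitted values by key, as happens between map and reduce."""
    grouped = defaultdict(list)
    for key, value in mapped_pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Reduce step: sum the counts collected for each key."""
    return {key: sum(values) for key, values in grouped.items()}


def count_diagnoses(records, block_size=2):
    blocks = split_into_blocks(records, block_size)
    # On a real cluster each block would be mapped on a different node;
    # here the map calls simply run one after another.
    mapped = chain.from_iterable(map_phase(block) for block in blocks)
    return reduce_phase(shuffle(mapped))


# Hypothetical, made-up records used purely for illustration.
SAMPLE_RECORDS = [
    {"patient": "p1", "diagnosis": "diabetes"},
    {"patient": "p2", "diagnosis": "asthma"},
    {"patient": "p3", "diagnosis": "diabetes"},
    {"patient": "p4", "diagnosis": "hypertension"},
    {"patient": "p5", "diagnosis": "diabetes"},
]

if __name__ == "__main__":
    print(count_diagnoses(SAMPLE_RECORDS))
```

Because the map step looks at one block at a time and the reduce step only needs the grouped pairs, the same logic can in principle be spread over many servers, which is the property the tools above rely on.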
ZooKeeper: ZooKeeper provides a centralized infrastructure with various services
for synchronization across a cluster of servers. Big data analytics applications
use these services to coordinate parallel processing across big clusters.
HBase: HBase is a column-oriented database management system that sits on top of HDFS. It
uses a non-SQL approach.
Cassandra: Cassandra is also a distributed database system. It is a top-level Apache
project designed to handle big data distributed across many utility servers. It is a NoSQL
system that provides reliable service with no single point of failure.
Oozie: Oozie, an open source project, streamlines the workflow and coordination among
tasks.
Lucene: The Lucene project is widely used for text analytics/searches and has been
incorporated into several open source projects. Its scope includes full-text indexing
and library search for use within a Java application.
Avro: Avro provides data serialization services. Versioning and version control are
additional useful features.
Mahout: Mahout is yet another Apache project whose goal is to generate free applications
of distributed and scalable machine learning algorithms that support big data analytics on the
Hadoop platform.
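The full-text indexing mentioned for Lucene above can be illustrated with a toy inverted index. The sketch below is plain Python and does not use or imitate the Lucene API; the function names and the sample clinical notes are hypothetical. It only shows the underlying idea: map each term to the documents containing it, then answer a query by intersecting those sets.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(documents):
    """Map each term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index


def search(index, query):
    """Return ids of documents that contain every term in the query."""
    terms = tokenize(query)
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


# Hypothetical, made-up notes used purely for illustration.
NOTES = {
    "note1": "Patient reports chest pain and shortness of breath.",
    "note2": "No chest pain. Mild cough for three days.",
    "note3": "Follow-up for asthma; shortness of breath improved.",
}

if __name__ == "__main__":
    index = build_index(NOTES)
    print(sorted(search(index, "chest pain")))
```

A query for "chest pain" here also matches note2, which negates the symptom. A plain term index has no notion of context, so interpreting clinical notes correctly needs more than keyword search.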
Outline of big data analytics in healthcare methodology
STEP 1: Concept statement
• Establish the need for a big data analytics project in healthcare based on the “4Vs”
(volume, velocity, variety, veracity).
STEP 2: Proposal
• What is the problem being addressed?
• Why is it important and interesting?
• Why a big data analytics approach?
• Background material
STEP 3: Methodology
• Propositions
• Variable selection
• Data collection
• ETL and data transformation
• Platform/tool selection
• Conceptual model
• Analytic techniques: association, clustering, classification, etc.
• Results & insight
STEP 4: Deployment
• Evaluation & validation
• Testing
EXAMPLE
The Effectiveness of Mobile-Health Technology-Based Health Behaviour Change or
Disease Management Interventions for Health Care Consumers: A Systematic Review
(Free et al., 2013)
Background: Mobile technologies could be a powerful medium for providing individual-level
support to health care consumers. The authors conducted a systematic review to assess the
effectiveness of mobile technology interventions delivered to health care consumers.
Methods and Findings:
The authors searched for all controlled trials of mobile technology-based health interventions
delivered to health care consumers using MEDLINE, EMBASE, PsycINFO, Global Health, Web of
Science, Cochrane Library, and UK NHS HTA (Jan 1990–Sept 2010).
Two authors extracted data on allocation concealment, allocation sequence, blinding,
completeness of follow-up, and measures of effect. Effect estimates were calculated and
random effects meta-analysis was used. The review identified 75 trials.
Fifty-nine trials investigated the use of mobile technologies to improve disease management and
26 trials investigated their use to change health behaviors.
Nearly all trials were conducted in high-income countries. Four trials had a low risk of
bias.
Two trials of disease management had low risk of bias; in one, on antiretroviral therapy (ART)
adherence, the use of text messages reduced high viral load (>400 copies), with a relative risk
(RR) of 0.85 (95% CI 0.72–0.99), but showed no statistically significant benefit on mortality
(RR 0.79 [95% CI 0.47–1.32]).
In the second, a PDA-based intervention increased scores for perceived self-care agency
in lung transplant patients. Two trials of health behaviour management had low risk of
bias.
The pooled effect of text messaging smoking cessation support on biochemically
verified smoking cessation was RR 2.16 (95% CI 1.77–2.62).
Interventions for other conditions showed suggestive benefits in some cases, but the
results were not consistent.
No evidence of publication bias was demonstrated on visual or statistical examination
of the funnel plots for either disease management or health behaviors.
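The relative risks and 95% confidence intervals quoted in this example can be read with the help of a short sketch. The code below is not the review's method (the review pooled trials with random effects meta-analysis). It is a minimal single-trial calculation using the standard log-scale normal approximation, and the event counts passed to it are hypothetical. The helper `excludes_no_effect` simply checks whether an interval contains 1, which is why the viral load result (0.72–0.99) counts as statistically significant while the mortality result (0.47–1.32) does not.

```python
import math


def relative_risk(events_treated, n_treated, events_control, n_control, z=1.96):
    """Return (RR, lower, upper) using the log-scale normal approximation."""
    risk_treated = events_treated / n_treated
    risk_control = events_control / n_control
    rr = risk_treated / risk_control
    # Standard error of ln(RR) for two independent groups.
    se_log_rr = math.sqrt(
        1 / events_treated - 1 / n_treated + 1 / events_control - 1 / n_control
    )
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper


def excludes_no_effect(lower, upper):
    """True when the interval does not contain RR = 1 (no effect)."""
    return not (lower <= 1.0 <= upper)


if __name__ == "__main__":
    # Hypothetical counts, not taken from any trial in the review.
    rr, lower, upper = relative_risk(30, 100, 40, 100)
    print(f"RR {rr:.2f} (95% CI {lower:.2f}-{upper:.2f})")

    # Intervals quoted in the example above.
    print(excludes_no_effect(0.72, 0.99))  # viral load
    print(excludes_no_effect(0.47, 1.32))  # mortality
    print(excludes_no_effect(1.77, 2.62))  # smoking cessation
```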
REFERENCES
1. Raghupathi, Wullianallur, and Viju Raghupathi. “Big Data Analytics in Healthcare: Promise
and Potential.” Health Information Science and Systems 2 (2014): 3. PMC. Web. 23 Apr. 2016.
2. LaValle S, Lesser E, Shockley R, Hopkins MS, Kruschwitz N: Big data, analytics and the path
from insights to value. MIT Sloan Manag Rev 2011, 52:20–32.
3. IHTT: Transforming Health Care through Big Data: Strategies for leveraging big data in the
health care industry; 2013.
http://ihealthtran.com/wordpress/2013/03/iht%C2%B2-releases-big-data-research-reportdownload-today/.
4. Free C, Phillips G, Galli L, Watson L, Felix L, Edwards P, et al. (2013) The Effectiveness of
Mobile-Health Technology-Based Health Behaviour Change or Disease Management
Interventions for Health Care Consumers: A Systematic Review. PLoS Med 10(1): e1001362.
doi:10.1371/journal.pmed.1001362. January 15, 2013
