CDSS

A Seminar On
Cloud Deployable Health Data Mining Using
Secured Framework For Clinical Decision Support
System
Guided By: Prof. B. R. Bombade
Presented By: Hanwate Avinash M. (2016MNS013)

CONTENTS
 INTRODUCTION
 PREVIOUS RESEARCH
 METHODS
 RESULTS
 PROPOSED SYSTEM
 CONCLUSION
 REFERENCES

INTRODUCTION
 A reliable, scalable, and secure framework is designed.
 Components of Apache Hadoop are used to process the big data on
which predictions are based.
 Hadoop clusters are deployed on Google Cloud Storage (GCS).
 A MapReduce-based classification-via-clustering method is proposed
for efficient classification of instances using reduced attributes.

What is CDSS?
 Clinical Decision Support Systems (CDSS) are developed for the early
detection of heart diseases.
 Medical errors are reduced by using clinical practice guidelines (CPGs).
 Five rights are considered for implementing a successful CDSS: Right
Information, Right People, Right Channel, Right Intervention
Format, and Right Time.
 There are two types of CDSS: knowledge-based CDSS and
non-knowledge-based CDSS.

 Interoperability is the capability of different systems to exchange
patient information with each other in order to provide
effective treatment.
Fig. Working of CDSS

Cloud Based CDSS
 Cloud computing platforms provide high-performance computing.
 Flexible, scalable, and efficient data storage services are offered by
the cloud.
 Confidential patient records stored in the cloud must comply with
HITECH standards.

Previous Research
 Previous work serves as motivation for the present research.
 A knowledge-based CDSS was designed using the C4.5 decision tree
algorithm, and diseases were predicted with 61.0734% accuracy.
 Classification and clustering methods were used to analyze big data on
the Apache Hadoop platform.
 A CDSS managed on a private cloud was used to store patient records
securely.
 The security of records stored on SQL Server was ensured using digital
signatures.

Methods
 Classification and classification-via-clustering approaches are used to
predict heart diseases.
 The 14 attributes are reduced to 3 (cp, ca, thal).
 The steps followed to analyze the datasets are:
Fig. Stages for Analysis of data
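
The attribute reduction can be illustrated with a minimal sketch. Only cp, ca, and thal are named in the slides; the other keys, the class label name `num`, and all values below are made-up placeholders, not data from the seminar.

```python
# Hypothetical records: only cp, ca, and thal come from the slides.
# Every other key and every value here is an illustrative placeholder.
KEPT_ATTRIBUTES = ("cp", "ca", "thal")
CLASS_ATTRIBUTE = "num"  # assumed name of the class label


def reduce_attributes(record, kept=KEPT_ATTRIBUTES, label=CLASS_ATTRIBUTE):
    """Return a copy of the record with only the kept attributes and the label."""
    return {name: record[name] for name in (*kept, label)}


records = [
    {"age": 63, "sex": 1, "cp": 1, "ca": 0, "thal": 6, "num": 0},
    {"age": 67, "sex": 1, "cp": 4, "ca": 3, "thal": 3, "num": 1},
]
reduced = [reduce_attributes(r) for r in records]
```

Each reduced record keeps the three predictive attributes plus the class label, which is the form a classifier would take as input.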

Framework used
 Datasets are analyzed on the WEKA and Hadoop platforms.
 On the WEKA platform, .arff files are loaded into the WEKA Explorer.
 On the Hadoop platform, HDFS and MapReduce are used.
 HDFS Explorer is used instead of the command-line prompt for storing
files.
 Map and reduce functions are implemented in Mappers and
Reducers to perform computation over big data efficiently.
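
How a Mapper and a Reducer divide the work can be sketched locally in plain Python, without Hadoop. This is an illustration only: the comma-separated input layout "cp,ca,thal,label", the function names, and the sample lines are assumptions, and `run_job` merely stands in for Hadoop's map, shuffle, and reduce phases.

```python
from collections import defaultdict

ATTRIBUTES = ("cp", "ca", "thal")


def mapper(index, value):
    """Emit ((attribute, attribute_value, class_label), 1) for one instance.

    `value` is assumed to be a comma-separated line "cp,ca,thal,label".
    """
    *attrs, label = value.split(",")
    for name, attr_value in zip(ATTRIBUTES, attrs):
        yield (name, attr_value, label), 1


def reducer(key, counts):
    """Sum the counts emitted for one key."""
    return key, sum(counts)


def run_job(lines):
    """Local stand-in for Hadoop: map, group by key (shuffle), then reduce."""
    grouped = defaultdict(list)
    for index, value in enumerate(lines):
        for key, count in mapper(index, value):
            grouped[key].append(count)
    return dict(reducer(k, v) for k, v in grouped.items())


lines = ["4,3,7,yes", "1,0,3,no", "4,0,7,yes"]  # made-up instances
counts = run_job(lines)
```

The resulting counts per (attribute, value, class) are the kind of statistics a decision tree learner needs when it evaluates candidate splits.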

Proposed Algorithm
 Using this MapReduce-based algorithm, a decision tree is generated
based on the entropy and gain ratio calculated for each attribute.
 All instances of the input files are split into index and value.
 Output files are generated for every node as intermediate files.
 Intermediate files are merged using a combiner, and a single output
rule file for the decision tree is generated.
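
The entropy and gain ratio calculations can be sketched as follows. This uses the standard C4.5 definitions (gain ratio = information gain / split information) on a single machine; it is not the seminar's MapReduce implementation, and the function names, the `label` key, and the sample rows are assumptions.

```python
import math
from collections import Counter, defaultdict


def entropy(values):
    """Shannon entropy (base 2) of a list of discrete values."""
    total = len(values)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(values).values())


def gain_ratio(rows, attribute, label="label"):
    """C4.5 gain ratio of `attribute` over `rows` (a list of dicts)."""
    labels = [row[label] for row in rows]
    # Partition the class labels by the value the attribute takes.
    partitions = defaultdict(list)
    for row in rows:
        partitions[row[attribute]].append(row[label])
    total = len(rows)
    remainder = sum(len(p) / total * entropy(p) for p in partitions.values())
    gain = entropy(labels) - remainder
    # Split information is the entropy of the attribute's own value distribution.
    split_info = entropy([row[attribute] for row in rows])
    return gain / split_info if split_info > 0 else 0.0


rows = [  # made-up instances
    {"cp": "4", "thal": "7", "label": "yes"},
    {"cp": "4", "thal": "3", "label": "yes"},
    {"cp": "1", "thal": "3", "label": "no"},
    {"cp": "1", "thal": "7", "label": "no"},
]
best = max(("cp", "thal"), key=lambda a: gain_ratio(rows, a))
```

In this toy data cp separates the classes perfectly while thal does not, so cp has the higher gain ratio and would be chosen as the split attribute.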

Results
 Classification using WEKA:
 In the classification approach, decision trees are generated by
the J48 classifier using the training-set method.
 In the classification-via-clustering approach, instances clustered
using k-means are used as input to the J48 classifier.

 Classification using MapReduce:
 Classification on the reduced attributes (cp, ca, thal) is done using
the MapReduce-based C4.5 decision tree algorithm.
 The results of classification are compared on the basis of the
execution time required to generate the decision tree.
 Comparing the WEKA and MapReduce-based decision trees shows
that the MapReduce-based decision trees perform better.

Hadoop clusters on cloud
 Hadoop clusters are deployed on Google Cloud Storage (GCS) using
Google Compute Engine (GCE), with the help of the GCS connector.
 MapReduce jobs are run on GCE using the created virtual machine
instances.
 For faster processing, input and output files are stored in a GCS
bucket.
Fig. Mapreduce processing using GCE

Proposed System
 In the proposed cloud-based framework, the security of data stored in
GCS buckets is enhanced by restricting access to the stored data.
 Patient records are stored in encrypted form in .SEQ files.
 These files can be further encrypted with the 256-bit AES scheme using
the NppCrypt plugin of Notepad++.

Conclusion
 A framework is designed for mining big data to predict heart diseases
accurately with reduced attributes.
 Inference rules for building a knowledge-based CDSS are generated by
traversing the nodes of the accurate MapReduce-based decision trees.
 These trees are generated by the classification-via-clustering approach
and are more accurate than WEKA’s decision trees.
 A combiner is added between the Mapper and the Reducer to improve
the performance of the C4.5 decision tree algorithm.
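
Generating inference rules by traversing tree nodes can be sketched as a depth-first walk that emits one IF-THEN rule per leaf. The tree layout (a leaf is a class label string, an internal node is an `(attribute, branches)` tuple), the rule wording, and the sample tree are all assumptions for illustration, not the seminar's actual trees or rule format.

```python
def extract_rules(node, conditions=()):
    """Walk a decision tree depth-first and yield one IF-THEN rule per leaf.

    Assumed layout: a leaf is a class label (str); an internal node is a
    tuple (attribute, {attribute_value: subtree}).
    """
    if isinstance(node, str):
        premise = " AND ".join(f"{a} = {v}" for a, v in conditions) or "TRUE"
        yield f"IF {premise} THEN class = {node}"
        return
    attribute, branches = node
    for value, subtree in branches.items():
        yield from extract_rules(subtree, conditions + ((attribute, value),))


# Made-up tree over two of the reduced attributes.
tree = ("thal", {
    "3": "no",
    "7": ("cp", {"4": "yes", "1": "no"}),
})
rules = list(extract_rules(tree))
```

Each path from the root to a leaf becomes one rule, so the number of rules equals the number of leaves in the tree.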

References
[1] Ahmed, A. & Hannan, S. A. (Sept. 2012). “Data Mining Techniques
to Find Out Heart Diseases: An Overview”. IJITEE, Vol. 1(4), pp.
18-23.
[2] AL-Gamdi, A. A. et al. (May 2014). “Clinical Decision Support
System in HealthCare Industry Success and Risk Factors”. IJCTT,
Vol. 11(4), pp. 188-192.
[3] Ayma, V. A. et al. (Mar. 2015). “Classification Algorithms for Big
Data Analysis, a Map Reduce Approach”. The International Archives
of the Photogrammetry, Remote Sensing and Spatial Information
Sciences (ISPRS), Vol. XL-3/W2, pp. 17-25.
[4] Kamalraj, N. & Malathi, A. (Nov. 2014). “Hadoop Operations
Management for Big Data Clusters in Telecommunication Industry”.
International Journal of Computer Applications, Vol. 105(12),
pp. 40-44.
