Table Of Contents
Abstract Title: MMIS - Big Data Integration
Author: Rajasekaran Kandhasamy
Overview of Paper:
Introduction:
Note:
Functional Specification:
Primary Use Cases:
1) Unified File Management (UFM) - Heterogeneous Storage Solutions (HSS) [or] Cloud Data Center (CDC):
2) Unified Claims Archiving (UAC)
3) Extract, Transform and Load (ETL) Integration
4) Process Large Audit or Log Files
Secondary Use Cases:
5) Near - Continuous data protection (CDP) or Backup & Recovery
Value to Payers:
Technical Specification:
High Level Architecture:
1) Unified File Management (UFM) - Heterogeneous Storage Solutions (HSS) [or] Cloud Data Center (CDC) Flow:
2) Unified Claims Archiving (UAC):
HBase Claim Archival Sample Data Model:
3) Extract, Transform and Load (ETL) Integration:
Proposed system:
Design Option 1:
Design Option 2:
4) Process Large Audit or Log Files:
Security Audit
Application Audit
5) Near - Continuous data protection (CDP) Backup & Recovery:
Proposed system:
Backup:
Recovery:
Hive Claim Backup Sample Data Model:
Abstract Title: MMIS - Big Data Integration
Author : Rajasekaran Kandhasamy (krajasekaranmca@gmail.com)
Overview of Paper:
Introduction: MMIS/HealthCare Payer applications depend on traditional database models and
structured data analytics to fulfill their needs. These approaches, while adequate in the past, will not
suffice to address future requirements. They lack the processing capability to load and query
multi-terabyte datasets in a timely fashion and the flexibility to effectively manage unstructured and
semi-structured data. Adapting a Big Data platform to MMIS applications will resolve the above issues.
This technical paper provides details about integrating MMIS/HealthCare Payer
applications with a Hadoop-based Big Data platform.
Note: MMIS is a large application, so this paper covers only use cases related to claims.
The proposal is not a replacement for the OLTP database approach; it is an idea of what benefits we can
get if we integrate with Hadoop technologies. A non-MMIS application also covered here is MMIS
BI/BIRT/JASPERSOFT, a business intelligence analytical tool with chart and report capabilities:
simply, an open source BI or reporting tool to manage big data activities.
Functional Specification:
Primary Use Cases:
1) Unified File Management (UFM) - Heterogeneous Storage Solutions (HSS) [or]
Cloud Data Center (CDC): a multi-source data collection/management platform that
delivers backup, archive, search, and analytics capabilities to Medicaid or HealthCare
Payer applications. It introduces cloud-based options for data that must be kept for
extremely long periods of time. Simply put, it consolidates MMIS file transfer processes under
one managed solution.
CDC provides connectivity for the flow of data, in the form of files, between providers, state
agencies, switch vendors and the Enterprise System.
Within the context of MMIS, a CDC/HSS describes an interaction
between an external entity, system or service agency (e.g. Switch Vendor, Provider Agency)
and the MMIS/Payer applications. This interaction could involve transferring and
storing data. Most of these external interfaces will be file-based inbound and outbound
interfaces whose files arrive in batches. External systems will exchange data with the
MMIS in different formats. Each data file may contain one or more records. The possible
file formats are:
a. X12 files
b. XML files
c. Flat files (Delimited and Fixed width data, comma separated values)
d. Binary data
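As a rough illustration of how an inbound landing zone might tell these formats apart, the following minimal sketch classifies a file from its leading bytes. The function name and the heuristics are assumptions made for this example and are not part of the proposal; the only format fact relied upon is that an X12 interchange begins with an ISA segment.

```python
def classify_file_format(payload: bytes) -> str:
    """Return a coarse label: 'x12', 'xml', 'delimited', 'fixed-width' or 'binary'.

    Hypothetical helper: the heuristics are deliberately simple and would need
    hardening before real use.
    """
    try:
        text = payload.decode("utf-8")
    except UnicodeDecodeError:
        return "binary"              # not valid text at all
    if "\x00" in text:
        return "binary"              # NUL bytes are a strong hint of binary data
    stripped = text.lstrip()
    if stripped.startswith("ISA"):
        return "x12"                 # X12 interchanges start with an ISA segment
    if stripped.startswith("<"):
        return "xml"
    first_line = stripped.splitlines()[0] if stripped else ""
    if any(separator in first_line for separator in (",", "|", "\t")):
        return "delimited"
    return "fixed-width"


samples = {
    "claims.x12": b"ISA*00*          *00*          *ZZ*SENDER",
    "members.xml": b"<?xml version='1.0'?><members/>",
    "codes.csv": b"code,description\n99213,Office visit\n",
    "scan.bin": b"\xff\xfe\x00\x01",
    "codes.txt": b"99213OFFICE VISIT        \n",
}
labels = {name: classify_file_format(data) for name, data in samples.items()}
```

A classifier like this would only route files to the matching parser; validation of each format remains a separate step.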
Advantages:
• Although the majority of files in most payer applications are stored on IT-managed
file servers, they are not always under the direct control of IT. Here we maintain
them all under one umbrella.
• Reduce licensing, maintenance and support costs of file servers. Develop a no
vendor lock-in framework.
• Scalable storage (Hadoop) environment with CDC: a solution that doesn't
require a change in platforms or the retraining of IT employees and
administrators. CDC delivers comprehensive capabilities more efficiently than
ad hoc data/file management systems do, allowing an enterprise to dedicate
fewer resources to supporting infrastructure than to innovating, so it can
quickly bring its innovation to market.
• Reduce MMIS operational costs with respect to data storage, backup, archive
and maintenance.
• Minimize analytical latency in big-data-like environments.
• External systems can connect to UFM/CDC/HSS using any FTP/SFTP client or
REST-based Resource Oriented Architecture (ROA).
2) Unified Claims Archiving (UAC): Preserve information for compliance, legal, business
reference, or system optimization purposes. These are archiving capabilities that Medicaid
agencies may not believe they need now but that, given current archive market trends, will be
extremely useful to them in the near future. Manual and machine-generated data keeps
increasing, and file, message and database sizes keep growing. For a variety of reasons, a good
portion of this data needs to be archived. Once a claim case is closed, it becomes the
responsibility of the MMIS/HealthCare Payer to archive and manage the closed file in
compliance with regulations.
Advantages:
• Data associated with claims processing is a good candidate for data archival. If
a production table gets too large, there will be a distinct impact on
retrieval time. Most of the core system screens include limits on the number of
records the screen will retrieve and display. When tables are large, screens will
not display all of the applicable data and some screens will not function.
• Moving old data from MMIS OLTP to Hadoop-based HBase can increase MMIS
OLTP performance, because a large number of old, unused records in the claims
tables decreases performance.
• Historical information for comparative and competitive analysis.
• Enhanced data quality and completeness.
• Supplementing disaster recovery plans with another data backup source.
3) Extract, Transform and Load (ETL) Integration: Most ETL software packages require
their own servers, processing, databases, and licenses. They also require setup,
configuration, and development by experts in that particular tool, and those skills are not
always transferable.
Advantages:
• For instance, the MMIS reference sub-module periodically receives different sets of
procedure, diagnosis and other codes as files from CMS. These files are stored
in CDC, and Hadoop-based Pig/Hive is used to convert the file data into an
MMIS-understandable format in a less expensive manner.
Note: Most existing MMIS implementations use PL/SQL-based ETL to load data into the
MMIS DB. If an MMIS does not want to break its existing flow, it can use a CDC adapter to
get files from Hadoop instead of a traditional file server.
• Files in UFM/CDC can be stored inexpensively in the cloud and processed by Hive
for ETL. It is a cost-effective complement to data warehouse solutions, and
it reduces risk and cost and/or improves accessibility over in-house solutions. Once
data is processed and stored in Hive, it makes sense to consider the various
file formats available.
4) Process Large Audit or Log Files: Audits are historical and immutable. We can
segregate MMIS audits into two categories.
a) Security Audit: The MMIS application keeps logging user actions to an MMIS
file, and this file is moved to the Hadoop-based HBase NoSQL DB. Through the
MMIS BI application, a user can view who logged in, what actions they performed,
and so on.
b) Application Audit: Existing MMIS implementations have the following application audit options:
1) DB triggers.
2) Module-specific code that inserts an audit record for each user operation, e.g. error codes
view history.
The new proposal uses JMS or a queue: a scalable approach, if you really need it,
and one that is completely in line with the J2EE specification, is to use JMS.
That is, publish your audit log messages to a message queue, and let another,
separate process (Flume) take them off the queue and log them in the
Cloud Data Center (CDC) based HBase NoSQL database.
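The decoupling that the queue provides can be sketched with the Python standard library. This is only an illustration under stated assumptions: `queue.Queue` stands in for the JMS queue, the consumer thread stands in for the separate Flume process, and the `audit_store` list stands in for the HBase audit table. None of these names come from the proposal.

```python
import queue
import threading

audit_queue = queue.Queue()   # stand-in for the JMS queue (assumption)
audit_store = []              # stand-in for the HBase audit table (assumption)


def publish_audit(user: str, action: str) -> None:
    """Called by the application: enqueue the event and return immediately."""
    audit_queue.put({"user": user, "action": action})


def drain_audit_queue() -> None:
    """Separate consumer: takes events off the queue and persists them."""
    while True:
        event = audit_queue.get()
        if event is None:      # sentinel used here to stop the sketch cleanly
            break
        audit_store.append(event)


consumer = threading.Thread(target=drain_audit_queue)
consumer.start()

publish_audit("jdoe", "UPDATE claim status")
publish_audit("asmith", "VIEW error code history")

audit_queue.put(None)          # tell the consumer to finish
consumer.join()
```

The application thread never waits for the audit write itself, which is the property that makes the queue-based approach scale better than synchronous inserts from module code.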
Secondary Use Cases:
5) Near - Continuous data protection (CDP) or Backup & Recovery: Near-continuous
data protection (near CDP) is a general term for backup and recovery products that take
backup snapshots at set intervals. CDP technology protects data on a nearly continuous
basis. Rather than running a large monolithic backup overnight, CDP products back up
data every few minutes, 24 hours a day.
Advantages:
• N-CDP is a Hadoop-based backup solution that efficiently and cost-effectively
protects business-critical healthcare data such as databases and files.
• Hadoop's built-in replication features help enable near-instant recovery from
failures.
• By providing continuous and periodic protection, N-CDP allows organizations to
enhance or eliminate their tape-backup infrastructures, minimizing software
license and maintenance fees as well as hardware and tape costs.
• Recovery Point Objective (RPO) refers to the point in time in the past to which
you will recover.
• Recovery Time Objective (RTO) refers to the point in time in the future at which
you will be up and running again.
Difference between CDP and N-CDP:
• CDP backs up the data on every change to the data, whereas N-CDP takes backups at a
user-defined regular interval.
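Because N-CDP snapshots are taken at a fixed interval, the recovery point for a failure is the last snapshot before it, and the data at risk is whatever changed since then (at most one interval). The following small worked example illustrates this. The dates, the one-hour interval and the function name are assumptions chosen for illustration.

```python
from datetime import datetime, timedelta


def last_snapshot_before(failure: datetime, first_backup: datetime,
                         interval: timedelta) -> datetime:
    """Return the most recent scheduled snapshot at or before the failure time."""
    if failure < first_backup:
        raise ValueError("no snapshot exists before the failure")
    completed_intervals = (failure - first_backup) // interval
    return first_backup + completed_intervals * interval


first_backup = datetime(2024, 1, 1, 0, 0)    # hypothetical schedule start
failure_time = datetime(2024, 1, 1, 10, 47)  # hypothetical failure

recovery_point = last_snapshot_before(failure_time, first_backup, timedelta(hours=1))
data_loss_window = failure_time - recovery_point
```

With an hourly schedule the recovery point here is the 10:00 snapshot, so 47 minutes of changes would be lost. Shortening the interval narrows that window at the price of more frequent backup jobs.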
Value to Payers:
• Payers can proudly say they are in the cloud and big data market.
• The cloud-based data centers can be subscribed to by other parties, states or payers with agreed
SLAs, so no separate data center maintenance is required for each state or payer.
• Reduce licensing, maintenance and support costs. Go with a no vendor lock-in
framework. Wherever possible, avoid licensed software running alongside MMIS and go with
the proposed open source tools. For example:
a) Informatica - Use CDC based Hadoop ETL tools.
b) FTP Server - Use CDC.
c) COGNOS or other BI tools - Use the MMIS BI/BIRT/JASPERSOFT based open source
analytical tool.
d) Archive and backup tool - Use the proposed approach.
• Developing more operational and analytical use cases with this integration
will move Payers into the business intelligence tool market.
• A SaaS/multi-tenant enabled MMIS BI application can be used by several customers with low
infrastructure maintenance cost.
• Many more big data advantages.
Technical Specification:
High Level Architecture:
[Figure: High-level architecture for an example tenant (Maryland MMIS Tenant). Labels: Providers, Agencies, Others; SFTP over Hadoop; Inbound Landing Zone; Outbound Landing Zone; Data Mart (Claims/Reference/TPL, Member, Provider, Others); HBase/Hive; Flume; Pig/Sqoop; Oozie; Tools/YARN; EHR/Cognos/Others; MMIS BI/BIRT/JASPERSOFT DB; MMIS/HealthCare Payer DB.]
1) Unified File Management (UFM) - Heterogeneous Storage Solutions (HSS) [or] Cloud
Data Center (CDC) Flow:
Cloud Data Center:
i. External clients can upload files to their dedicated inbound directory through
FTP/SFTP.
ii. Here we use a customized SFTP server based on Apache MINA to support the Hadoop file
system.
iii. Once the files are placed, the MMIS listening queues pick up the files from HDFS and
start claim processing as per the claim flow diagram.
iv. MMIS BI should also provide a REST-enabled service to upload files.
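The idea of a dedicated inbound directory per external client, polled by a listener, can be sketched as follows. A local temporary directory stands in for HDFS, and the directory layout (`<tenant>/inbound/<client>`), the tenant name and the function names are assumptions made for this example only.

```python
import tempfile
from pathlib import Path


def inbound_dir(root: Path, tenant: str, client: str) -> Path:
    """Dedicated inbound landing directory for one external client of one tenant."""
    return root / tenant / "inbound" / client


def pending_files(root: Path, tenant: str) -> list[Path]:
    """All files waiting in any client's inbound directory for the tenant."""
    return sorted(p for p in (root / tenant / "inbound").glob("*/*") if p.is_file())


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)   # stand-in for the HDFS landing zone root (assumption)
    uploads = [("switch_vendor", "batch.xml"), ("provider_a", "claims_837.x12")]
    for client, file_name in uploads:
        target = inbound_dir(root, "maryland", client)
        target.mkdir(parents=True, exist_ok=True)
        (target / file_name).write_text("placeholder")

    # A listener would poll like this and hand each file to claim processing.
    pending_names = [path.name for path in pending_files(root, "maryland")]
```

Keeping one directory per client makes it straightforward to apply per-client permissions and to report file activity per tenant.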
MMIS BI Unified File Management:
i. UFM is one of the sub-modules of the MMIS BI application.
ii. Using SaaS MMIS BI, a user can view complete inbound and outbound file details
for a particular tenant from a single point of access.
iii. Different kinds of charts and metrics are used to monitor day-to-day file activities in CDC.
Note: The name Software as a Service (SaaS) MMIS BI indicates that the application is tenant
aware, so the same application services can be used by other subscribers.
[Figure: CDC flows (EDI Claim Flow, Paper Claim Flow, Reference File Loading). Labels: Emdeon/OCR; SFTP over Hadoop; EDI Claims; Paper Claims; Reference Files; Claims/Ref FTP/SFTP Clients; CDC - HDFS; Img Archival; PL/SQL; HIPAA Validation; Claims Loader; HIPAA Translation; Loading Process; MMIS DB; Claim OCR Data; SaaS - MMIS BI Unified File Management; File Monitoring; Ref File Read & Load.]
2) Unified Claims Archiving (UAC):
MMIS APP:
• Through the MMIS application, a user can perform different types of archival, as shown in the
diagram for this use case.
• Once archival is initiated, the Sqoop module is triggered. This Sqoop module loads the details from
the MMIS DB to the CDC-based HBase DB. The claim HBase data structure is described in the sample data model.
MMIS BI:
• Through the MMIS BI application, a user can view different types of claims-related charts. These
read data from the HBase NoSQL DB.
• One example is "Operational Metrics", where a user can view paid claims over a
certain period of time.
• The application also supports "Analytical Metrics", where a user can view a claim forecast for a
certain period of time.
[Figure: Unified Claims Archiving. Labels: MMIS DB; MMIS APP (Provider type based archival, Claim type based archival, Date wise archival, Claim status based archival, Quarterly archival, Yearly archival); Apache Sqoop; Cloud Data Center - Hadoop; HBase - NoSQL Database; Claims Data Mart (Archive Table, Backup Table, Other Tables); MMIS BI (Claims Archive Operational Metrics (PAST), Claims Archive Analytical Metrics (FUTURE)).]
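An operational metric such as "paid claims over a certain period" amounts to a filtered aggregation over archived claims. The sketch below shows one way it might be computed. The record layout (`status`, `paid_date`, `paid_amount`), the sample values and the function name are hypothetical, and plain Python dictionaries stand in for rows read from the archive table.

```python
from collections import defaultdict
from decimal import Decimal


def paid_amount_by_month(claims, start: str, end: str) -> dict:
    """Sum paid amounts per month (YYYY-MM) for PAID claims within [start, end].

    Dates are ISO strings (YYYY-MM-DD), which compare correctly as text.
    """
    totals = defaultdict(Decimal)
    for claim in claims:
        if claim["status"] == "PAID" and start <= claim["paid_date"] <= end:
            totals[claim["paid_date"][:7]] += Decimal(claim["paid_amount"])
    return dict(totals)


archived_claims = [  # hypothetical sample rows
    {"tcn": "T1", "status": "PAID", "paid_date": "2024-01-10", "paid_amount": "120.50"},
    {"tcn": "T2", "status": "PAID", "paid_date": "2024-01-25", "paid_amount": "79.50"},
    {"tcn": "T3", "status": "DENIED", "paid_date": "2024-01-26", "paid_amount": "0"},
    {"tcn": "T4", "status": "PAID", "paid_date": "2024-02-03", "paid_amount": "300.00"},
    {"tcn": "T5", "status": "PAID", "paid_date": "2024-04-01", "paid_amount": "10.00"},
]

monthly_paid = paid_amount_by_month(archived_claims, "2024-01-01", "2024-03-31")
```

`Decimal` is used instead of floating point so that currency totals do not pick up rounding error.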
HBase Claim Archival Sample Data Model:
• Around 50 - 80 tables are involved in the claim adjudication process. This
section depicts the mapping of the OLTP claim data model to an HBase-based NoSQL data model.
• HBase currently does not do well with anything above two or three column families, so in
this design we have one column family for all header-related tables and one for claim
line-related tables.
• In the diagram below, all header-related table entries go into "HEADER FAMILY" and line
item-related table entries go into "LINE FAMILY".
• <ChildTableName_RecordNumber_ColumnName> is the generic format for inserting the
values. This maps OLTP one-to-many relationships to NoSQL tables. E.g. OLTP claim
header cutback table entries go here as CUTBACK_1_QLFR = CUTBACK (table name),
1 (record number), QLFR (column name).
• Row key: <claimfiledate_claimtype_providername>
HBASE DATABASE
Table: CLM_ARC_TB
HEADER FAMILY: TCN, CUTBACK_1_QLFR, TPL_1_AMT
LINE FAMILY: ATTACHMENT_1_NAME, PRVDR_1_LCTN, PROCEDURE_1_CODE
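The mapping described above, from one-to-many OLTP child tables to flat column qualifiers in two column families, can be sketched in a few lines. Plain dictionaries stand in for an HBase row here. The function names and the header field names FILE_DATE, CLAIM_TYPE and PROVIDER_NAME are assumptions for illustration; only the qualifier format and the row key format come from the text.

```python
def build_row_key(claim_file_date: str, claim_type: str, provider_name: str) -> str:
    """Row key in the form <claimfiledate_claimtype_providername>."""
    return f"{claim_file_date}_{claim_type}_{provider_name}"


def flatten_children(table_name: str, records: list) -> dict:
    """Map child rows to <ChildTableName_RecordNumber_ColumnName> qualifiers."""
    columns = {}
    for record_number, record in enumerate(records, start=1):
        for column_name, value in record.items():
            columns[f"{table_name}_{record_number}_{column_name}"] = value
    return columns


def to_hbase_row(header: dict, header_children: dict, line_children: dict) -> dict:
    """Build one archive row with a HEADER family and a LINE family."""
    header_family = {"TCN": header["TCN"]}
    for table_name, records in header_children.items():
        header_family.update(flatten_children(table_name, records))
    line_family = {}
    for table_name, records in line_children.items():
        line_family.update(flatten_children(table_name, records))
    row_key = build_row_key(header["FILE_DATE"], header["CLAIM_TYPE"],
                            header["PROVIDER_NAME"])
    return {"row_key": row_key, "HEADER": header_family, "LINE": line_family}


row = to_hbase_row(
    header={"TCN": "T1001", "FILE_DATE": "20240115",
            "CLAIM_TYPE": "PROF", "PROVIDER_NAME": "ACME"},   # hypothetical values
    header_children={"CUTBACK": [{"QLFR": "A1"}, {"QLFR": "B2"}],
                     "TPL": [{"AMT": "25.00"}]},
    line_children={"PROCEDURE": [{"CODE": "99213"}]},
)
```

Because the record number is part of the qualifier, a second cutback row simply becomes CUTBACK_2_QLFR in the same HBase row, with no join needed at read time.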
3) Extract, Transform and Load (ETL) Integration: Taxonomy codes, HCPCS, Correct Coding
Initiative (CCI), Diagnosis Related Group (DRG) codes, Medicare Physician Fee Schedule (MPFS),
ICD‑10 and Clinical Lab Fee Schedule codes are a few of the interface reference files that the payer
receives from CMS/State/Others. All are claim reference codes used to adjudicate claims, and they
need to be updated periodically in the MMIS DB.
Proposed system:
Design Option 1:
• CMS/State/Others can place the reference files in CDC.
• MMIS DB procedures pick up the files from CDC and load the file content
into the MMIS DB.
Design Option 2:
• CMS/State/Others can place the reference files in CDC.
• An Apache Pig application is the ETL transaction model that describes how a
process extracts data from CDC, transforms it according to a rule set and
then loads it into Apache Hive.
• Apache Sqoop loads the details from Apache Hive into the MMIS DB.
[Figure: ETL integration. Labels: CMS/State/Others; Cloud Data Center - Hadoop; HDFS (EXTRACT); Apache Pig - ETL (TRANSFORM); HIVE: Reference Data Mart, Reference Table (LOAD); Apache Sqoop; MMIS DB.]
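The extract, transform and load stages of Design Option 2 can be illustrated with a small stand-alone sketch. It uses only the Python standard library in place of Pig, Hive and Sqoop. The pipe-delimited layout, the column names and the transformation rules are all assumptions made for this example, not the format of any actual CMS file.

```python
import csv
import io

# Hypothetical pipe-delimited reference file as it might land in CDC.
RAW_REFERENCE_FILE = (
    "code|description|effective_date\n"
    " 99213 |office visit|01/15/2024\n"
    "99214|Office Visit, Detailed|02/01/2024\n"
)


def extract(raw_text: str) -> list:
    """Extract: parse the delimited file into one dict per record."""
    return list(csv.DictReader(io.StringIO(raw_text), delimiter="|"))


def transform(rows: list) -> list:
    """Transform: trim codes, upper-case descriptions, convert dates to ISO."""
    cleaned = []
    for row in rows:
        month, day, year = row["effective_date"].split("/")
        cleaned.append({
            "code": row["code"].strip(),
            "description": row["description"].strip().upper(),
            "effective_date": f"{year}-{month}-{day}",
        })
    return cleaned


def load(rows: list, reference_table: dict) -> dict:
    """Load: upsert each record into the target table, keyed by code."""
    for row in rows:
        reference_table[row["code"]] = row
    return reference_table


reference_table = load(transform(extract(RAW_REFERENCE_FILE)), {})
```

Keeping the three stages as separate functions mirrors the separation between the Pig script (extract and transform) and the Sqoop job (load) in the proposed design.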
4) Process Large Audit or Log Files:
• Whenever a user logs into the system, MMIS starts capturing the user's page actions in a
file.
• Apache Flume listens to this file, and whenever a row is added the information is moved to
the HBase DB.
• The user can use the BI tool to view the details in the allowed formats.
[Figure: Security Audit. Labels: MMIS APP; Security Log File (All User Actions); Apache Flume (Live Streaming); Cloud Data Center - Hadoop; HBASE: SECURITY_AUDIT Table; MMIS BI (live data showing logged-in user actions).]
[Figure: Application Audit. Labels: MMIS APP; JMS Queue (Application audit, All User Modifications); Apache Flume; Cloud Data Center - Hadoop; HBASE: APPLICATION_AUDIT Table; MMIS BI (live data showing logged-in user modifications).]
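The security audit flow depends on a process that notices rows appended to the log file and forwards only the new ones. The sketch below shows that polling idea in plain Python. It is a stand-in for what a Flume agent would do, not Flume configuration, and the function name, file name and `user|action` line layout are assumptions.

```python
import os
import tempfile


def read_new_lines(path: str, offset: int):
    """Return (complete lines appended since byte offset, new offset)."""
    with open(path, "rb") as handle:
        handle.seek(offset)
        chunk = handle.read()
    # Forward only complete lines; a partial trailing line waits for the next poll.
    end = chunk.rfind(b"\n") + 1
    lines = chunk[:end].decode("utf-8").splitlines()
    return lines, offset + end


with tempfile.TemporaryDirectory() as tmp:
    log_path = os.path.join(tmp, "security_audit.log")   # hypothetical file name

    with open(log_path, "w", encoding="utf-8") as log:
        log.write("jdoe|LOGIN\n")
    first_batch, offset = read_new_lines(log_path, 0)

    with open(log_path, "a", encoding="utf-8") as log:
        log.write("jdoe|VIEW_CLAIM\nasmith|LOG")          # last row still being written
    second_batch, offset = read_new_lines(log_path, offset)
```

Tracking the byte offset between polls is what prevents rows from being forwarded twice, and holding back the incomplete last row avoids storing a truncated audit record.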
5) Near - Continuous data protection (CDP) Backup & Recovery:
Proposed system:
Backup:
Design Option 1: One-time full load and subsequent updates based on timestamp.
• The MMIS BI integration module triggers the backup service every hour.
• The backup service calls the Java Sqoop client with the claim tables as parameters.
• One-time activity: Sqoop connects to the MMIS DB and imports the
complete table data. It starts with the header table, and the subsequent child tables are
loaded iteratively.
• Sqoop supports an alternate table update strategy called lastmodified
mode. It applies when rows of the source table (MMIS table) may be updated and
each such update sets the value of a last-modified column to the current
timestamp. Only those records are updated on the Hive side, and new records are
inserted in Hive as usual.
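The effect of a last-modified based incremental load can be sketched independently of Sqoop: rows changed since the previous run are upserted into the backup, and the newest timestamp seen becomes the starting point for the next run. In this illustration dictionaries stand in for the MMIS and Hive tables, and the column names `claim_id` and `last_modified` and the function name are hypothetical.

```python
def merge_last_modified(backup: dict, source_rows: list, last_value: str) -> str:
    """Upsert rows modified after last_value; return the new high-water mark.

    Timestamps are ISO-8601 strings, which compare correctly as text.
    """
    new_last_value = last_value
    for row in source_rows:
        if row["last_modified"] > last_value:
            backup[row["claim_id"]] = row        # update existing or insert new
            new_last_value = max(new_last_value, row["last_modified"])
    return new_last_value


backup_table = {
    1: {"claim_id": 1, "status": "OPEN", "last_modified": "2024-01-01T09:00:00"},
}
source_table = [
    {"claim_id": 1, "status": "PAID", "last_modified": "2024-01-01T10:30:00"},
    {"claim_id": 2, "status": "OPEN", "last_modified": "2024-01-01T10:45:00"},
    {"claim_id": 3, "status": "DENIED", "last_modified": "2024-01-01T08:00:00"},
]

next_last_value = merge_last_modified(backup_table, source_table,
                                      "2024-01-01T10:00:00")
```

Claim 3 is skipped in this run because it was not modified after the previous high-water mark; in a real deployment it would already have been copied by the initial full load.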
Design Option 2: Complete MMIS snapshot every time.
• The MMIS BI integration module triggers the backup service every hour.
• The backup service calls the Java Sqoop client with the claim tables as parameters.
• Every time, Sqoop imports the complete data from the MMIS DB: header table first and
child tables next.
• Each primary key is appended with the job trigger time, which is used at
recovery time.
[Figure: Backup and recovery. Labels: MMIS DB; Cloud Data Center - Hadoop; HIVE (Claims Header, Claims Line, Claims Other Tables); Sqoop Import; Sqoop Export; MMIS BI; ESB; BackupService; Regular Interval; RecoveryService; UI - Enter Recovery Point; Load data from mentioned time.]
Recovery:
• If we choose Design Option 1, a recovery time is not required, because the backup is an
exact copy of the MMIS DB.
• If we choose Design Option 2, the recovery time is chosen from the list of trigger times in the
MMIS BI context. The user can choose any one of these times, and the snapshot data
obtained at that time is loaded from the cloud to the MMIS DB.
• If a disaster happens to the MMIS DB, we can use the Recovery module to
load data from the cloud to MMIS.
• Sqoop exports the data from the Hive tables to the MMIS tables.
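For the full-snapshot approach, appending the trigger time to each primary key lets several snapshots coexist in one backup table, and recovery then selects the rows for the chosen time. A minimal sketch follows, with dictionaries standing in for the MMIS and Hive tables. The key format, the trigger-time format (chosen so that it contains no underscore) and the function names are assumptions for illustration.

```python
def snapshot_key(primary_key: str, trigger_time: str) -> str:
    """Surrogate key: original primary key plus the job trigger time."""
    return f"{primary_key}_{trigger_time}"


def take_snapshot(source: dict, backup: dict, trigger_time: str) -> None:
    """Copy every source row into the backup under a time-stamped key."""
    for primary_key, row in source.items():
        backup[snapshot_key(primary_key, trigger_time)] = dict(row)


def list_recovery_points(backup: dict) -> list:
    """Distinct trigger times available for the user to choose from."""
    return sorted({key.rsplit("_", 1)[1] for key in backup})


def recover(backup: dict, trigger_time: str) -> dict:
    """Rebuild the source table as it was at the chosen trigger time."""
    suffix = f"_{trigger_time}"
    return {key[: -len(suffix)]: dict(row)
            for key, row in backup.items() if key.endswith(suffix)}


mmis_claims = {"CLM1001": {"status": "OPEN"}}      # hypothetical source table
hive_backup = {}

take_snapshot(mmis_claims, hive_backup, "20240101T1000")
mmis_claims["CLM1001"]["status"] = "PAID"
mmis_claims["CLM1002"] = {"status": "OPEN"}
take_snapshot(mmis_claims, hive_backup, "20240101T1100")

recovery_points = list_recovery_points(hive_backup)
restored = recover(hive_backup, "20240101T1000")
```

The trade-off against the incremental approach is storage: every trigger stores a complete copy, in exchange for being able to restore the state as of any listed trigger time.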
Hive Claim Backup Sample Data Model:
• There are no differences between the MMIS data model and the Hive data model.
• More or less all are the same for Design Option 1.
• For Design Option 2, all primary keys are appended with an additional
timestamp surrogate key.
Advancing life sciences with IBM reference architecture for genomics
 
Database Systems
Database SystemsDatabase Systems
Database Systems
 
EMC Isilon Scale-Out NAS for In-Place Hadoop Data Analytics
EMC Isilon Scale-Out NAS for In-Place Hadoop Data AnalyticsEMC Isilon Scale-Out NAS for In-Place Hadoop Data Analytics
EMC Isilon Scale-Out NAS for In-Place Hadoop Data Analytics
 
Data warehouse
Data warehouseData warehouse
Data warehouse
 
Introduction to Big Data and Hadoop using Local Standalone Mode
Introduction to Big Data and Hadoop using Local Standalone ModeIntroduction to Big Data and Hadoop using Local Standalone Mode
Introduction to Big Data and Hadoop using Local Standalone Mode
 
BFC: High-Performance Distributed Big-File Cloud Storage Based On Key-Value S...
BFC: High-Performance Distributed Big-File Cloud Storage Based On Key-Value S...BFC: High-Performance Distributed Big-File Cloud Storage Based On Key-Value S...
BFC: High-Performance Distributed Big-File Cloud Storage Based On Key-Value S...
 
Hadoop-based architecture approaches
Hadoop-based architecture approachesHadoop-based architecture approaches
Hadoop-based architecture approaches
 
[IJET-V1I6P11] Authors: A.Stenila, M. Kavitha, S.Alonshia
[IJET-V1I6P11] Authors: A.Stenila, M. Kavitha, S.Alonshia[IJET-V1I6P11] Authors: A.Stenila, M. Kavitha, S.Alonshia
[IJET-V1I6P11] Authors: A.Stenila, M. Kavitha, S.Alonshia
 

Recently uploaded

How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?XfilesPro
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure servicePooja Nehwal
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Allon Mureinik
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024Scott Keck-Warren
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsMark Billinghurst
 
Azure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & ApplicationAzure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & ApplicationAndikSusilo4
 
Key Features Of Token Development (1).pptx
Key  Features Of Token  Development (1).pptxKey  Features Of Token  Development (1).pptx
Key Features Of Token Development (1).pptxLBM Solutions
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Patryk Bandurski
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...shyamraj55
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhisoniya singh
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticscarlostorres15106
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitecturePixlogix Infotech
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
 

Recently uploaded (20)

How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
Azure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & ApplicationAzure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & Application
 
Key Features Of Token Development (1).pptx
Key  Features Of Token  Development (1).pptxKey  Features Of Token  Development (1).pptx
Key Features Of Token Development (1).pptx
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC Architecture
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 

Healthcare payer - Big data integration

  • 3. Abstract Title: MMIS - Big Data Integration
Author: Rajasekaran Kandhasamy (krajasekaranmca@gmail.com)

Overview of Paper:

Introduction:
MMIS/HealthCare Payer applications depend on traditional database models and structured data analytics to meet their needs. These approaches, while adequate in the past, will not suffice for future requirements. They lack the processing capability to load and query multi-terabyte datasets in a timely fashion, and the flexibility to manage unstructured and semi-structured data effectively. Adapting a Big Data platform to the MMIS application will resolve the above issues. This technical paper describes how to integrate MMIS/HealthCare Payer applications with a Hadoop-based Big Data platform.

Note:
MMIS is a large application, so this paper covers only the use cases related to claims. The proposal is not a replacement for the OLTP database approach; it outlines the benefits available from integrating with Hadoop technologies. One non-MMIS application is also covered here: MMIS BI/BIRT/JASPERSOFT, a set of business intelligence analytical tools with chart and report capabilities. Put simply, it is an open-source BI or reporting tool for managing big data activities.

Functional Specification:

Primary Use Cases:

1) Unified File Management (UFM) - Heterogeneous Storage Solutions (HSS) [or] Cloud Data Center (CDC):
UFM is a multi-source data collection and management platform that delivers backup, archive, search, and analytics capabilities to Medicaid or HealthCare Payer applications. It introduces cloud-based options for data that must be kept for extremely long periods of time. Put simply, it consolidates the MMIS file transfer processes under one managed solution. CDC provides connectivity for the flow of data, in the form of files, between providers, state agencies, switch vendors, and the Enterprise System.
Within the context of MMIS, a CDC/HSS describes an interaction between an external entity, system, or service agency (e.g. Switch Vendor, Provider Agency) and the MMIS/Payer applications. This interaction can involve transferring and storing data. Most of these external interfaces are file-based inbound and outbound interfaces whose files arrive in batches. External systems exchange data with the MMIS in different formats, and each data file may contain one or more records. The possible file formats are:
a. X12 files
b. XML files
c. Flat files (delimited and fixed-width data, comma-separated values)
d. Binary data
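As a rough illustration of how an inbound landing zone might sort the four file formats listed above, the following Python sketch classifies a file from its leading bytes. The function name `sniff_format` and the heuristics are assumptions made for this example, not part of the proposal; the only format fact relied on is that an X12 interchange opens with an ISA segment.

```python
def sniff_format(data: bytes) -> str:
    """Guess the format of an inbound file from its first bytes.

    Hypothetical heuristic for illustration only; a production landing zone
    would more likely rely on agreed naming conventions or trading-partner
    configuration.
    """
    head = data[:512]
    # An X12 interchange begins with the ISA (interchange header) segment.
    if head.startswith(b"ISA"):
        return "x12"
    # XML documents begin with '<' once leading whitespace is removed.
    if head.lstrip().startswith(b"<"):
        return "xml"
    # NUL bytes or non-ASCII bytes are treated as binary content.
    if b"\x00" in head:
        return "binary"
    try:
        head.decode("ascii")
    except UnicodeDecodeError:
        return "binary"
    # Anything else is assumed to be a delimited or fixed-width flat file.
    return "flat"


if __name__ == "__main__":
    print(sniff_format(b"ISA*00*          *00*"))
    print(sniff_format(b"TCN,PROC_CD,AMT\n1001,X0001,75.00\n"))
```

Content sniffing is only one possible design; it keeps the sketch independent of any file naming scheme, which the paper does not specify.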
  • 4. Advantages:
- The majority of files in most payer applications are stored on IT-managed file servers that are not always under the direct control of IT. Here, everything is maintained under one umbrella.
- Reduces the licensing, maintenance, and support costs of file servers, and develops a no-vendor-lock-in framework.
- A scalable storage (Hadoop) environment with CDC is a solution that doesn't require a change in platforms or the retraining of IT employees and administrators. CDC delivers comprehensive capabilities more efficiently than ad hoc data/file management systems do, allowing an enterprise to dedicate fewer resources to supporting infrastructure than to innovating, so it can quickly bring its innovation to market.
- Reduces MMIS operational costs with respect to data storage, backup, archive, and maintenance.
- Minimizes analytical latency in a big-data-like environment.
- External systems can connect to UFM/CDC/HSS using any FTP/SFTP client or a REST-based Resource Oriented Architecture (ROA).

2) Unified Claims Archiving (UAC):
UAC preserves information for compliance, legal, business reference, or system optimization purposes. It provides archiving capabilities that Medicaid may not believe it needs now but that, given current archive market trends, will be extremely useful in the near future. Manual and machine-generated data keeps growing, as do file, message, and database sizes, and for a variety of reasons a good portion of this data needs to be archived. Once a claim case is closed, it becomes the responsibility of the MMIS/HealthCare Payer to archive and manage the closed file in compliance with regulations.

Advantages:
- Data associated with claims processing is a good candidate for archival. If a production table grows too large, retrieval time is noticeably affected. Most of the core system screens limit the number of records they will retrieve and display. When tables are large, screens will not display all of the applicable data, and some screens will not function.
- Moving old data from the MMIS OLTP database to Hadoop-based HBase can increase MMIS OLTP performance. A large number of old, unused records in the claims tables decreases performance.
- Historical information for comparative and competitive analysis.
- Enhanced data quality and completeness.
- Supplements disaster recovery plans with another data backup source.

3) Extract, Transform and Load (ETL) Integration:
Most ETL software packages require their own servers, processing, databases, and licenses. They also require setup,
  • 5. configuration, and development by experts in that particular tool, and those skills are not always transferable.

Advantages:
- For instance, the MMIS reference sub-module periodically receives sets of procedure, diagnosis, and other codes as files from CMS. These files are stored in CDC, and Hadoop-based Pig/Hive is used to convert the file data into a format that MMIS understands, at lower cost.
Note: Most existing MMIS implementations use PL/SQL-based ETL to load data into the MMIS DB. If the MMIS should not break its existing flow, use a CDC adapter to get the file from Hadoop instead of from a traditional file server.
- UFM/CDC files can be stored inexpensively in the cloud and processed by Hive for ETL. This is a cost-effective complement to data warehouse solutions, and it reduces risk and cost and/or improves accessibility compared with in-house solutions. Once data is processed and stored in Hive, it makes sense to consider the various file formats available.

4) Process Large Audit or Log Files:
Audits are historical and immutable. MMIS audits can be divided into two categories.
a) Security Audit: The MMIS application logs user actions to an MMIS file, and this file is moved to the Hadoop-based HBase NoSQL DB. Through the MMIS BI application, a user can view who logged in, what actions they performed, and so on.
b) Application Audit: Existing MMIS implementations have the following application audit options: 1) DB triggers, and 2) module-specific code that inserts an audit record for each user operation, e.g. the error codes view history.
The new proposal uses JMS or a queue. A scalable approach, if one is really needed, and one that is completely in line with the J2EE specification, is to use JMS: publish the audit log messages to a message queue, and let a separate process (Flume) take them off the queue and log them in the Cloud Data Center (CDC) based HBase NoSQL database.
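The publish-and-drain pattern described for application audits can be sketched in a few lines. The sketch below is only an analogy: Python's in-process `queue.Queue` stands in for the JMS queue, a consumer thread stands in for Flume, and a plain list stands in for the HBase audit table. The names `publish_audit`, `drain_audit`, and `audit_store`, and the sample user and actions, are hypothetical.

```python
import json
import queue
import threading

audit_queue = queue.Queue()  # stand-in for the JMS queue
audit_store = []             # stand-in for the HBase audit table


def publish_audit(user, action):
    """Application side: enqueue the audit message and return immediately."""
    audit_queue.put(json.dumps({"user": user, "action": action}))


def drain_audit():
    """Consumer side (the role Flume plays in the proposal)."""
    while True:
        message = audit_queue.get()
        if message is None:  # sentinel: stop consuming
            break
        audit_store.append(json.loads(message))


consumer = threading.Thread(target=drain_audit)
consumer.start()
publish_audit("jdoe", "VIEW_ERROR_CODE_HISTORY")
publish_audit("jdoe", "UPDATE_CLAIM_STATUS")
audit_queue.put(None)
consumer.join()
```

The point of the design is that the application only pays the cost of enqueueing a message; writing to the audit store happens in a separate process and can be scaled independently.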
Secondary Use Cases:

5) Near - Continuous data protection (CDP) or Backup & Recovery:
Near-continuous data protection (near-CDP, or N-CDP) is a general term for backup and recovery products that take backup snapshots at set intervals. Near-CDP technology protects data on a nearly continuous basis: rather than running a large monolithic backup overnight, near-CDP products back up data every few minutes, 24 hours a day.

Advantages:
- N-CDP is a Hadoop-based backup solution that efficiently and cost-effectively protects business-critical healthcare data such as databases and files.
  • 6.
- By default, Hadoop provides replication features that enable near-instant recovery from disasters.
- By providing continuous and periodic protection, N-CDP allows organizations to enhance or eliminate their tape-backup infrastructures, minimizing software license and maintenance fees as well as hardware and tape costs.
- Recovery Point Objective (RPO) refers to the point in time in the past to which you will recover.
- Recovery Time Objective (RTO) refers to the point in time in the future at which you will be up and running again.

Difference between CDP and N-CDP:
- CDP backs up the data on every action performed on the data, whereas N-CDP takes a backup at a user-defined regular interval.

Value to Payers:
- Payers can state that they are present in the cloud and big data market.
- Other parties, states, or payers can subscribe to the cloud-based data centers under agreed SLAs, so no separate data center maintenance is required for each state or payer.
- Reduces licensing, maintenance, and support costs. Go with a no-vendor-lock-in framework: wherever possible, avoid licensed software running alongside MMIS and use the proposed open-source tools instead. For example:
a) Informatica - Use CDC-based Hadoop ETL tools.
b) FTP Server - Use CDC.
c) COGNOS or other BI tools - Use the MMIS BI/BIRT/JASPERSOFT-based open-source analytical tool.
d) Archive and backup tool - Use the proposed approach.
- Developing more operational and analytical use cases on top of this integration will move Payers into the business intelligence tool market.
- A SaaS/multi-tenant-enabled MMIS BI application can be used by several customers with low infrastructure maintenance cost.
- Many other big data advantages.
  • 7. Technical Specification:

High Level Architecture:
[Figure: high-level architecture, using a Maryland MMIS tenant as the example. Labelled elements: Providers, Agencies, Others; SFTP over Hadoop; Inbound Landing Zone; Outbound Landing Zone; Claims/Reference/TPL, Member, Provider, Others; Data Mart; HBase/Hive; Flume; Pig/Sqoop; Oozie; Tools/YARN; EHR/Cognos/Others; MMIS BI/BIRT/JASPERSOFT DB; MMIS/HealthCare Payer DB.]
  • 8. 1) Unified File Management (UFM) - Heterogeneous Storage Solutions (HSS) [or] Cloud Data Center (CDC) Flow:

Cloud Data Center:
i. External clients upload files to their dedicated inbound directory through FTP/SFTP.
ii. A customized SFTP server based on Apache MINA is used to support the Hadoop file system.
iii. Once the files are placed, the MMIS listening queues pick up each file from HDFS and start claim processing as per the flow diagram.
iv. MMIS BI should also offer a REST-enabled service for uploading files.

MMIS BI Unified File Management:
i. UFM is one of the sub-modules of the MMIS BI application.
ii. Using SaaS MMIS BI, a user can view the complete inbound and outbound file details for a particular tenant from a single point of access.
iii. Different kinds of charts and metrics are used to monitor day-to-day file activity in CDC.
Note: The name Software as a Service (SaaS) MMIS BI indicates that the application is tenant-aware, so the same application services can be used by other subscribers.

[Figure: UFM/CDC flow, covering the EDI Claim Flow, the Paper Claim Flow, and Reference File Loading. Labelled elements: Emdeon/OCR; SFTP over Hadoop; EDI Claims; Paper Claims; Reference Files; Claims/Ref FTP/SFTP Clients; CDC - HDFS; Img Archival; PL/SQL; HIPAA Validation; Claims Loader; HIPAA Translation; Loading Process; MMIS DB; Claim OCR Data; Ref File Read & Load; SaaS - MMIS BI Unified File Management; File Monitoring.]
  • 9. 2) Unified Claims Archiving (UAC):

MMIS APP:
- Through the MMIS application, a user can perform the different types of archival shown in the diagram.
- Once archival is initiated, the Sqoop module is triggered. This Sqoop module loads the details from the MMIS DB into the CDC-based HBase DB. The claim HBase data structure is described in the sample data model.

MMIS BI:
- Through the MMIS BI application, a user can view different types of claims-related charts. These read data from the HBase NoSQL DB.
- One example is "Operational Metrics", where a user can view paid claims over a certain period of time.
- The application also supports "Analytical Metrics", where a user can view a claim forecast for a certain period of time.

[Figure: claims archiving flow. Labelled elements: MMIS APP; MMIS DB; Apache Sqoop; Cloud Data Center - Hadoop; HBase - NoSQL Database; Claims Data Mart (Archive Table, Backup Table, Other Tables); MMIS BI; Claims Archive Operational Metrics (PAST); Claims Archive Analytical Metrics (FUTURE). Archival types: provider type based, claim type based, date wise, claim status based, quarterly, yearly.]
  • 10. HBase Claim Archival Sample Data Model:
- Around 50 - 80 tables are involved in the claim adjudication process. This section describes how the OLTP claim data model maps to the HBase-based NoSQL data model.
- HBase currently does not do well with anything above two or three column families, so this design has one column family for all header-related tables and one for claim-line-related tables.
- In the model below, all header-related table entries go into HEADER FAMILY and all line-item-related table entries go into LINE FAMILY.
- <ChildTableName_RecordNumber_ColumnName> is the generic format for inserting values. It maps OLTP one-to-many relationships onto the NoSQL table. E.g. an OLTP claim header cutback table entry is stored as CUTBACK_1_QLFR = CUTBACK (table name), 1 (record number), QLFR (column name).
- Row key: <claimfiledate_claimtype_providername>

HBase database table: CLM_ARC_TB
HEADER FAMILY: TCN, CUTBACK_1_QLFR, TPL_1_AMT
LINE FAMILY: ATTACHMENT_1_NAME, PRVDR_1_LCTN, PROCEDURE_1_CODE
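The qualifier convention above can be made concrete with a small Python sketch that flattens a claim's one-to-many child records into two column families. It builds plain dictionaries rather than calling any HBase client, and the input layout, the family keys `HEADER` and `LINE`, and all sample values are assumptions made for illustration.

```python
def row_key(claim_file_date, claim_type, provider_name):
    """Build the row key <claimfiledate_claimtype_providername>."""
    return f"{claim_file_date}_{claim_type}_{provider_name}"


def flatten_children(children):
    """Map {table: [record, ...]} to <Table_RecordNumber_Column> qualifiers."""
    cells = {}
    for table, records in children.items():
        # Record numbers start at 1, as in CUTBACK_1_QLFR.
        for number, record in enumerate(records, start=1):
            for column, value in record.items():
                cells[f"{table}_{number}_{column}"] = value
    return cells


def to_hbase_row(claim):
    """Return the row key plus one dict of cells per column family."""
    return {
        "row_key": row_key(claim["file_date"], claim["type"], claim["provider"]),
        "HEADER": {"TCN": claim["tcn"], **flatten_children(claim["header_children"])},
        "LINE": flatten_children(claim["line_children"]),
    }


# Hypothetical sample claim; every value is invented for the example.
sample_claim = {
    "file_date": "20140115",
    "type": "PROF",
    "provider": "ACME",
    "tcn": "TCN0001",
    "header_children": {
        "CUTBACK": [{"QLFR": "A1"}, {"QLFR": "B2"}],
        "TPL": [{"AMT": "120.50"}],
    },
    "line_children": {"PROCEDURE": [{"CODE": "X0001"}]},
}

row = to_hbase_row(sample_claim)
```

With this layout a second cutback record simply becomes a second qualifier (`CUTBACK_2_QLFR`) in the same row, which is how the one-to-many relationship is preserved without extra tables.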
  • 11. 3) Extract, Transform and Load (ETL) Integration:
Taxonomy codes, HCPCS, Correct Coding Initiative (CCI), Diagnosis Related Group (DRG) codes, Medicare Physician Fee Schedule (MPFS), ICD-10, and Clinical Lab Fee Schedule codes are a few of the interface reference files that a payer receives from CMS/State/Others. All of them are claim reference codes used to adjudicate claims, and they need to be updated periodically in the MMIS DB.

Proposed system:

Design Option 1:
- CMS/State/Others place the reference files in CDC.
- MMIS DB procedures pick up the files from CDC and load the file content into the MMIS DB.

Design Option 2:
- CMS/State/Others place the reference files in CDC.
- An Apache Pig application serves as the ETL model: it describes how a process extracts data from CDC, transforms it according to a rule set, and then loads it into Apache Hive.
- Apache Sqoop loads the details from Apache Hive into the MMIS DB.

[Figure: ETL flow. Labelled elements: CMS/State/Others; Cloud Data Center - Hadoop; HDFS; Apache Pig - ETL (EXTRACT, TRANSFORM, LOAD); HIVE Reference Data Mart; Reference Table; Apache Sqoop; MMIS DB.]
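To make the extract, transform, and load stages of Design Option 2 concrete, here is a minimal Python sketch of the same three stages applied to a small reference file. It is a stand-in for the Pig job, not Pig itself: the pipe-delimited layout, the three-column rule set, the field names, and the sample codes are all assumptions, and a dictionary plays the role of the Hive reference table.

```python
import csv
import io


def extract(text):
    """Extract: read pipe-delimited records from the reference file content."""
    return list(csv.reader(io.StringIO(text), delimiter="|"))


def transform(rows):
    """Transform: apply a simple rule set (3 fields, trimmed, upper-case code)."""
    records = []
    for row in rows:
        if len(row) != 3:  # rule: drop malformed or blank lines
            continue
        code, description, effective_date = (field.strip() for field in row)
        if not code:       # rule: a code is mandatory
            continue
        records.append({
            "code": code.upper(),
            "description": description,
            "effective_date": effective_date,
        })
    return records


def load(records, table):
    """Load: upsert into the target table (a dict standing in for Hive)."""
    for record in records:
        table[record["code"]] = record


# Hypothetical reference file content; codes and descriptions are invented.
reference_file = (
    "X0001|Sample procedure A|2014-01-01\n"
    "  x0002 |Sample procedure B|2014-01-01\n"
    "BADROW\n"
)

reference_table = {}
load(transform(extract(reference_file)), reference_table)
```

Keeping the three stages as separate functions mirrors the diagram's EXTRACT, TRANSFORM, and LOAD boxes and makes the rule set easy to change without touching the file handling.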
  • 12. 4) Process Large Audit or Log Files:
- Whenever a user logs into the system, MMIS starts capturing the user's page actions in a file.
- Apache Flume listens to this file, and whenever a row is added the information is moved to the HBase DB.
- The user can use the BI tool to view the details in the allowed formats.

[Figure: Security Audit. Labelled elements: MMIS APP; Security Log File; Apache Flume (live streaming of all user actions); Cloud Data Center - Hadoop; HBase SECURITY_AUDIT table; MMIS BI (live data showing the logged-in user's actions).]

[Figure: Application Audit. Labelled elements: MMIS APP; JMS Queue; Apache Flume (application audit of all user modifications); Cloud Data Center - Hadoop; HBase APPLICATION_AUDIT table; MMIS BI (live data showing the logged-in user's modifications).]
  • 13. 5) Near - Continuous data protection (CDP) Backup & Recovery:
      Proposed system:
      Backup:
      Design Option 1: One-time full load, then subsequent updates based on timestamp.
      - The MMIS BI integration module triggers the backup service every hour.
      - The backup service calls the Java Sqoop client with the claim tables as parameters.
      - One-time activity: Sqoop connects to the MMIS DB and imports the complete table data. It starts with the header table, and the child tables are then loaded iteratively.
      - For subsequent runs, Sqoop supports an incremental import strategy called lastmodified mode. It applies when rows of the source table (MMIS table) may be updated and each such update sets a last-modified column to the current timestamp. Only those updated records are updated on the Hive side, and new records are inserted into Hive as usual.
      Design Option 2: Complete MMIS snapshot every time.
      - The MMIS BI integration module triggers the backup service every hour.
      - The backup service calls the Java Sqoop client with the claim tables as parameters.
      - On every run, Sqoop imports the complete data from the MMIS DB, header table first and child tables next.
      - Each primary key is appended with the job trigger time, which is used at recovery time.
      [Diagram: Backup: at a regular interval, MMIS BI triggers the BackupService through the ESB, and Sqoop Import copies the MMIS DB tables (Claims Header, Claims Line, Claims Other Tables) into Hive in the Cloud Data Center (Hadoop). Recovery: the user enters a recovery point in the Recovery UI, and the RecoveryService uses Sqoop Export to load the data from the given time back into the MMIS DB.]
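The timestamp-based update of Design Option 1 can be sketched as follows. In Sqoop this corresponds to the `--incremental lastmodified` import mode used with `--check-column` and `--last-value`; the Python below only simulates that logic on in-memory rows. The column names `TCN` and `LAST_MODIFIED` and the function name are assumptions made for the illustration.

```python
from datetime import datetime
from typing import Any, Dict, List

Row = Dict[str, Any]


def incremental_merge(
    backup: Dict[str, Row],
    source_rows: List[Row],
    last_value: datetime,
    key: str = "TCN",
    check_column: str = "LAST_MODIFIED",
) -> datetime:
    """Apply rows changed after `last_value` to `backup`, keyed by `key`.

    Existing keys are overwritten (update) and unseen keys are added
    (insert). Returns the newest timestamp seen, to be used as the
    `last_value` of the next hourly run.
    """
    newest = last_value
    for row in source_rows:
        modified = row[check_column]
        if modified > last_value:
            backup[row[key]] = dict(row)
            newest = max(newest, modified)
    return newest
```

Rows whose last-modified timestamp is not newer than the previous run are skipped, so each hourly run only moves the changed and new claims.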
  • 14. Recovery:
      - If we choose Design Option 1, a recovery time is not required, because the backup is an exact copy of the MMIS DB.
      - If we choose Design Option 2, the recovery time is chosen from the list of trigger times held in the MMIS BI context. The user can pick any one of these times, and the snapshot data captured at that time is loaded from the cloud into the MMIS DB.
      - If a disaster happens to the MMIS DB, we can use the Recovery module to load data from the cloud into MMIS.
      - Sqoop exports the data from the Hive tables to the MMIS tables.
      Hive Claim Backup Sample Data Model:
      - There are no differences between the MMIS data model and the Hive data model.
      - More or less everything is the same for Design Option 1.
      - For Design Option 2, all primary keys are appended with an additional timestamp surrogate key.
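The Design Option 2 keying scheme and the recovery-point selection can be sketched in Python. The helper names, the `YYYYMMDDHHMM` trigger-time format and the sample keys are assumptions for illustration; the sketch also assumes the trigger time itself contains no underscore, so it can be split off the end of the key.

```python
from typing import Any, Dict, List

Row = Dict[str, Any]


def snapshot_key(primary_key: str, trigger_time: str) -> str:
    """Append the job trigger time to a primary key."""
    return f"{primary_key}_{trigger_time}"


def trigger_times(snapshot: Dict[str, Row]) -> List[str]:
    """List the recovery points available in the backup, oldest first."""
    return sorted({key.rsplit("_", 1)[1] for key in snapshot})


def rows_at(snapshot: Dict[str, Row], recovery_point: str) -> Dict[str, Row]:
    """Select one snapshot and restore the original primary keys."""
    suffix = f"_{recovery_point}"
    return {
        key[: -len(suffix)]: row
        for key, row in snapshot.items()
        if key.endswith(suffix)
    }
```

The recovery UI would show the output of `trigger_times` to the user, and the rows returned by `rows_at` for the chosen time are what gets exported back to the MMIS tables.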