• Like

How UnitedHealth Group Integrated Open Data into its MongoDB-Based Information System

  • 322 views
Uploaded on

 

More in: Technology
  • Full Name Full Name Comment goes here.
    Are you sure you want to
    Your message goes here
    Be the first to comment
    Be the first to like this
No Downloads

Views

Total Views
322
On Slideshare
0
From Embeds
0
Number of Embeds
2

Actions

Shares
Downloads
9
Comments
0
Likes
0

Embeds 0

No embeds

Report content

Flagged as inappropriate Flag as inappropriate
Flag as inappropriate

Select your reason for flagging this presentation as inappropriate.

Cancel
    No notes for slide
  • Our business model is referred to as an “open core” business model. We provide subscription-based licenses on top of our OpenSource, Apache licensed products.
    We also offer professional services and training

    You can see form the chart that our business has grown dramatically since its inception and that we continue to grow at an impressive rate: on average we are more than doubling every year.

    We are recognized for our leadership and innovation broadly in the integration market, but also specifically in each of the products areas just mentioned. Both Gartner and Forrester recognize us as a visionary leader in these product categories.
  • Our area is focused on Encounter Data Processing and Incremental Data Capture on the Medicare and Retirement side.

  • Clinical Performance and Compliance:
    Encounter Data Processing
    Incremental Data Capture
  • Support – Not all Open Source Vendors Support MS Access/Excel Ingestion Equally
    Shared Repository – Source Control / Continuous Integration Hooks
    TAC – All of the Scheduling Bits
    Distant Run – Server Side Processing w/Studio Integration/Stats
    AMC – Run Time Stats
  • Production MongoDB Since December 2012
  • Production MongoDB Since December 2012
  • RAF Impacts:
    Chart Review
    MWOV
    COI
    PAFs
    Hospital Data Capture
    Encounter Submission
    Claims Verification
    RADV
  • The Landscape:

    Flat Files
    MS Excel
    MS Access
    Oracle
    SAS
    SQL Server
    SAP BO / Crystal Reports Server

    Big Data
    MongoDB
    Hadoop
  • Data Load  Report Output  Notifications & Other Triggered Actions

    Talend has direct integration with MongoDB
  • The start of modeling for RADAR – 4QTR2012
    Trading Partners
    Inventory / Tracking
    Transactional Details – Multiple Inbound Versions / Attribute Disparity
  • NPPES  3NF would ~ 10 Tables

Transcript

  • 1. © Talend 2012 Ecosystem How UnitedHealth Group Integrated Open Data into its MongoDB-Based Information System Cedric Carbone CTO at Talend Twitter: @carbone Matt Axsom Lead Architect at Optum matthew.j.axsom@optum.com
  • 2. © Talend 2012 Key Facts • Founded in 2006 • $102m raised since inception • 400+ employees in 7 countries • Open Core business model • Subscription license • Services & training Solutions • Highly scalable integration solutions addressing Big Data, Application Integration, Data Integration, Data Quality, MDM, BPM • Recognized as a visionary leader in the integration market by Gartner and Forrester Talend Overview Growth-Driven Deployment Model 2007 2008 2009 2010 2011 2012 2013 BRAND AWARENESS VIBRANT COMMUNITY MONETIZATION CUSTOMER LOYALTY
  • 3. © Talend 2012 The Talend Difference © Copyright Talend 2011. All rights reserved.
  • 4. © Talend 2012 Talend Open Studio
  • 5. © Talend 2012 Optum Overview ➜ Clinical Performance and Compliance • Encounter Data Processing • Incremental Data Capture © Copyright Talend 2011. All rights reserved.
  • 6. © Talend 2012 Talend Usage ➜ Talend can be leveraged for combined process automation, orchestration, and execution to get increased business continuity and massive ROI ➜ Empower your business users to make changes to their analytics and business intelligence jobs with little or no IT involvement © Copyright Talend 2011. All rights reserved.
  • 7. © Talend 2012 RADAR Overview ➜ RADAR – Risk Adjustment Data Acquisition and Analysis Repository • Medicare and Retirement Claims Related Data • EHR – Electronic Health Record • EMR – Electronic Medical Record • HL7 – Health Level 7 © Copyright Talend 2011. All rights reserved.
  • 8. © Talend 2012 RADAR Overview – Cont. ➜ Solution Benefits • Trading Partner File Submissions Tracked via GridFS • Transactions via Collections, Correlated w/Metadata • Downstream File Outputs are Stored in GridFS • Audit Traceability from Cradle to Grave Simplified Auditability / Research / Analysis of Submission Data by Storing in a Consistent State / Platform Removing the Reliance on Network Shares, etc. © Copyright Talend 2011. All rights reserved.
  • 9. © Talend 2012 Trading Partner Submissions Commercial Payers Transaction Types Provider Direct Hospital Data Capture Chart Review Delete Submissions Government Payers Hospitals Physicians Practice Management/HIS Clearinghouses Portal / Direct Direct Direct Direct Connect Direct Connect Clearinghouses Commercial Payers Direct API
  • 10. © Talend 2012 RDBMS RADAR Architecture MongoDB Trading Partner Submission CMS Input Data Source Translation ETL Talend/ETL Trading Partner Encounters Submissions GridFS Workflows Rules CMS Submission Status Detailed Reports TX Submission MMR/MOR
  • 11. © Talend 2012 How did we get here? ➜ Talend • 30 Minutes Anyone? • Orchestration – 3rd Party ➜ MongoDB • File Systems • The Data Intake Problem • RDBMS Modeling Woes ➜ Hadoop • HBase Does NoSQL Right? • Hadoop Snapshot © Copyright Talend 2011. All rights reserved.
  • 12. © Talend 2012 30 Minutes Anyone? SELECT * FROM TX_DETAIL WHERE DATE = ‘20140623’; Requirements: • Input Date Parameter • Output to Excel • Run Daily • Status Email © Copyright Talend 2011. All rights reserved.
  • 13. © Talend 2012 Orchestration – 3rd Party ➜ SAS • Autoexec Injection • System calls / error trapping ➜ SSIS / Other • System calls / error trapping ➜ SAP Web Hooks • REST • SOAP No Utility Scripting Required © Copyright Talend 2011. All rights reserved.
  • 14. © Talend 2012 File Systems ➜ Solution Options? • BFILE – Link / Pointer to File System Object • HDFS – Hadoop Distributed File System • GridFS - MongoDB © Copyright Talend 2011. All rights reserved.
  • 15. © Talend 2012 The Data Intake Problem Data Source 1 Application A Client 6 Data Client 2 Data Client 4 Data Client 3 Data Client 5 Data Application D Application B Application E Application C MS Access Flat File PDF Custom File Format UNIX SAS Flat File Image SQL Loader Custom Code Custom Code Oracle Data Integrator UNIX SSIS
  • 16. © Talend 2012 RDBMS Modeling Woes © Copyright Talend 2011. All rights reserved.
  • 17. © Talend 2012 HDFS2 (Redundant, Reliable Storage) YARN (Cluster Resource Management) BATCH (MapReduce) INTERACTIVE (Tez) STREAMING (Storm, Spark) GRAPH (Giraph) NoSQL (MongoDB) Events (Falcon) ONLINE (HBase) OTHER (Search) Tap – Transform – Deliver TRANSFORM (Data Refinement) PROFILE PARSEMAP CDCCLEANSE STANDARD- IZE MACHINE LEARNING MATCH TAP (Ingestion) SQOOP FLUME HDFS API Storm HIVE 800+ DELIVER (as an API) ActiveMQKaraf NoSQLCXF/Camel KafkaStorm MetaSecurity MDMiPaaS GovernHA
  • 18. © Talend 2012 Generate Pure Map Reduce
  • 19. © Talend 2012 Pig Latin Script generation
  • 20. © Talend 2012 HiveQL generation
  • 21. © Talend 2012 HBase Does NoSQL Right? ➜ Availability and Agility • HDFS Reliance – Upgrade Downtime • HBase Production – 2014Q1 ➜ Operational Simplicity • MongoDB Backup & Restore ➜ Secondary Indexing (HBase December 2012) • Primary Key Only = Full Table Scan © Copyright Talend 2011. All rights reserved.
  • 22. © Talend 2012 Hadoop Snapshot ➜ Existing Environment – 20 Table Join – ~4B Total Rows – ~865K Row Input Driver – ~100 Hour Execution ➜ 5 Node Cluster – Hive Query ~18 Hours – Hive Tuned ~45 Minutes © Copyright Talend 2011. All rights reserved.
  • 23. © Talend 2012 24 Hadoop Usage ➜ OLTP Platform Data Load to Hadoop • File to HDFS (via Talend) • Sqoop to HDFS / Hive (via Talend) ➜ Talend • Hive – SQL 92ish • Pig - Transformation • Map-Reduce (Data Read / Shuffle & Sort) Low Level Java • HCatalog* – Shared Schema / Data Typing • Partition Notifications ➜ SAS • Hive ➜ Tableau / SAP BO 4 • Hive
  • 24. © Talend 2012 RADAR Solution – Current State ➜ 100+ Jobs in TAC – Daily / Weekly / Monthly ➜ Moving / Migrating Jobs to Data Services (Real Time) • Leveraging tMOM and Camel ➜ Big Data • Migrating all core ODS to MongoDB • ETL/ELT and Analytics offload to Hadoop © Copyright Talend 2011. All rights reserved.
  • 25. © Talend 2012 RADAR Solution – Future ➜ User Interface (Meteor/Node.js) • Talend Jobs & Services Exposure / Interaction • Self Service – Audit and Traceability – Big Data Mashups ➜ Expand Data Services • API/REST/SOAP ➜ Big Data • DI / DQ YARN Tasks © Copyright Talend 2011. All rights reserved.
  • 26. © Talend 2012 NPI Engine ➜ NPI - National Provider Identifier • NPI is a Health Insurance Portability and Accountability Act (HIPAA) Administrative Simplification Standard. The NPI is a unique identification number for covered health care providers. ➜ Monthly NPI File Size is ~ 4GB • Contains 4+ million records • 329 Columns • 15 Taxonomies (75 columns – 5 column for each) • 50 Other Provider Identifiers (200 columns – 4 column for each) • 2 Addresses • 3 Names • Additional Attributes © Copyright Talend 2011. All rights reserved.
  • 27. © Talend 2012 NPPES Dataset ➜ NPI Permissible Uses • Cross Reference for Fraud Waste and Abuse • Debt Collection • Transaction Processing and Provider Communication • Coordination of Benefits • Provider to Patient Medical Record Linkage © Copyright Talend 2011. All rights reserved.
  • 28. © Talend 2012 NPPES Dataset – Cont. ➜ Why we care • Source of Truth for NPI • PECOS + NPPES like Mashups • Augment With Internal Data SSN/TIN • Provider Lookup Service • Add Geo Location • Augment With Internal Provider Data • Excluded Provider Services • Fraud Prevention and Detection © Copyright Talend 2011. All rights reserved.
  • 29. © Talend 2012 NPPES and RADAR ➜ OIG List of Excluded Individuals/Entities ➜ General Services Administration’s Excluded Parties List System (Retired)  SAM (System for Award Management) ➜ Acceptable Physician Specialty Types ➜ Possible Additions • Death Master • Provider Data Feeds © Copyright Talend 2011. All rights reserved.
  • 30. © Talend 2012 Talend Job © Copyright Talend 2011. All rights reserved.
  • 31. © Talend 2012 Output © Copyright Talend 2011. All rights reserved.
  • 32. © Talend 2012 Output "taxonomies" : { "207X00000X" : { "licence" : "12637", "primary" : "Yes", "state" : "NE" } }, "addresses" : { "Mailing Address" : { "addresstype" : "Mailing Address", "addresstype_id" : 1, "line1" : "PO BOX 2168", "line2" : "", "city" : "KEARNEY", "state" : "NE", "postal_code" : "688482168", "country" : "US", "telephone" : "3088652512", "fax" : "3088652506" }, "Practice Location Address" : { "addresstype" : "Practice Location Address", "addresstype_id" : 2, "line1" : "3500 CENTRAL AVE", "line2" : "", "city" : "KEARNEY", "state" : "NE", "postal_code" : "688472944", "country" : "US", "telephone" : "3088652512", "fax" : "3088652506" } } } © Copyright Talend 2011. All rights reserved.
  • 33. © Talend 2012 Questions
  • 34. © Talend 2012 Acronyms ➜ RADAR – Risk Adjustment Data Acquisition and Analysis Repository ➜ ASM – Alternative Submission Method ➜ NPI – National Provider Identifier ➜ NPPES – National Plan and Provider Enumeration System ➜ CMS – Centers for Medicare & Medicaid Services ➜ PECOS – Provider Enrollment Chain and Ownership System © Copyright Talend 2011. All rights reserved.
  • 35. © Talend 2012 References ➜ http://oig.hhs.gov/oei/reports/oei-07-09-00440.pdf - Daniel R. Levinson, Inspector General ➜ http://www.cms.gov/Regulations-and- Guidance/HIPAA-Administrative- Simplification/NationalProvIdentStand/DataDissemina tion.html © Copyright Talend 2011. All rights reserved.
  • 36. © Talend 2012 Code ➜ https://github.com/Optum/MongoDBWorld2014 - Talend Code for NPI Load / Read See Reference Links for File Specifications, etc. © Copyright Talend 2011. All rights reserved.
  • 37. © Talend 2012 Session Links ➜ http://nppes.viva-it.com/NPI_Files.html - NPPES Dataset ➜ http://www.cms.gov/Regulations-and- Guidance/HIPAA-Administrative- Simplification/NationalProvIdentStand/DataDissemina tion.html - NPPES File Layout ➜ http://www.cms.gov/Medicare/Provider-Enrollment- and- Certification/MedicareProviderSupEnroll/Taxonomy.ht ml - Taxonomy Crosswalk © Copyright Talend 2011. All rights reserved.
  • 38. © Talend 2012 Session Links – Cont. ➜ http://www.csscoperations.com/internet/cssc3.nsf/D ocsCat/CSSC~CSSC%20Operations~Risk%20Adjustmen t%20Processing%20System~References~94LTWU2832 ?open&navmenu=Risk%5EAdjustment%5EProcessing %5ESystem%7C%7C%7C%7C – Acceptable Physician Specialty Types © Copyright Talend 2011. All rights reserved.