
The Stream is the Database - Revolutionizing Healthcare Data Architecture

Published in: Technology

  1. The Stream is the Database - Revolutionizing Healthcare Data Architecture. Brad Anderson, VP Big Data Informatics, Liaison; Will Ochandarena, Director of Product, MapR
  2. About Us. Brad: Now: • Head of Data Management @ Liaison • Board Member @ OnKöl. Before: • SE @ MapR • Founder @ Heartbyte • Founder @ Verdeeco. Will: Now: Product guy for Streams @ MapR. Before: • OpenStack product guy @ AMD/SeaMicro • Network product guy @ Cisco
  3. Agenda • “Stream System of Record” - A Techie Concept (Will) • Applied “Stream System of Record” @ Liaison (Brad)
  4. “Stream System of Record” Concept
  5. What’s a Stream Again? A stream is an unbounded sequence of events carried from a set of producers to a set of consumers. Diagram labels: Producers, Events, Events_Stream, Consumers.
  6. What’s a Stream Again? Unlike with a queue, events are persisted even after they’re delivered. Events are delivered in the order they are received, like a queue.
  7. What Does That Have to Do With a Database? Imagine each event as a change to an entry in a database.
     Stream DMV_Updates:
     0: { WillO : {City : Mountain View}, ts : 7/5/2009 04:01:01, src : dmv201 }
     1: { BradA : {City : Atlanta}, ts : 5/11/2010 05:11:31, src : dmv1341 }
     2: { BradA : {Points : +2}, ts : 6/22/2011 03:31:10, src : officer1213 }
     3: { WillO : {City : San Jose}, ts : 11/1/2012 04:01:01, src : dmv1661 }
     Table (DL_ID | City | Points):
     WillO | Mountain View, later San Jose | 0
     BradA | Atlanta | 0, later 2
  8. Streams and Databases in Harmony. Diagram labels: Inserts, Updates; Key-Val, Document, Graph, Wide Column, Time Series, Relational, ???
  9. What Else Do I Use My Stream For? • Lineage - “how did BradA’s points get so high?” • Auditing - “who added points to BradA’s license?” • History - “where did WillO used to live?” • Integrity - “can I trust this data hasn’t been tampered with?” • Yup - Streams are immutable
     0: { WillO : {City : Mountain View}, ts : 7/5/2009 04:01:01, src : dmv201 }
     1: { BradA : {City : Atlanta}, ts : 5/11/2010 05:11:31, src : dmv1341 }
     2: { BradA : {Points : +2}, ts : 6/22/2011 03:31:10, src : officer1213 }
     3: { WillO : {City : San Jose}, ts : 11/1/2012 04:01:01, src : dmv1661 }
  10. Which Makes a Better System of Record? Which of these can be used to reconstruct the other?
     Stream:
     0: { WillO : {City : Mountain View}, ts : 7/5/2009 04:01:01, src : dmv201 }
     1: { BradA : {City : Atlanta}, ts : 5/11/2010 05:11:31, src : dmv1341 }
     2: { BradA : {Points : +2}, ts : 6/22/2011 03:31:10, src : officer1213 }
     3: { WillO : {City : San Jose}, ts : 11/1/2012 04:01:01, src : dmv1661 }
     Table (DL_ID | City | Points):
     WillO | San Jose | 0
     BradA | Atlanta | 2
  11. What Do I Need For This to Work? • Infinitely persisted events • A way to query your persisted stream data • An integrated security model across the stream and databases
  12. Applied “Stream System of Record” @ Liaison
  13. Liaison ALLOY™ Platform. Data Integration: ingest, syndicate, transform. Data Management: master, deduplicate, harmonize, relate, merge, tokenize, store / persist, analyze, summarize, report, distill, recommend, explore, query, sandbox, batch, transform, learn, traverse
  14. ALLOY Health: Exchange State HIE Clinical Data Viewer Reporting and Analytics Clinical Data Financial Data Provider Organizations
  15. 2000+ Practices, 200+ Labs, 30,000+ Clinicians. OrdersAnywhere: PORTAL (no EHR), EHR with HL7 ONLY, EHR with WORKFLOW INTEGRATION, RADIOLOGY, LAB
  16. This is a PAIN IN THE ASS: COMPLIANCE, SECURITY CONTROLS, COMPLIANCE FEATURES, PRIVACY. PCI DSS 3.0, 21 CFR Part 11, SSAE16 / SOC2, HIPAA/HITECH
  17. This is a PAIN IN THE ASS. First: Fred, Last: Smith, Age: 85, Zip: 941xx
  18. WHY NOW? http://bit.ly/29aBatK
  19. WHY NOW? 2014 FQ4 profit: $ -440 M. Total Cost Estimate: $ -12 B
  20. WHY NOW?
  21. The Promised Land. Diagram labels: App, API, pre-processor, Immutable Log, Raw Data, Document Log (MapR), log, CEP, workflow, micro service, Compliance Auditor. Materialized views: Key/Value (MapR), Search Engine (ElasticSearch), Graph (ArangoDB), Time Series (OpenTSDB)
  22. The Promised Land • Auditor smiley faces • Data Lineage • Audit Logging • Wire-level encryption • At Rest encryption • Replication • Disaster Recovery • EU – data can’t leave • Non-Stream / Non-“Big Data” • Software Development Lifecycle • System Hardening • Separation of Concerns • Dev vs Ops • Patch Management
  23. Solution • Design/architecture solved some • Streams • Data Lineage/System of Record • Kappa Architecture (Kreps/Kleppmann) • MapR solved others • Unified Security • Replication DC to DC • Converge Kafka/HBase/Hadoop to one cluster • Multi-tenancy (lots of topics, for lots of tenants)
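
Slides 7, 9 and 10 argue that the table can be rebuilt from the stream, and that the stream also answers history and audit questions the table cannot. A minimal Python sketch of that idea follows. It uses the four DMV_Updates events from the slides; the dict layout and the function names (materialize, history, audit) are assumptions made for illustration, not anything taken from MapR Streams or the Liaison platform.

```python
# Events copied from the DMV_Updates example on the slides.
# The field names (offset, key, field, value, ts, src) are an assumed layout.
EVENTS = [
    {"offset": 0, "key": "WillO", "field": "City", "value": "Mountain View",
     "ts": "7/5/2009 04:01:01", "src": "dmv201"},
    {"offset": 1, "key": "BradA", "field": "City", "value": "Atlanta",
     "ts": "5/11/2010 05:11:31", "src": "dmv1341"},
    {"offset": 2, "key": "BradA", "field": "Points", "value": "+2",
     "ts": "6/22/2011 03:31:10", "src": "officer1213"},
    {"offset": 3, "key": "WillO", "field": "City", "value": "San Jose",
     "ts": "11/1/2012 04:01:01", "src": "dmv1661"},
]


def materialize(events):
    """Replay events in offset order to rebuild the DL_ID / City / Points table."""
    table = {}
    for ev in sorted(events, key=lambda e: e["offset"]):
        row = table.setdefault(ev["key"], {"City": None, "Points": 0})
        if ev["field"] == "Points":
            # "+2" is a delta applied to the running total, not an absolute value.
            row["Points"] += int(ev["value"])
        else:
            # Other fields are overwritten: the last write wins.
            row[ev["field"]] = ev["value"]
    return table


def history(events, key, field):
    """History: every value a field has held, oldest first."""
    ordered = sorted(events, key=lambda e: e["offset"])
    return [(e["ts"], e["value"]) for e in ordered
            if e["key"] == key and e["field"] == field]


def audit(events, key, field):
    """Auditing: which sources wrote to this field."""
    return [e["src"] for e in events
            if e["key"] == key and e["field"] == field]


if __name__ == "__main__":
    print(materialize(EVENTS))               # current table, as on slide 10
    print(history(EVENTS, "WillO", "City"))  # "where did WillO used to live?"
    print(audit(EVENTS, "BradA", "Points"))  # "who added points to BradA's license?"
```

Replaying the log yields the slide 10 table (WillO in San Jose with 0 points, BradA in Atlanta with 2 points). The reverse does not work: the table alone cannot tell you that WillO once lived in Mountain View or that officer1213 added the points.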
