Leveraging Mainframe Data for Modern Analytics

“The mainframe is going away” is as true now as it was 10, 20, and 30 years ago. Mainframes remain crucial for handling critical business transactions, but they were built for an era when batch data movement was the norm, and they can be difficult to integrate into today’s data-driven, real-time, analytics-focused business processes and the environments that support them. Until now.

Join experts from Confluent, Attunity, and Capgemini for a one-hour online talk where you’ll learn how to:

Unlock your mainframe data with unique change data capture (CDC) functionality, without incurring the complexity and expense of sending ongoing queries to the mainframe database
Understand how CDC benefits advanced analytics approaches such as deep learning and predictive analytics
Deliver ongoing streams of data in real time to the most demanding analytics environments
Ensure that your analytics environment covers the broadest possible range of data sources and destinations while maintaining true enterprise-grade functionality
Identify use cases that can help you start delivering value to the business, moving from POC to pilot to production


Transcript

  1. Leveraging Mainframe Data for Modern Analytics
  2. Today’s Speakers: Jordan Martz, Director of Technology Solutions, Attunity; David Tucker, Director of Partner Engineering, Confluent; Keith Reid, Principal, Insights and Data: Client Engagement and Practice Leader, Capgemini
  3. Agenda:
     • A quick history of the mainframe
     • A quick history of data migration
     • Attunity: data migration with CDC
     • The Confluent streaming platform, powered by Apache Kafka™
     • Putting it all together
     • Use cases and Replicate demo
     • Answering your questions
     (Image: © Mark Richards)
  4. History of the Mainframe. Big businesses with big needs required big computers. Demands increased just as “second generation” transistor-based computers were replacing vacuum-tube machines in the late 1950s, spurring developments in hardware and software. Manufacturers commonly built small numbers of each model, targeting narrowly defined markets. Why are they called “mainframes”? Nobody knows for sure; there was no mainframe “inventor” who coined the term. Probably “main frame” originally referred to the frames (designed for telephone switches) holding processor circuits and main memory, separate from the racks or cabinets holding other components. Over time, “main frame” became “mainframe” and came to mean “big computer.” (Source: The Computer Museum; image © International Business Machines Corporation (IBM), 1965)
  5. Death of the Mainframe? What became of mainframes? “Mainframes will soon be extinct,” pundits have announced regularly. Yet nobody told the mainframes, which remain alive and well, the backbone of world banking and other business systems. Reliable and secure, mainframes are seldom in the limelight, but one probably approved your last ATM withdrawal or reserved your last airplane ticket. (Source: The Computer Museum; image © International Business Machines Corporation (IBM), 2001)
  6. A quick history of data movement (diagram: sources batch-loaded into a data warehouse that accumulates history)
  7. A quick history of data movement (diagram: sources feeding the data warehouse and its history via CDC)
  8. A quick history of data movement: the ODS (diagram: CDC feeding an operational data store that holds the latest view)
  9. This all changes with streaming / big data platforms (e.g. Kafka and Hadoop) (diagram: CDC from sources into a streaming platform and data lake, feeding in-memory analytics with the latest view and events, point-in-time and end-of-day history, and CEP)
  10. So why does CDC work in a Big Data world?
      • Big Data likes volume and likes history: storage isn't an issue, and history helps machine learning
      • Re-creating any point in time is simple ("8 lines of Scala code" simple; see the sketch below)
      • It is the easiest way to get data without large performance impacts on the source system, reducing data integration concerns
      • It enables very rapid response to transactional events: fraud detection and even consumer response become much simpler
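The point-in-time claim is easy to make concrete. The deck mentions a few lines of Scala; the following is an equivalent rough sketch in Java, assuming recent Kafka clients, a hypothetical changelog topic named mainframe.accounts, and string-serialized records where a null value marks a delete. It replays the topic from the beginning and applies only the changes whose timestamps fall before the chosen cut-off.

    import java.time.Duration;
    import java.time.Instant;
    import java.util.Collections;
    import java.util.HashMap;
    import java.util.Map;
    import java.util.Properties;
    import org.apache.kafka.clients.consumer.ConsumerConfig;
    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.clients.consumer.ConsumerRecords;
    import org.apache.kafka.clients.consumer.KafkaConsumer;
    import org.apache.kafka.common.serialization.StringDeserializer;

    /** Rebuilds the state of a table as of a given instant by replaying CDC records. */
    public class PointInTimeReplay {
      public static void main(String[] args) {
        Instant cutoff = Instant.parse("2017-06-30T23:59:59Z");   // the point in time to rebuild

        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "point-in-time-replay");
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest"); // replay from the start

        Map<String, String> tableAsOfCutoff = new HashMap<>();
        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
          consumer.subscribe(Collections.singletonList("mainframe.accounts")); // hypothetical CDC topic
          while (true) {
            ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
            if (records.isEmpty()) break;                          // naive end-of-replay check for a demo
            for (ConsumerRecord<String, String> rec : records) {
              if (rec.timestamp() > cutoff.toEpochMilli()) continue;  // ignore changes after the cut-off
              if (rec.value() == null) tableAsOfCutoff.remove(rec.key());   // delete
              else tableAsOfCutoff.put(rec.key(), rec.value());             // insert or update
            }
          }
        }
        System.out.println("Rows as of " + cutoff + ": " + tableAsOfCutoff.size());
      }
    }
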
  11. Attunity Platform for Enterprise Data Management (© 2016 Attunity): Attunity Replicate for universal data availability (integrate new platforms), Attunity Compose for data warehouse automation (automate ETL/EDW), and Attunity Visibility for metrics-driven data management (optimize performance and cost). Runs on premises or in the cloud, spanning Hadoop, files, RDBMS, EDW, SAP, and mainframe.
  12. Attunity Replicate
  13. Attunity Replicate (© 2016 Attunity): no manual coding or scripting, automated end to end, optimized and configurable. Capabilities include target schema creation, heterogeneous data type mapping, batch-to-CDC transition, DDL change propagation, filtering, and transformations. Moves data from Hadoop, file, RDBMS, EDW, and mainframe sources into Hadoop, file, RDBMS, EDW, and Kafka targets.
  14. Data replication and ingest made easy (© 2016 Attunity)
  15. Zero-Footprint Architecture (© 2016 Attunity): lower impact on IT.
      • No software agents on sources or targets for mainstream databases
      • Replicate data from hundreds of source systems with simple configuration
      • No software upgrades required at each database source or target
      Capture is log-based, with source-specific optimizations, from Hadoop, file, RDBMS, EDW, and mainframe sources into Hadoop, file, RDBMS, EDW, and Kafka targets.
  16. Heterogeneous: broad support for sources and targets (effective 12/10/2015).
      Sources: RDBMS (Oracle, SQL Server, DB2 LUW, DB2 iSeries, DB2 z/OS, MySQL, Sybase ASE, Informix); Data Warehouse (Exadata, Teradata, Netezza, Vertica, Actian Vector, Actian Matrix); Hadoop (Hortonworks, Cloudera, MapR, Pivotal); Legacy (IMS/DB, SQL M/P, Enscribe, RMS, VSAM); Cloud (AWS RDS, Salesforce).
      Targets: RDBMS (Oracle, SQL Server, DB2 LUW, MySQL, PostgreSQL, Sybase ASE, Informix); Data Warehouse (AWS Redshift, Azure SQL DW, Exadata, Teradata, Netezza, Vertica, Pivotal DB (Greenplum), Pivotal HAWQ, Actian Vector, Actian Matrix, Sybase IQ); Hadoop (Hortonworks, Cloudera, MapR, Pivotal); NoSQL (MongoDB); Cloud (AWS RDS/Redshift/S3, Azure SQL Data Warehouse, Azure SQL Database, Google Cloud SQL, Google Cloud Dataproc); Message Broker (Kafka).
  17. Real-time data migration of mainframe data (© 2016 Attunity)
  18. Confluent: open source enterprise streaming built on Apache Kafka. Open source components: Apache Kafka (Kafka Core, Kafka Connect, Kafka Streams) plus clients, supported connectors, Schema Registry, and REST Proxy. Commercial components: Control Center, auto data balancing, multi-datacenter replication, and 24/7 support. These feed external real-time applications (monitoring, analytics, custom apps, transformations, …) and integrate data systems such as CRM, data warehouses, databases, Hadoop, data integration tools, and mainframes, carrying events such as database changes, log events, IoT data, and web events.
  19. From Big Data to Stream Data. Big Data was "the more the better": value grows with the volume of data. Stream data is "the faster the better": value falls with the age of data. Stream data can be big or fast (Lambda) and will be big and fast (Kappa). Apache Kafka is the enabling technology of this transition. (Diagram contrasts a batch Hadoop pipeline of jobs and tables, a speed-table/batch-table split, and a streams-based design.)
  20. Apache Kafka™ Connect: Effective Streaming Data Capture
  21. Apache Kafka™ Connect: streaming data capture. Connectors move data between Kafka and external systems, with sources such as JDBC databases, MongoDB, and MySQL and sinks such as Elasticsearch, Cassandra, and HDFS. Kafka Connect is fault tolerant, manages hundreds of data sources and sinks, preserves data schemas, is part of the Apache Kafka project, and is integrated with Confluent Platform's Control Center. (A sample connector registration is sketched below.)
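To illustrate how connectors are typically put to work, here is a rough Java sketch that registers a JDBC sink connector with a Kafka Connect worker through its REST interface. The worker URL, topic name, and connection settings are placeholders, and in practice the JDBC sink also needs schema-aware records (for example Avro with the Schema Registry) and database credentials; the same JSON is often posted with curl or managed from Control Center instead.

    import java.io.OutputStream;
    import java.net.HttpURLConnection;
    import java.net.URL;
    import java.nio.charset.StandardCharsets;

    /** Registers a JDBC sink connector with a Kafka Connect worker via its REST API. */
    public class RegisterConnector {
      public static void main(String[] args) throws Exception {
        // Connector name and configuration; topic and connection.url are placeholders.
        String body = "{"
            + "\"name\": \"mainframe-accounts-jdbc-sink\","
            + "\"config\": {"
            + "\"connector.class\": \"io.confluent.connect.jdbc.JdbcSinkConnector\","
            + "\"tasks.max\": \"1\","
            + "\"topics\": \"mainframe.accounts\","
            + "\"connection.url\": \"jdbc:postgresql://localhost:5432/analytics\","
            + "\"auto.create\": \"true\""
            + "}}";

        URL url = new URL("http://localhost:8083/connectors");   // default Connect REST port
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setRequestMethod("POST");
        conn.setRequestProperty("Content-Type", "application/json");
        conn.setDoOutput(true);
        try (OutputStream out = conn.getOutputStream()) {
          out.write(body.getBytes(StandardCharsets.UTF_8));
        }
        System.out.println("Connect responded with HTTP " + conn.getResponseCode());
      }
    }

Removing the connector later is a DELETE request to the same REST endpoint under /connectors/<name>.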
  22. Kafka Connect library of connectors, grouped into databases, datastores/file stores, analytics, and applications/other. An asterisk denotes connectors developed at Confluent and distributed with the Confluent Platform, with extensive validation and testing performed.
  23. Apache Kafka™ Streams: Distributed Stream Processing Made Easy
  24. Architecture of Kafka Streams, a part of Apache Kafka: a Streams application runs alongside ordinary producers and consumers, reading from and writing to topics in the Kafka cluster.
      Key benefits: no additional cluster; easy to run as a service; supports large aggregations and joins; security and permissions fully integrated from Kafka.
      Example use cases: microservices, continuous queries, continuous transformations, event-triggered processes.
  25. Kafka Streams: the easiest way to process data in Apache Kafka™.
      Example use cases: microservices; large-scale continuous queries and transformations; event-triggered processes; reactive applications; customer 360-degree view, fraud detection, location-based marketing, smart electrical grids, fleet management, and more.
      Key benefits of Apache Kafka's Streams API: build apps, not clusters (no additional cluster required); elastic, highly performant, distributed, fault-tolerant, and secure; equally viable for small, medium, and large-scale use cases; "run everywhere", integrating with your existing deployment strategies such as containers, automation, and cloud. (A minimal application is sketched below.)
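A minimal sketch of what "build apps, not clusters" looks like in practice, assuming a reasonably recent Kafka Streams version and hypothetical topic names: the application below counts change events per key and publishes the running counts back to Kafka, running as an ordinary Java process with no separate processing cluster.

    import java.util.Properties;
    import org.apache.kafka.common.serialization.Serdes;
    import org.apache.kafka.streams.KafkaStreams;
    import org.apache.kafka.streams.StreamsBuilder;
    import org.apache.kafka.streams.StreamsConfig;
    import org.apache.kafka.streams.kstream.KStream;
    import org.apache.kafka.streams.kstream.KTable;
    import org.apache.kafka.streams.kstream.Produced;

    /** A plain Java application: counts change events per key and publishes the counts. */
    public class ChangeEventCounts {
      public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "change-event-counts");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

        StreamsBuilder builder = new StreamsBuilder();
        KStream<String, String> changes = builder.stream("mainframe.accounts"); // hypothetical CDC topic
        KTable<String, Long> counts = changes.groupByKey().count();             // running count per key
        counts.toStream().to("account-change-counts", Produced.with(Serdes.String(), Serdes.Long()));

        KafkaStreams streams = new KafkaStreams(builder.build(), props);
        streams.start();                                                        // no extra cluster needed
        Runtime.getRuntime().addShutdownHook(new Thread(streams::close));
      }
    }
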
  26. Architecture example, before: complexity for development and operations, and a heavy footprint. (1) Capture business events in Kafka; (2) process events with separate, special-purpose clusters; (3) write results back to Kafka.
  27. Architecture example, with Kafka Streams: an app-centric architecture that blends well into your existing infrastructure. (1) Capture business events in Kafka; (2) process events fast, reliably, and securely with standard Java applications; (3a) write results back to Kafka; (3b) external apps can directly query the latest results (see the sketch below).
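Step 3b, letting external apps query the latest results directly, is Kafka Streams' interactive-query feature. A minimal sketch, assuming a running KafkaStreams instance whose topology materialized its counts into a key-value store named account-change-counts-store; both names are illustrative, and the store-lookup API has shifted slightly across Kafka versions.

    import org.apache.kafka.streams.KafkaStreams;
    import org.apache.kafka.streams.state.QueryableStoreTypes;
    import org.apache.kafka.streams.state.ReadOnlyKeyValueStore;

    /** Looks up the latest aggregated value for a key directly from a running Streams app. */
    public class LatestStateLookup {

      // `streams` is a running KafkaStreams instance whose topology named its count store,
      // e.g. .count(Materialized.as("account-change-counts-store")).
      public static Long latestCountFor(KafkaStreams streams, String accountKey) {
        ReadOnlyKeyValueStore<String, Long> store =
            streams.store("account-change-counts-store", QueryableStoreTypes.keyValueStore());
        return store.get(accountKey);   // up-to-the-second state, no extra database required
      }
    }
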
  28. Putting it all together: CDC with Attunity on Confluent Enterprise
  29. Back to the high-level platform integration… (diagram: mainframe sources captured via CDC into a streaming platform and data lake, feeding in-memory analytics with the latest view and events, point-in-time and end-of-day history, and CEP)
  30. … made real in the Attunity / Confluent data flow:
      • Attunity publishes DB changes to Kafka
      • "Raw" connectors (e.g. FileSink or HDFS) persist change records where needed
      • A Kafka Streams app reads the CDC topic and transforms it (as necessary) for other data systems (see the sketch below)
      • Sink connectors (JDBC or key-value, as needed) persist that transformed data for other uses
      (Diagram: Attunity Replicate produces into the Kafka cluster; raw sinks and data-system sinks consume from it alongside other producers and consumers.)
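As a sketch of the transform step in this flow, assume Attunity Replicate publishes change records to a topic named mainframe.accounts and that deletes arrive as null values; the topic names and record shape here are illustrative rather than Attunity's documented format. The Streams app below drops the delete markers and republishes everything else to a "cleansed" topic that a JDBC or key-value sink connector can then persist.

    import java.util.Properties;
    import org.apache.kafka.common.serialization.Serdes;
    import org.apache.kafka.streams.KafkaStreams;
    import org.apache.kafka.streams.StreamsBuilder;
    import org.apache.kafka.streams.StreamsConfig;

    /** Reads raw CDC records, filters out deletes, and feeds a topic for downstream sink connectors. */
    public class CdcCleanser {
      public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "cdc-cleanser");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

        StreamsBuilder builder = new StreamsBuilder();
        builder.stream("mainframe.accounts")                       // raw CDC topic published by Replicate
            .filter((key, value) -> value != null)                 // drop delete markers (tombstones)
            .to("mainframe.accounts.cleansed");                    // consumed by JDBC / key-value sinks

        KafkaStreams streams = new KafkaStreams(builder.build(), props);
        streams.start();
        Runtime.getRuntime().addShutdownHook(new Thread(streams::close));
      }
    }
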
  31. Use cases:
      Query off-load: the mainframe system accepts operational updates, Attunity CDC publishes table updates to Kafka, and certified Confluent connectors replicate the tables to other data systems for read-only queries. Business value: greater analytics flexibility at lower cost, without disrupting the operational system.
      Enhanced security: mainframe audit trails are published to Kafka, syslog and other access events are published to other topics, and events are correlated via LogStash or similar tools. Business value: enhanced threat detection and end-to-end workflow auditing.
      Cross-system integration: a Kafka Streams application joins customer data from the mainframe with customer-specific mobile information, and external applications use interactive queries to leverage up-to-the-second customer state. Business value: improved customer engagement and more efficient marketing spend.
  32. Attunity Replicate Demo
  33. Thanks! Any questions? References:
      • http://discover.attunity.com/knowledge-brief-leveraging-mainframe-data-for-modern-analytics.html
      • http://confluent.io/product/connectors
      • https://www.capgemini.com/resources/video/transform-to-a-modern-data-landscape
