Mainframes | A Critical Source of Big Data

71%

30 Billion

Fortune 500
Top 25
World Banks

Bus. Transactions / day

9

o...
How Mainframes Work
Copy Application Analysis - TCB Time & Count
70,000

60.0

60,000

50.0

50,000
40.0
40,000
30.0
30,00...
The Hadoop Opportunity

Produce POWERFUL INSIGHTS by
combining behavioral data with granular
mainframe data
Increase BUSIN...
Perception: Just Call the Mainframe Guy…

Images: http://monkeestv.tripod.com/BatMonkee/

5
Reality: Far Away… So Close!

Reality
SMS
Compression

SMS
Compression

SMS
Compression

DB Tables,
Flat Files

DB Tables,...
Suits & Hoodies – The Most Unlikely Duo
Integration • Connectivity
• Data conversion (EBCDIC vs ASCII)
Gaps
Expertise
Gaps...
Bridging the Gap Between Big Iron and Big Data

+
A Smarter Approach to BIG
Mainframe Data!

Syncsort DMX-h ETL Edition

C...
Cloudera
The Standard for Apache Hadoop™ in the Enterprise
CLOUDERA UNIVERSITY
ADMINISTRATOR TRAINING
DEVELOPER TRAINING

...
Why Cloudera?

 Assurance – Remove Risk
 Expertise – Maximize Value

Meets Enterprise Requirements
Interactive SQL

Data...
Why Syncsort?
For 40 years we have been helping companies solve their big data
issues…even before they knew the name Big D...
Smart Contributions to Improve Hadoop
Augmenting Critical Batch
Processing Capabilities
JIRA

Description

2452

Allow Ext...
The Smarter Approach to Hadoop ETL… and Mainframe
 Connect – One tool to connect all your data
 Translate - Best in clas...
Integration - Why is Working with Mainframe Data So Hard?

File Definitions
(Metadata) in
Cobol Copybooks

EBCDIC
A B Y Z
...
Cloudera + Syncsort: Smarter Connectivity… Also for Mainframe
Because Mainframe Is Big Data Too!

Connect

• Read files di...
Iron-clad Security. Zero Pain. Zero Mainframe Footprint

Mainframe Provides Iron-clad Security…
So, Why compromise?
Other ...
The Economics of Data
Cost of managing 1TB of data
$20,000 – $100,000

$15,000 – $80,000

$250 – $2,000

Mainframe

EDW

H...
Smarter Deployment, Monitoring & Administration…
…through Cloudera Manager

1
Monitor
2
Diagnose
3
Integrate
4
Manage

Eas...
Get a 360° View of Your Cluster, Including DMX-h Logs
View service health
& performance
Get host-level
snapshots
Monitor &...
Understanding Mainframe Data at Major US Bank
Before: Manual Effort

Weeks

After: DMX-h + CDH

?

86-page copybook

Custo...
Three Quick Takeaways
1. Bring Suits & Hoodies together early
in the process
Build cross-organizational teams
Understand m...
Bridging the Gap Between Big Iron and Big Data

+
A Smarter Approach to BIG
Mainframe Data!

Syncsort DMX-h ETL Edition

C...
Test Drive DMX-h:
Bridge the Gap Between
Big Iron & Big Data!
• Self-contained image
• Use case accelerators for
• mainfra...
How to Leverage Mainframe Data with Hadoop: Bridging the Gap Between Big Iron & Big Data
Upcoming SlideShare
Loading in...5
×

How to Leverage Mainframe Data with Hadoop: Bridging the Gap Between Big Iron & Big Data

1,727

Published on

In this presentation from Syncsort and Cloudera, you'll learn how to bridge the technical, skill and cost gaps between mainframe and Hadoop. We discuss the top challenges of ingesting and processing mainframe data in Hadoop – and how to solve them.

Published in: Technology
0 Comments
3 Likes
Statistics
Notes
  • Be the first to comment

No Downloads
Views
Total Views
1,727
On Slideshare
0
From Embeds
0
Number of Embeds
0
Actions
Shares
0
Downloads
1
Comments
0
Likes
3
Embeds 0
No embeds

No notes for slide

How to Leverage Mainframe Data with Hadoop: Bridging the Gap Between Big Iron & Big Data

  1. 1. Mainframe + Hadoop Bridging the Gap Between Big Iron and Big Data Matt Brandwein, Director, Product Marketing, Cloudera Jorge A. Lopez, Director, Product Marketing, Syncsort ( + )
  2. 2. Mainframes | A Critical Source of Big Data 71% 30 Billion Fortune 500 Top 25 World Banks Bus. Transactions / day 9 of World’s Top Insurers 23 of Top 25 US Retailers 2
  3. 3. How Mainframes Work Copy Application Analysis - TCB Time & Count 70,000 60.0 60,000 50.0 50,000 40.0 40,000 30.0 30,000 20.0 20,000 10.0 10,000 0 0.0 Time of Day $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ • Every month you pay based on CPU Expensive utilization, commonly measured in MIPS Processing • Any MIPS reduction = Instant OpEx savings Number of COPY Applications per Interval TCB Time (mins) • Storage can reach up to $100K/TB Expensive • Tape technology is often present Storage • Uses compressed data types (EBCDIC, packed decimal) $15.7 Million Typical MIPS costs for the “average” $10B organization 3
  4. 4. The Hadoop Opportunity Produce POWERFUL INSIGHTS by combining behavioral data with granular mainframe data Increase BUSINESS AGILITY & REDUCE COSTS by offloading mainframe data & batch processing to HADOOP Image: quirkyjoe.com 4
  5. 5. Perception: Just Call the Mainframe Guy… Images: http://monkeestv.tripod.com/BatMonkee/ 5
  6. 6. Reality: Far Away… So Close! Reality SMS Compression SMS Compression SMS Compression DB Tables, Flat Files DB Tables, Flat Files DB Tables, Flat Files Filtering , Reformatting Copy, Sort, Join, Aggregation EBCDIC to ASCII Call MF Guy Filtering , Reformatting Copy, Sort, Join, Aggregation EBCDIC to ASCII Call MF Guy Filtering , Reformatting Copy, Sort, Join, Aggregation EBCDIC to ASCII Cobol copybooks Cobol copybooks Cobol copybooks Every Change = Time, Cost Image: bottletales.com 6
  7. 7. Suits & Hoodies – The Most Unlikely Duo Integration • Connectivity • Data conversion (EBCDIC vs ASCII) Gaps Expertise Gaps • COBOL appeared in 1959, Hadoop in 2005 • Mainframe & Hadoop skills shortage Security Gaps • Hosts mission critical sensitive data • Very difficult to install new software on the MF Costs Gaps Suits & Hoodies idea: Merv Adrian, Gartner Research. • Mainframe data is (expensive) Big Data • Even FTP costs CPU cycles (MIPS) 7
  8. 8. Bridging the Gap Between Big Iron and Big Data + A Smarter Approach to BIG Mainframe Data! Syncsort DMX-h ETL Edition Connect Translate Process CLOUDERA THE PLATFORM FOR BIG DATA CDH Cloudera Manager Brings batch & realtime compute to storage - Cloudera Navigator Works with all types of data - Cloudera Support Changes the economics of data management  Zero-MIPS Connectivity  Painless Integration & Translation  Mainframe-like Performance & Reliability  Massively Affordable Scalability  Support for Offloading Batch Cobol & JCL Processing to Hadoop  Iron-clad Security  Decades of Proven Mainframe Expertise  Easy Deployment, Monitoring & Admin 8
  9. 9. Cloudera The Standard for Apache Hadoop™ in the Enterprise CLOUDERA UNIVERSITY ADMINISTRATOR TRAINING DEVELOPER TRAINING Processing Interactive Interactive SQL Search Resource Management Storage ANALYST TRAINING Analytics DATA SCIENCE TRAINING Metadata and Security Batch CLOUDERA SUPPORT CERTIFICATION PROGRAMS PROFESSIONAL SERVICES CLOUDERA MANAGER USE CASE DISCOVERY NEW HADOOP DEPLOYMENT PROOF-OF-CONCEPT PRODUCTION PILOTS Integration 9 CLOUDERA NAVIGATOR PROCESS & TEAM DEVELOPMENT DEPLOYMENT CERTIFICATION
  10. 10. Why Cloudera?  Assurance – Remove Risk  Expertise – Maximize Value Meets Enterprise Requirements Interactive SQL Data Access We Deliver Customer Success with Hadoop in the Enterprise  Customers: Over 50% of the Fortune 50 and 65% of the Fortune 500 plus top US intelligence and defense agencies  Partners: 700+ in hardware, software, and services  Education: 15,000+ trained annually; developers, admins, analysts, data scientists  Community: Founders and top supporters of the Hadoop open source ecosystem working for you 10 Enterprise Capabilities  Influence – Build the Future ✔ Interactive Search ✔ SAS, R Integration ✔ Resource Management ✔ Security ✔ Highly Availability ✔ Disaster Recovery ✔ Audit and Lineage ✔ Online Upgrades ✔ Change Mgmt & Rollback ✔
  11. 11. Why Syncsort? For 40 years we have been helping companies solve their big data issues…even before they knew the name Big Data! Integrating Big Data… Smarter! Our customers are achieving the impossible, every day! • 50% of all mainframes run Syncsort • 1,500 Mainframe Customers: Most used & trusted 3rd party mainframe software • Speed leader for ETL & Sort • A history of innovation • 25+ Issued & Pending Patents • Large global customer base • 15,000+ deployments in 68 countries • First-to-market, fully integrated approach to Hadoop ETL Key Partners 11
  12. 12. Smart Contributions to Improve Hadoop Augmenting Critical Batch Processing Capabilities JIRA Description 2452 Allow External Sorter Plugin for MR 4808 Allow Reduce-side merge to be pluggable 4809 Make classes required for 2454 public 4812 Create reduce input merger plug-in 4842 Shuffle race can hang reducer …and more!! Plugin Shipping on CDH 4.2 and later 12
  13. 13. The Smarter Approach to Hadoop ETL… and Mainframe  Connect – One tool to connect all your data  Translate - Best in class mainframe data access with seamless data translation & COBOL Copybooks support  Process – Hadoop ETL without coding. Develop, test & debug locally in Windows; deploy on Hadoop PLUS… Connect Translate Process  Enterprise-grade security  Smarter deployment, monitoring & administration  Disruptive cost-structure  Decades of Mainframe expertise 13
  14. 14. Integration - Why is Working with Mainframe Data So Hard? File Definitions (Metadata) in Cobol Copybooks EBCDIC A B Y Z A 6 bytes Non-Viewable, Saves space B Y Z x’41’ x’42’ x’59’ x’5A’ x’C1’ x’C2’ x’E8’ x’E9’ Packed Decimal x’12843154976C’ ASCII Conversion Conversion Numeric 12,843,154,976 14 bytes Viewable, WYSIWYG 14
  15. 15. Cloudera + Syncsort: Smarter Connectivity… Also for Mainframe Because Mainframe Is Big Data Too! Connect • Read files directly from mainframe • No software required on mainframe • Already installed on 50% of mainframes Translate • Parse & transform: packed decimal, EBCDIC/ASCII, multi-format • No coding required Load & • Load directly to HDFS • Offload batch data processing Process • Find more insights 15
  16. 16. Iron-clad Security. Zero Pain. Zero Mainframe Footprint Mainframe Provides Iron-clad Security… So, Why compromise? Other solutions Cloudera + Syncsort Requires you to install unproven, untested software on Your Mainframe Adopt their own security model and constantly sync with your own LDAP • Nothing to install on mainframe • Painless support for Kerberos & LDAP • User-level security using authentication protocol • Secure data loads & extracts • Secure job execution 16
  17. 17. The Economics of Data Cost of managing 1TB of data $20,000 – $100,000 $15,000 – $80,000 $250 – $2,000 Mainframe EDW Hadoop But there’s more… Scalability Performance Reliability Agility Skills Supply 17
  18. 18. Smarter Deployment, Monitoring & Administration… …through Cloudera Manager 1 Monitor 2 Diagnose 3 Integrate 4 Manage Easily deploy, configure & optimize clusters Maintain a central view of all activity Easily identify and resolve issues Cloudera Cluster Unleash Hadoop’s Potential Use Cloudera Manager with existing tools + 18
  19. 19. Get a 360° View of Your Cluster, Including DMX-h Logs View service health & performance Get host-level snapshots Monitor & diagnose workloads Gather, view & search Hadoop & DMX-h logs …And more!! + 19
  20. 20. Understanding Mainframe Data at Major US Bank Before: Manual Effort Weeks After: DMX-h + CDH ? 86-page copybook Customer hit a wall after months of manual effort migrating Mainframe data • Difficult to find data errors. No Mainframe application logic that matches Copybook • Large and complex Copybooks • Depends on Mainframe team to provide data • Very manual-intensive ; inadequate documentation • Not scalable. Only a few Java + Mainframe experts could do the work 4 hrs 86-page copybook ( + ) • Easy to validate Copybooks and find data errors • Ability to pull data directly from Mainframe without relying on Mainframe team • No coding. No scripting. Easier to document, maintain & reuse • Enables developers with a broader set of skills to build complex migration jobs. 20
  21. 21. Three Quick Takeaways 1. Bring Suits & Hoodies together early in the process Build cross-organizational teams Understand mutual concerns Identify critical data & applications 2. Clearly define business & IT objectives Reduce costs Uncover new insights 3. Create a roadmap that gradually builds the skills of your organization Copy  Migrate  Offload Image Source: http://archrecord.construction.com/news/2012/03/A-Stitch-in-the-Urban-Fabric.asp 21
  22. 22. Bridging the Gap Between Big Iron and Big Data + A Smarter Approach to BIG Mainframe Data! Syncsort DMX-h ETL Edition Connect Translate Process CLOUDERA THE PLATFORM FOR BIG DATA CDH Cloudera Manager Brings batch & realtime compute to storage - Cloudera Navigator Works with all types of data - Cloudera Support Changes the economics of data management  Zero-MIPS Connectivity  Painless Integration & Translation  Mainframe-like Performance & Reliability  Massively Affordable Scalability  Support for Offloading Batch Cobol & JCL Processing to Hadoop  Iron-clad Security  Decades of Proven Mainframe Expertise  Easy Deployment, Monitoring & Admin 22
  23. 23. Test Drive DMX-h: Bridge the Gap Between Big Iron & Big Data! • Self-contained image • Use case accelerators for • mainframe, Hadoop and more! ( + Try it FREE at: syncsort.com/try Stop by booth #304 Get a demo | Get a shirt )

×