Breathe New Life Into Your Data Warehouse by Offloading ETL on Hadoop
Shahab Kamal
Supreet Oberoi
ABOUT CONCURRENT
TRUSTED by over 10,000 companies as their big data app platform
BACKED by top Silicon Valley investors True Ventures, Rembrandt VP, Bain Capital
FOUNDED in 2008, with headquarters in San Francisco
ABOUT BITWISE
• Founded in 1995
• HQ in Chicago, IL
• Offices in India & Australia
• ISO 9001:2008 & ISO 27001:2005 Certified
Most experienced data professionals
Proprietary frameworks and accelerators for guaranteed, efficient and cost-effective services for data projects
ENTERPRISE ENGAGEMENT WITH HADOOP IS GAINING DEPTH…
Improving brand experience, creating new revenue channels, enhancing operational visibility into risk & compliance, and reducing TCO have been the key drivers, with engagement at the CEO, CIO, and CDO level
EMERGING ENTERPRISE ARCHITECTURE FOR HADOOP
[Diagram labels: Reporting; Data Mining; Analytics; Exploratory; Discovery; Search; Data Mart; Stage; Transform; Archive; Data Lake]
CASE STUDY
[Diagram labels: Data Source; RDBMS; ETL Application; Recovery Application; Analytics; Reporting; Automated ETL Conversion; ETL Conversion; ETL Testing; Data Quality Monitoring; QualiDI; Data Quality Framework; Developer UI; XML; Custom Code; Execution Service; Cascading Framework; Generate Cascading Flow; Launch MapReduce Jobs on Execution]
BITWISE ETL TOOL ARCHITECTURE
[Diagram labels: Development Environment; Developer UI; XML; Custom Code; Execution Service; Cascading Framework]
DRIVEN PROVIDES OPERATIONAL READINESS TO ETL WORKLOADS
PERFORMANCE MANAGEMENT FOR BIG DATA APPLICATIONS
BUILD higher quality big data apps
RUN big data apps more reliably
MANAGE big data apps more effectively
BUILD HIGHER QUALITY BIG DATA APPS
Fully visualize your entire data pipeline
Quickly and easily identify execution errors
[Diagram labels: Sources; Operations (functions, filters, joins, and aggregators); Results]
RUN BIG DATA APPS MORE RELIABLY
Watch your apps execute in real time
Easily detect apps that violate SLAs and policies
Pinpoint bottlenecks and identify causes
[Screenshot labels: Currently Executing; Executing / Waiting; Detailed Mapper/Reducer Stats]
For example, see metrics for all apps on the production cluster that failed to execute in under 5 minutes, or all applications that use more than their allotment of mappers
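The two example queries on this slide can be sketched as plain filters over per-run metrics. This is only an illustration on made-up records: the `AppRun` fields, the sample data, and the function names are all assumptions, and none of it reflects Driven's actual data model or query interface.

```python
from dataclasses import dataclass


@dataclass
class AppRun:
    # Hypothetical per-run record; field names are illustrative only.
    name: str
    cluster: str
    duration_s: float
    mappers_used: int
    mapper_allotment: int


def missed_time_limit(runs, cluster, limit_s):
    """Runs on `cluster` that did not finish in under `limit_s` seconds."""
    return [r for r in runs if r.cluster == cluster and r.duration_s >= limit_s]


def over_mapper_allotment(runs):
    """Runs that used more mappers than they were allotted."""
    return [r for r in runs if r.mappers_used > r.mapper_allotment]


# Made-up sample data.
RUNS = [
    AppRun("etl-orders", "production", 420.0, 40, 50),
    AppRun("etl-clicks", "production", 180.0, 80, 50),
    AppRun("etl-test", "qa", 900.0, 10, 50),
]

slow = missed_time_limit(RUNS, "production", 5 * 60)
greedy = over_mapper_allotment(RUNS)
```

Here `slow` holds the production runs that took five minutes or longer, and `greedy` holds the runs that exceeded their mapper allotment on any cluster.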
MANAGE BIG DATA APPS MORE EFFECTIVELY
See how all apps consume resources as they run
Compare performance, resource consumption, and other metrics across departments, teams and any segment you define
Segment performance by team, by department or custom tags for role-based views, chargeback models, and capacity planning
For example, see performance of all apps owned by the DevOps team
[Screenshot labels: Marketing; Sales; Compliance; Data science team; QA cluster; Production cluster]
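Segmenting by team or tag for chargeback amounts to grouping run metrics by a chosen key and summing a resource measure. The sketch below assumes hypothetical run records with a `cpu_hours` field. The field names and numbers are invented for illustration and are not taken from Driven.

```python
from collections import defaultdict


def usage_by_segment(runs, key, metric="cpu_hours"):
    """Sum `metric` over runs, grouped by the value of `key` (e.g. team or cluster)."""
    totals = defaultdict(float)
    for run in runs:
        totals[run[key]] += run[metric]
    return dict(totals)


# Made-up sample data.
SAMPLE_RUNS = [
    {"app": "churn-model", "team": "Data science", "cluster": "Production", "cpu_hours": 12.5},
    {"app": "feature-prep", "team": "Data science", "cluster": "QA", "cpu_hours": 3.5},
    {"app": "campaign-etl", "team": "Marketing", "cluster": "Production", "cpu_hours": 4.0},
]

by_team = usage_by_segment(SAMPLE_RUNS, "team")
by_cluster = usage_by_segment(SAMPLE_RUNS, "cluster")
```

The same function serves both views: pass `"team"` for a chargeback-style breakdown or `"cluster"` for capacity planning.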
MANAGE BIG DATA APPS FOR COMPLIANCE
Visualize Lineage – See exactly how each app ingests, manipulates and outputs data
Further inspect lineage by detecting apps that write to, or read from, a given dataset
[Diagram labels: Sources; Operations (functions, filters, joins, and aggregators); Results]
For example, show all apps that interact with the dataset in “rain.txt”
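A lineage lookup of this kind reduces to checking each app's declared sources and sinks for the dataset of interest. The sketch below uses invented app records with `reads` and `writes` lists. These names and the sample apps are assumptions for illustration, not Driven's metadata schema.

```python
def apps_touching(apps, dataset):
    """Map each app name to how it interacts with `dataset`: 'read', 'write', or both."""
    result = {}
    for app in apps:
        modes = []
        if dataset in app["reads"]:
            modes.append("read")
        if dataset in app["writes"]:
            modes.append("write")
        if modes:
            result[app["name"]] = modes
    return result


# Made-up sample data; only "rain.txt" comes from the slide.
APPS = [
    {"name": "ingest-weather", "reads": ["rain.txt"], "writes": ["weather.parquet"]},
    {"name": "daily-report", "reads": ["weather.parquet"], "writes": ["report.csv"]},
    {"name": "clean-rain", "reads": ["rain.txt"], "writes": ["rain.txt"]},
]

touching_rain = apps_touching(APPS, "rain.txt")
```

Apps that neither read nor write the dataset are left out of the result, so the returned mapping is exactly the set of apps that interact with it.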
MANAGE BIG DATA APPS FOR COLLABORATION
Create Jira issues with views and data to collaborate quickly on resolving performance problems
Integrate alerts with popular notification platforms like HipChat, PagerDuty, & Nagios
With one click, create a Jira issue with a link to this view
Automatically send app status notifications via webhooks or JMX
NURTURE A CULTURE OF OPERATIONAL EXCELLENCE
“The coolest part about Driven is being able to visualize data pipelines and inspect components in real time for easy troubleshooting and optimization. I don't know of any other tool that's close in functionality.”
- Neville Li, Software Engineer, Spotify
“Driven has given us a way to monitor the performance of our data-driven applications in a manner which is visually intuitive to both engineering and business users.”
- Joao Vicente, Performance Architect, Dun & Bradstreet
…THROUGH A SCALABLE, SEARCHABLE METADATA STORE
End-to-end operational telemetry metadata for big data applications
Accessible via Web browser, command-line interface (CLI), or simple search queries
Easy integrations through JMX and upcoming Driven SDK
[Diagram labels: Hadoop clusters; Hadoop apps and infrastructure; YARN; Applications; Plugin; Telemetry metadata (SSL); WAR files; Web App Server; Server; Web; CLI; JMX]
THANK YOU

 
Global Identity Enrolment and Verification Pro Solution - Cizo Technology Ser...
Global Identity Enrolment and Verification Pro Solution - Cizo Technology Ser...Global Identity Enrolment and Verification Pro Solution - Cizo Technology Ser...
Global Identity Enrolment and Verification Pro Solution - Cizo Technology Ser...
 
Recruitment Management Software Benefits (Infographic)
Recruitment Management Software Benefits (Infographic)Recruitment Management Software Benefits (Infographic)
Recruitment Management Software Benefits (Infographic)
 
Intelligent Home Wi-Fi Solutions | ThinkPalm
Intelligent Home Wi-Fi Solutions | ThinkPalmIntelligent Home Wi-Fi Solutions | ThinkPalm
Intelligent Home Wi-Fi Solutions | ThinkPalm
 
Comparing Linux OS Image Update Models - EOSS 2024.pdf
Comparing Linux OS Image Update Models - EOSS 2024.pdfComparing Linux OS Image Update Models - EOSS 2024.pdf
Comparing Linux OS Image Update Models - EOSS 2024.pdf
 
英国UN学位证,北安普顿大学毕业证书1:1制作
英国UN学位证,北安普顿大学毕业证书1:1制作英国UN学位证,北安普顿大学毕业证书1:1制作
英国UN学位证,北安普顿大学毕业证书1:1制作
 
SensoDat: Simulation-based Sensor Dataset of Self-driving Cars
SensoDat: Simulation-based Sensor Dataset of Self-driving CarsSensoDat: Simulation-based Sensor Dataset of Self-driving Cars
SensoDat: Simulation-based Sensor Dataset of Self-driving Cars
 
Dealing with Cultural Dispersion — Stefano Lambiase — ICSE-SEIS 2024
Dealing with Cultural Dispersion — Stefano Lambiase — ICSE-SEIS 2024Dealing with Cultural Dispersion — Stefano Lambiase — ICSE-SEIS 2024
Dealing with Cultural Dispersion — Stefano Lambiase — ICSE-SEIS 2024
 
GOING AOT WITH GRAALVM – DEVOXX GREECE.pdf
GOING AOT WITH GRAALVM – DEVOXX GREECE.pdfGOING AOT WITH GRAALVM – DEVOXX GREECE.pdf
GOING AOT WITH GRAALVM – DEVOXX GREECE.pdf
 
Introduction Computer Science - Software Design.pdf
Introduction Computer Science - Software Design.pdfIntroduction Computer Science - Software Design.pdf
Introduction Computer Science - Software Design.pdf
 
How to submit a standout Adobe Champion Application
How to submit a standout Adobe Champion ApplicationHow to submit a standout Adobe Champion Application
How to submit a standout Adobe Champion Application
 
Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...
Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...
Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...
 
Unveiling the Future: Sylius 2.0 New Features
Unveiling the Future: Sylius 2.0 New FeaturesUnveiling the Future: Sylius 2.0 New Features
Unveiling the Future: Sylius 2.0 New Features
 
Precise and Complete Requirements? An Elusive Goal
Precise and Complete Requirements? An Elusive GoalPrecise and Complete Requirements? An Elusive Goal
Precise and Complete Requirements? An Elusive Goal
 
PREDICTING RIVER WATER QUALITY ppt presentation
PREDICTING  RIVER  WATER QUALITY  ppt presentationPREDICTING  RIVER  WATER QUALITY  ppt presentation
PREDICTING RIVER WATER QUALITY ppt presentation
 
Folding Cheat Sheet #4 - fourth in a series
Folding Cheat Sheet #4 - fourth in a seriesFolding Cheat Sheet #4 - fourth in a series
Folding Cheat Sheet #4 - fourth in a series
 
React Server Component in Next.js by Hanief Utama
React Server Component in Next.js by Hanief UtamaReact Server Component in Next.js by Hanief Utama
React Server Component in Next.js by Hanief Utama
 

Offload ETL on Hadoop with Driven

  • 2. ABOUT CONCURRENT: TRUSTED by over 10,000 companies as their big data app platform. BACKED by top Silicon Valley investors True Ventures, Rembrandt VP, and Bain Capital. FOUNDED in 2008, with headquarters in San Francisco.
  • 3. ABOUT BITWISE: Founded in 1995. HQ in Chicago, IL. Offices in India & Australia. ISO 9001:2008 & ISO 27001:2005 certified. Most experienced data professionals. Proprietary frameworks and accelerators for guaranteed, efficient and cost-effective services for data projects.
  • 4. ENTERPRISE ENGAGEMENT WITH HADOOP IS GAINING DEPTH… Improving brand experience, creating new revenue channels, enhancing operational visibility into risk & compliance, and reducing TCO have been the key drivers, with engagement at the level of the CEO, CIO, and CDO.
  • 5. EMERGING ENTERPRISE ARCHITECTURE FOR HADOOP. Diagram labels: Data Lake; STAGE, TRANSFORM, ARCHIVE; Data Mart; Reporting; Data Mining; Analytics; Exploratory; Discovery; Search.
  • 6. CASE STUDY. Diagram labels: Data Source; ETL Application; RDBMS; Recovery Application; Analytics; Reporting; Automated ETL Conversion; ETL Testing; Data Quality Monitoring; Developer UI; XML; Custom Code; Execution Service; Cascading Framework; Generate Cascading Flow; Launch MapReduce Jobs On Execution.
  • 8. DRIVEN PROVIDES OPERATIONAL READINESS TO ETL WORKLOADS. PERFORMANCE MANAGEMENT FOR BIG DATA APPLICATIONS: BUILD higher quality big data apps. RUN big data apps more reliably. MANAGE big data apps more effectively.
  • 9. BUILD HIGHER QUALITY BIG DATA APPS: Fully visualize your entire data pipeline. Quickly and easily identify execution errors. Diagram labels: SOURCES; OPERATIONS (functions, filters, joins, and aggregators); RESULTS.
  • 10. BUILD HIGHER QUALITY BIG DATA APPS: Fully visualize your entire data pipeline. Quickly and easily identify execution errors.
  • 11. RUN BIG DATA APPS MORE RELIABLY: Watch your apps execute in real time. Easily detect apps that violate SLAs and policies. Pinpoint bottlenecks and identify causes. Screen label: CURRENTLY EXECUTING.
  • 12. RUN BIG DATA APPS MORE RELIABLY: Watch your apps execute in real time. Easily detect apps that violate SLAs and policies. Pinpoint bottlenecks and identify causes. Screen labels: EXECUTING; WAITING; DETAILED MAPPER/REDUCER STATS.
  • 13. RUN BIG DATA APPS MORE RELIABLY: Watch your apps execute in real time. Easily detect apps that violate SLAs and policies. Pinpoint bottlenecks and identify causes. For example, see metrics for all apps on the production cluster that failed to execute in under 5 minutes, or all applications that use more than their allotment of mappers.
  • 14. MANAGE BIG DATA APPS MORE EFFECTIVELY: See how all apps consume resources as they run. Compare performance, resource consumption, and other metrics across departments, teams, and any segment you define.
  • 15. MANAGE BIG DATA APPS MORE EFFECTIVELY: See how all apps consume resources as they run. Segment performance by team, by department, or by custom tags for role-based views, chargeback models, and capacity planning. For example, see performance of all apps owned by the DevOps team. Segment labels: Marketing; Sales; Compliance; Data science team; QA cluster; Production cluster.
  • 16. MANAGE BIG DATA APPS FOR COMPLIANCE: Visualize lineage: see exactly how each app ingests, manipulates, and outputs data. Further inspect lineage by detecting apps that write to, or read from, a given dataset. Diagram labels: SOURCES; OPERATIONS (functions, filters, joins, and aggregators); RESULTS.
  • 17. MANAGE BIG DATA APPS FOR COMPLIANCE: Visualize lineage: see exactly how each app ingests, manipulates, and outputs data. Further inspect lineage by detecting apps that write to, or read from, a given dataset. For example, show all apps that interact with the dataset in “rain.txt”.
  • 18. MANAGE BIG DATA APPS FOR COLLABORATION: Create Jira issues with views and data for quickly collaborating to resolve performance problems. Integrate alerts with popular notification platforms like HipChat, PagerDuty, & Nagios. With one click, create a Jira issue with a link to this view.
  • 19. MANAGE BIG DATA APPS FOR COLLABORATION: Create Jira issues with views and data for quickly collaborating to resolve performance problems. Integrate alerts with popular notification platforms like HipChat, PagerDuty, & Nagios. Automatically send app status notifications via webhooks or JMX.
  • 20. NURTURE A CULTURE OF OPERATIONAL EXCELLENCE: “The coolest part about Driven is being able to visualize data pipelines and inspect components in real time for easy troubleshooting and optimization. I don't know of any other tool that's close in functionality.” (Neville Li, Software Engineer, Spotify) “Driven has given us a way to monitor the performance of our data-driven applications in a manner which is visually intuitive to both engineering and business users.” (Joao Vicente, Performance Architect, Dun & Bradstreet)
  • 21. …THROUGH A SCALABLE, SEARCHABLE METADATA STORE: End-to-end operational telemetry metadata for big data applications. Accessible via Web browser, command-line interface (CLI), or simple search queries. Easy integrations through JMX and upcoming Driven SDK. Diagram labels: HADOOP CLUSTERS; HADOOP APPS AND INFRASTRUCTURE; YARN; APPLICATIONS; Plugin; Telemetry metadata (SSL); WAR files; Web App Server; Server; Web; CLI; JMX.
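
The example queries on slides 13 and 17 (apps on the production cluster that failed to execute in under 5 minutes, apps that use more than their allotment of mappers, and apps that interact with “rain.txt”) can be pictured as simple filters over per-run metadata. The sketch below is only an illustration of that idea: the `AppRun` record, its field names, and the sample values are hypothetical and do not represent Driven's data model, query language, or API.

```python
from dataclasses import dataclass, field


@dataclass
class AppRun:
    """Hypothetical per-run metadata record (not Driven's actual schema)."""
    name: str
    cluster: str
    team: str
    duration_s: float
    mappers_used: int
    mapper_allotment: int
    reads: set = field(default_factory=set)
    writes: set = field(default_factory=set)


def slow_runs(runs, cluster, limit_s=300):
    """Runs on `cluster` that did not finish in under `limit_s` seconds."""
    return [r for r in runs if r.cluster == cluster and r.duration_s >= limit_s]


def over_allotment(runs):
    """Runs that used more mappers than they were allotted."""
    return [r for r in runs if r.mappers_used > r.mapper_allotment]


def touching(runs, dataset):
    """Runs that read from or write to `dataset` (a simple lineage lookup)."""
    return [r for r in runs if dataset in r.reads or dataset in r.writes]


# Made-up sample data, for illustration only.
RUNS = [
    AppRun("etl-nightly", "production", "DevOps", 420.0, 40, 50,
           reads={"rain.txt"}),
    AppRun("report-build", "production", "Marketing", 120.0, 80, 50,
           writes={"report.csv"}),
    AppRun("qa-smoke", "QA", "DevOps", 600.0, 10, 50,
           reads={"rain.txt"}, writes={"out.txt"}),
]

if __name__ == "__main__":
    print([r.name for r in slow_runs(RUNS, "production")])
    print([r.name for r in over_allotment(RUNS)])
    print([r.name for r in touching(RUNS, "rain.txt")])
```

The same filters could be grouped by the `team` or `cluster` field to approximate the segmented views described on slides 14 and 15.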

Editor's Notes

  1. CEO: Improved customer & brand experience; new product channels; enhanced operational visibility; enhanced enforcement of compliance & regulations. CIO: Developing a business case for Hadoop; developing ROI metrics; developing an inventory of data assets on Hadoop; promoting data reuse and ensuring integrity of data feeds; ensuring infrastructure governance. CDO: Ensuring continuity of data best practices on Hadoop; developing and enforcing regulatory protocols; promoting principles of data library. C-suite: Monetizing new know-how for service offerings; creating better customer experiences; increasing mind- and wallet-share.