© 2014 Impetus Technologies
July 25, 2014
Accelerating the Big Data Solution Lifecycle and Improving ROI
Agenda
• Big Data Analytics: Implementation patterns
• Challenges faced
• Jumbune: an open source lifecycle accelerator
• Enterprise solution lifecycle
• Ways to address the challenges
Recorded version available at http://bit.ly/1nMw8nQ
Big Data Analytics
• Primary drive for performing analytics
• Rise of the enterprise data lake
• Utilization of analytical resources
Primary Purposes of an Analytical Solution
• Optimize the business
• Reduce time taken by analytics
• Result in effective analytics
• Compete and win
Rise of the Enterprise Data Lake
BIG DATA
• Sources of Data: ETL from every source (RDBMS, flat files, queues, legacy offloading, logs)
• Arrival of Data: intermittent, bulk, incremental
• Theme: “Leave no Data unused”
Utilization of Analytics Resources
• Capitalize on all analytics resources (engines) available
• Access data with a variety of processing engines: Storm, Spark, YARN, etc.
• Model in data science analytical systems: R, Octave, SAS, etc.
• Write complex logic in custom MapReduce
• Reuse code as User Defined Functions (UDFs)
• Create ad hoc queries using Hive and Pig
• Customize Mahout algorithms and machine learning libraries
Enterprise Big Data Solution Trends
• No more single purpose Hadoop clusters
• Enterprise Data Lake: Data flowing from many sources
• Integrated platforms using a variety of analytical engines
• Serving multiple business applications
• Resource sharing is a must across applications and engines
Enterprise Solution Lifecycle (High level view)
• Business Requirement
• Designing / Modelling
• Development and Testing
• Production and Monitoring
Enterprise Solution Lifecycle (Ground level view)
• Business User
• Data Analyst
• Development
• Quality Test
• DevOps
• Data Lake
• Production and Monitoring
Challenges in Enterprise Analytical Solutions
• No common platform to detect root causes
• Incremental imports may ingest bad data
• Cluster resources are shared and optimal utilization is the key
• Implementing models in custom MR without errors is like hitting the bull’s eye
• Bad logic or bad data
Scenario: Digitization of Newspaper for Analyzing News
Team: 5 Dev, 3 QA, 2 DevOps
Simple Problem: ‘q’ was misread by OCR as ‘9’
TIME
• A single code fault on terabytes of data can consume 24 work hours total for 2 Developers + 1 QA
COST
• Additional hours by engineers + the cost of unproductive cloud instances, storage and resources
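The OCR fault in this scenario (a ‘q’ read as the digit 9) is the kind of defect a cheap data check can surface before a full cluster run. The following is a minimal sketch in Python, assuming plain-text OCR output with one text fragment per line. The function names and the heuristic itself (a token that mixes letters and digits is suspicious) are illustrative assumptions, not part of Jumbune or of the project described on the slide.

```python
import re

# Tokens are runs of ASCII letters and digits; punctuation separates them.
TOKEN = re.compile(r"[A-Za-z0-9]+")


def is_suspect(token: str) -> bool:
    """Hypothetical rule: a token mixing letters and digits (e.g. "9uick")
    is a likely OCR misread. Pure numbers such as "2014" are left alone."""
    has_alpha = any(c.isalpha() for c in token)
    has_digit = any(c.isdigit() for c in token)
    return has_alpha and has_digit


def find_suspect_tokens(lines):
    """Return (line_number, token) pairs for tokens that look misread."""
    hits = []
    for line_number, line in enumerate(lines, start=1):
        for token in TOKEN.findall(line):
            if is_suspect(token):
                hits.append((line_number, token))
    return hits


if __name__ == "__main__":
    sample = [
        "The quick brown fox",
        "The 9uick brown fox jumped in 2014",
        "An e9ual share",
    ]
    for line_number, token in find_suspect_tokens(sample):
        print(f"line {line_number}: suspect token {token!r}")
```

The heuristic will also flag legitimate mixed tokens such as "3D", so in practice it would feed a review report rather than reject records outright.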
Scenario: Hive Queries Interpreted as MapReduce Executions on a Hadoop Cluster
Team: 2 Dev, 1 QA, 1 DevOps
Simple Problem: Data imbalance across the cluster, low performance of Hive queries
TIME
• The development team was refactoring Hive queries to improve performance
COST
• Additional hours by engineers + the cost of unproductive cloud instances, storage and resources
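Data imbalance of this kind can often be spotted by profiling the grouping or join key before touching the queries. In a MapReduce shuffle, all records with the same key go to the same reducer, so one dominant key concentrates the work on a single task. The sketch below is a minimal illustration in Python, assuming the key column has already been extracted into a list; `skew_ratio` and `heaviest_keys` are hypothetical helper names, not Hive or Jumbune functions.

```python
from collections import Counter


def skew_ratio(keys):
    """Ratio of the largest key group to the mean group size.

    1.0 means perfectly balanced; larger values mean one key dominates.
    Returns 0.0 for empty input.
    """
    counts = Counter(keys)
    if not counts:
        return 0.0
    mean = sum(counts.values()) / len(counts)
    return max(counts.values()) / mean


def heaviest_keys(keys, top=3):
    """Most frequent keys with their record counts."""
    return Counter(keys).most_common(top)


if __name__ == "__main__":
    # Hypothetical sample of a GROUP BY / JOIN key column.
    sample = ["us"] * 8 + ["uk", "in", "de", "fr"]
    print(f"skew ratio: {skew_ratio(sample):.2f}")
    print("heaviest keys:", heaviest_keys(sample, top=2))
```

A high ratio suggests the slowdown comes from the data distribution rather than from the query text, which is the distinction the team in this scenario was missing.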
Impact on ROI
• Delayed Analytics: defeats one of the prime purposes of analytics
• Increase in Costs: defeats the purpose of business cost optimization
• Productivity Loss: iterations reduce the productivity of dependent teams in the cycle
Current Iterative Development Approach
Local Debug / Unit Tests, HDFS Data Check, Performance
• Localized subset of data
• Non-parallel execution
• Practically infeasible
• Error-prone
• Difficult to find bad code
• Difficult to collaborate across environments
A Complete Enterprise Platform
• Data Lake
• Enterprise Engines
• Solutions
• Governance
• Security
• Validate, Profile, Debug and Monitor
Introducing Jumbune: An Open Source Solution
“A catalyst to accelerate realization of Big Data Analytics solutions”
• Flow Analyzer
• Data Validation
• Cluster Monitor
• Job Profiler
Full Lifecycle Support - Jumbune
• Development
• Quality
• DevOps
• Data Ingestion
Jumbune - Key Features
• In-depth code-level analysis of cluster-wide flow
• Record- and field-level data violation reports
• No deployment on worker nodes: ultra-light agent installation on the gateway node
• Ability to turn cluster monitoring on or off at will, which reduces resource load
• Customizable rack-aware monitoring
• Correlated profiling analysis of phases, throughput and resource consumption
• Ability to work with all Hadoop distributions
• Upcoming support for YARN, Spark and Mesos
• Available as Open Source
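To make the idea of a record- and field-level violation report concrete, here is a minimal sketch in Python. It assumes comma-delimited records and a small table of per-field regular expressions; the rule table, the function name `validate` and the three reason labels are all hypothetical and chosen for illustration. This is not Jumbune's API or its actual report format.

```python
import re
from collections import Counter

# Hypothetical rules: field index -> (field name, regex the value must fully match).
RULES = {
    0: ("id", re.compile(r"\d+")),
    1: ("email", re.compile(r"[^@\s]+@[^@\s]+")),
    2: ("amount", re.compile(r"-?\d+(\.\d+)?")),
}


def validate(lines, delimiter=","):
    """Check each record against RULES.

    Returns (violations, per_field_counts), where violations is a list of
    (record_number, field_name, reason) tuples and reason is one of
    "missing", "null" or "format".
    """
    violations = []
    for record_number, line in enumerate(lines, start=1):
        fields = line.rstrip("\n").split(delimiter)
        for index, (name, pattern) in RULES.items():
            if index >= len(fields):
                violations.append((record_number, name, "missing"))
            elif fields[index].strip() == "":
                violations.append((record_number, name, "null"))
            elif not pattern.fullmatch(fields[index].strip()):
                violations.append((record_number, name, "format"))
    per_field = Counter(name for _, name, _ in violations)
    return violations, per_field


if __name__ == "__main__":
    sample = ["1,a@example.com,10.5", "x,,7", "3,b@example.com"]
    violations, per_field = validate(sample)
    for record_number, name, reason in violations:
        print(f"record {record_number}: field {name!r} {reason}")
    print(dict(per_field))
```

The per-record list points at the exact bad rows, while the per-field counts show which column is systematically wrong, which helps separate bad data from bad logic.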
For general inquiries about other Impetus solutions and services, reach us at bigdata@impetus.com
Thank You!
Website
• http://jumbune.org
Contribute
• http://github.com/impetus-opensource/jumbune
• http://jumbune.org/jira/JUM
Social
• Follow @jumbune, use #jumbune
• Jumbune Group: http://linkd.in/1mUmcYm
Forums
• Users: users-subscribe@collaborate.jumbune.org
• Dev: dev-subscribe@collaborate.jumbune.org
• Issues: issues-subscribe@collaborate.jumbune.org
Downloads
• http://jumbune.org
• https://bintray.com/jumbune/downloads/jumbune
 
UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3
 
Search and Society: Reimagining Information Access for Radical Futures
Search and Society: Reimagining Information Access for Radical FuturesSearch and Society: Reimagining Information Access for Radical Futures
Search and Society: Reimagining Information Access for Radical Futures
 
UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4
 
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
 
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
 
Bits & Pixels using AI for Good.........
Bits & Pixels using AI for Good.........Bits & Pixels using AI for Good.........
Bits & Pixels using AI for Good.........
 
GraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge GraphGraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge Graph
 
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
 
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdfFIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
 
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdfFIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
 
ODC, Data Fabric and Architecture User Group
ODC, Data Fabric and Architecture User GroupODC, Data Fabric and Architecture User Group
ODC, Data Fabric and Architecture User Group
 
IOS-PENTESTING-BEGINNERS-PRACTICAL-GUIDE-.pptx
IOS-PENTESTING-BEGINNERS-PRACTICAL-GUIDE-.pptxIOS-PENTESTING-BEGINNERS-PRACTICAL-GUIDE-.pptx
IOS-PENTESTING-BEGINNERS-PRACTICAL-GUIDE-.pptx
 
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
 
The Future of Platform Engineering
The Future of Platform EngineeringThe Future of Platform Engineering
The Future of Platform Engineering
 
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
 
How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...
 
Assuring Contact Center Experiences for Your Customers With ThousandEyes
Assuring Contact Center Experiences for Your Customers With ThousandEyesAssuring Contact Center Experiences for Your Customers With ThousandEyes
Assuring Contact Center Experiences for Your Customers With ThousandEyes
 
State of ICS and IoT Cyber Threat Landscape Report 2024 preview
State of ICS and IoT Cyber Threat Landscape Report 2024 previewState of ICS and IoT Cyber Threat Landscape Report 2024 preview
State of ICS and IoT Cyber Threat Landscape Report 2024 preview
 

Accelerating Hadoop Solution Lifecycle and Improving ROI - Impetus On-demand Webcast

  • 1. Accelerating the Big Data Solution Lifecycle and Improving ROI. July 25, 2014. © 2014 Impetus Technologies
  • 2. Agenda. Big Data Analytics: implementation patterns; challenges faced; Jumbune – an open source lifecycle accelerator; enterprise solution lifecycle; ways to address the challenges. Recorded version available at http://bit.ly/1nMw8nQ
  • 3. Big Data Analytics. Primary drive for performing analytics; rise of the enterprise data lake; utilization of analytical resources.
  • 4. Primary Purposes of an Analytical Solution. Optimize the business; reduce time taken by analytics; result in effective analytics; compete and win.
  • 5. Rise of the Enterprise Data Lake (diagram label: Big Data). Sources of data: ETL from every source (RDBMS, flat files, queues, legacy offloading, logs). Arrival of data: intermittent, bulk, incremental. Theme: “Leave no data unused”.
  • 6. Utilization of Analytics Resources. Capitalize on all analytics resources (engines) available; access data with a variety of processing engines (Storm, Spark, YARN, etc.); model in data science analytical systems (R, Octave, SAS, etc.); write complex logic in custom MapReduce; reuse code as User Defined Functions (UDFs); create ad hoc queries using Hive and Pig; customize Mahout algorithms and machine learning libraries.
  • 7. Enterprise Big Data Solution Trends. No more single-purpose Hadoop clusters; Enterprise Data Lake: data flowing from many sources; integrated platforms using a variety of analytical engines; serving multiple business applications; resource sharing is a must across applications and engines.
  • 8. Enterprise Solution Lifecycle (High level view). Business requirement; designing/modelling; development and testing; production and monitoring.
  • 9. Enterprise Solution Lifecycle (Ground level view). Business User; Data Analyst; Development; Quality Test; DevOps; Data Lake; Production and Monitoring.
  • 10. Challenges in Enterprise Analytical Solutions. No common platform to detect root causes; incremental imports may ingest bad data; cluster resources are shared and optimal utilization is the key; implementing models in custom MapReduce without errors is like hitting the bull’s eye; bad logic or bad data.
  • 11. Scenario: Digitization of Newspaper for Analyzing News. Team: 5 Dev, 3 QA, 2 DevOps. Simple problem: ‘q’ was misread by OCR as ‘9’. Time: a single code fault on terabytes (TB) of data can consume 24 work hours in total for 2 developers and 1 QA. Cost: additional hours by engineers, plus the cost of unproductive cloud instances, storage and resources.
  • 12. Scenario: Hive Queries Interpreted as MapReduce Executions on a Hadoop Cluster. Team: 2 Dev, 1 QA, 1 DevOps. Simple problem: data imbalance across the cluster, causing low Hive query performance. Time: the development team was refactoring Hive queries to improve performance. Cost: additional hours by engineers, plus the cost of unproductive cloud instances, storage and resources.
  • 13. Impact on ROI. Delayed analytics: defeats one of the prime purposes of analytics. Increase in costs: defeats the purpose of business cost optimization. Productivity loss: iterations reduce the productivity of dependent teams in the cycle.
  • 14. Current Iterative Development Approach. Steps: local debug/unit tests; HDFS data check; performance. Drawbacks: localized subset of data; non-parallel execution; practically unfeasible; error prone; difficult to find bad code; difficult to collaborate across environments.
  • 15. A Complete Enterprise Platform. Data Lake; Enterprise Engines; Solutions; Governance; Security; Validate, Profile, Debug and Monitor.
  • 16. Introducing Jumbune: An Open Source Solution. “A catalyst to accelerate realization of Big Data Analytics solutions”. Flow Analyzer; Data Validation; Cluster Monitor; Job Profiler.
  • 21. Full Lifecycle Support - Jumbune. Development; Quality; DevOps; Data Ingestion.
  • 22. Jumbune - Key Features. In-depth code-level analysis of cluster-wide flow; record and field level data violation reports; no deployment on worker nodes (ultra-light agent installation on the gateway node); ability to turn cluster monitoring on/off at will, which reduces resource load; customizable rack-aware monitoring; correlated profiling analysis of phases, throughput and resource consumption; ability to work with all Hadoop distributions; upcoming support for YARN, Spark and Mesos; available as open source.
  • 23. For general inquiries about other Impetus solutions and services, reach us at bigdata@impetus.com
  • 24. Thank You! Website: http://jumbune.org | Contribute: http://github.com/impetus-opensource/jumbune and http://jumbune.org/jira/JUM | Social: follow @jumbune, use #jumbune; Jumbune Group: http://linkd.in/1mUmcYm | Forums: Users: users-subscribe@collaborate.jumbune.org; Dev: dev-subscribe@collaborate.jumbune.org; Issues: issues-subscribe@collaborate.jumbune.org | Downloads: http://jumbune.org and https://bintray.com/jumbune/downloads/jumbune
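
The newspaper scenario on slide 11 (‘q’ misread by OCR as 9) is the kind of defect that a record and field level check can surface before a long job runs. The sketch below is only an illustration of that idea and is not Jumbune's implementation: the rule (flag any token that mixes letters and digits), the function names, and the sample records are all assumptions made for this example, and a rule this crude would also flag legitimate tokens such as "25th".

```python
import string


def is_suspect(token):
    """Hypothetical rule: a token mixing letters and digits may be an OCR error."""
    has_digit = any(ch.isdigit() for ch in token)
    has_alpha = any(ch.isalpha() for ch in token)
    return has_digit and has_alpha


def find_violations(records):
    """Return (record_number, field_name, token) for every suspect token.

    `records` is a list of dicts mapping field names to text values.
    Record numbers start at 1 so they read like line numbers in a report.
    """
    violations = []
    for record_number, record in enumerate(records, start=1):
        for field_name, value in record.items():
            for raw in value.split():
                # Drop surrounding punctuation so "1999." is treated as "1999".
                token = raw.strip(string.punctuation)
                if is_suspect(token):
                    violations.append((record_number, field_name, token))
    return violations


if __name__ == "__main__":
    # Made-up sample data, not taken from the webcast.
    sample = [
        {"headline": "The 9uick brown fox", "body": "Printed in 1999."},
        {"headline": "Quiet markets", "body": "An e9ual split"},
    ]
    for violation in find_violations(sample):
        print(violation)
```

Running a check like this on a small sample of ingested data is far cheaper than discovering the same fault after a job has processed the full data set.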
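
For the Hive scenario on slide 12, one simple way to quantify "data imbalance across the cluster" is to compare the busiest node with the average. The sketch below assumes per-node record counts are already available as a plain dictionary; the function name, the node names, and the choice of max-to-mean ratio as the metric are assumptions for illustration, not something the slides specify.

```python
def imbalance_ratio(records_per_node):
    """Return max load divided by mean load across nodes.

    A value of 1.0 means perfectly even distribution; larger values mean
    one node holds disproportionately more data than the average node.
    """
    if not records_per_node:
        raise ValueError("need at least one node")
    counts = list(records_per_node.values())
    mean = sum(counts) / len(counts)
    if mean == 0:
        # No data anywhere: treat as balanced rather than dividing by zero.
        return 1.0
    return max(counts) / mean


if __name__ == "__main__":
    # Made-up counts for a hypothetical three-node cluster.
    skewed = {"node1": 800, "node2": 100, "node3": 100}
    even = {"node1": 100, "node2": 100, "node3": 100}
    print(imbalance_ratio(skewed))
    print(imbalance_ratio(even))
```

A ratio well above 1.0 suggests looking at data distribution before spending engineer hours refactoring queries.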