SlideShare a Scribd company logo
1
Single Point of Truth
The Reality of the Enterprise Data Hub
Justin Sheppard
Ankur Gupta
Sears Holdings Corporation
2
• Not meeting production schedules
• Multiple copies of data, no single point of truth
• ETL complexity, cost of software and cost to manage
• Time to setup ETL data sources for projects
• Latency in data (up to weeks in some cases)
• Enterprise Data Warehouses unable to handle load
• Mainframe workload over consuming capacity
• IT Budgets not growing – BUT data volumes escalating
Where Did We Start?
What Is Hadoop?
3
Hadoop Distributed
File System (HDFS)
File Sharing & Data
Protection Across
Physical Servers
MapReduce
Fault Tolerant
Distributed
Computing Across
Physical Servers
Flexibility
o A single repository for
storing processing &
analyzing any type of data
(structured and complex)
o Not bound by a single
schema
Scalability
o Scale-out architecture divides
workloads across multiple
nodes
o Flexible file system eliminates
ETL bottlenecks
Low Cost
o Can be deployed on
commodity hardware
o Open source platform guards
against vendor lock
Hadoop is a platform for data storage
and processing that is…
o Scalable
o Fault tolerant
o Open source
4
Hadoop
IS
• Store vast amounts of data
• Run queries on huge data
sets
• Ask questions previously
impossible
• Archive data but still
analyze it
• Capture data streams at
incredible speeds
• Massively reduce data
latency
• Transform your thinking
about ETL
Is Not
• High-speed SQL database
• Simple
• Easily connected to legacy
systems
• A replacement for your
current data warehouse
• Going to be built or
operated by your DBA's
• Going to make any sense
to your data architects
• Going to be possible if do
not have Linux skills
5
Use The Right Tool For The Right Job
Databases: Hadoop:
When to use?
• Affordable Storage/Compute
• High-performance queries on large data
• Complex data
• Resilient Auto Scalability
When to use?
• Transactional, High Speed Analytics
• Interactive Reporting (<1sec)
• Multi-step Transactions
• Numerous Inserts/Updates/Deletes
Can be combined
Use The Right Tool For The Right Job
6
Hadoop
Database
Data Hub
7
• Underlying premise as Hadoop adoption continues – source data once, use many.
• Over time, as more and more data is sourced, development times will reduce since data
sourcing is significantly less than typical.
8
Some Examples
Use-cases at Sears Holdings
The First Usage in Production
Use Case
• Interactive presentation layer was required to present item/price/sales data in a highly flexible user
interface with rapid response time
• Needed to deliver solution within a very short period of time.
• Legacy architecture would have required a MicroStrategy solution utilizing 1,000’s of cubes on
many expensive servers
Approach
• Rapid development project initiated to present item/price/sales data in a highly flexible user
interface with rapid response time
• Built system from the ground up
• Migrated all required data to centralized HDFS repository from legacy databases
• Developed MapReduce code to process daily data files into 4 primary data tables
• Tables extracted to service layer (MySQL/Infobrite) for presentation through the Pricing Portal
Results
• File preparation completes in minutes each day and ensures portal data is ready very soon after
daily sales processing completes (100K records daily)
• This was the first production usage of MapReduce and associated technologies – the project
initiated in March and was live on May 9 (<10 weeks concept to realization)
Technologies Used
• Hadoop, Hive, MapReduce, MySql, Infobright, Linux, REST Web Service, Dotnetnuke
9
Learning experience for all parties, successfully demonstrated platform abilities in
production environment – but we would NOT do it this way again…
Mainframe Migration
10
Step 1
Source 1 Source 2
Step 2 Step 3 Step 4 Step 5
Source 3 Source 4
Output
As our experience with Hadoop increased, hypothesis were formed that the
technology could aid with SHC’s mainframe migration initiative.
Example above represents a simply mainframe process
Step 1
Source 1 Source 2
Step 2 Step 3 Step 4 Step 5
Source 3 Source 4
Output
Step 4 Step 5
X X
Migrated sections of mainframe processing, including
data transfer to Hadoop and back, eliminating MIPS
and IMPROVING overall cycle time
ETL Replacement
• A major ongoing system effort in our Marketing department
was heavily reliant on DataStage processing for ETL
– In the early stages of deployment the ETL platform performed within
acceptable limits
– As volume increased the system began to have performance issues as
the ETL platform degraded
– With full rollout imminent, the options were to heavily invest in
additional hardware – or – re-work CPU-intensive portions in Hadoop
11
• Experience with mainframe migration evolved to ETL replacement .
• SHC successfully demonstrated reducing load on costly ETL software with PiG
scripts (and data movement from / to ETL platform as an intermediate step).
• AND with improved processing time…
ETL Replacement
12
• Section in RED – processing duration had far exceeded SLA.
• Using similar approach to mainframe migration, components of the process were migrated to Hadoop (in
PiG).
• Data movement plus processing on Hadoop was more predictable and efficient, regardless of volume, than
prior environment – with no additional investment.
The Journey
• From Legacy (> 1000 lines) to Ruby / MapReduce (400 lines)
– Cryptic code, difficult to support, difficult to train
• We tried HIVE (~400 lines - Sql-like abstraction)
– Easy to use, easy to experiment and test with
– Poor performance, difficult to implement business logic
• We evolved to PiG with Java UDF extensions
– Compressed, very efficient, easy to code / read (~200 lines)
– Demonstrated success in transforming mainframe developers to PiG developers in under 2 weeks
• As we progressed, our business partners requested more and more data from the cluster –
which required developer time
– We are now using Datameer as a business-user reporting and query front-end to the cluster
– Developer for Hadoop, runs efficiently, flexible spreadsheet interface with dashboards
13
We are in a much different place now than when we started our Hadoop journey.
14
The LearningHADOOP
 We can dramatically reduce batch processing times for mainframe and EDW
 We can retain and analyze data at a much more granular level, with longer history
 Hadoop must be part of an overall solution and eco-system
IMPLEMENTATION
 We can reliably meet our production deliverable time-windows by using Hadoop
 We can largely eliminate the use of traditional ETL tools
 New Tools allow improved user experience on very large data sets
 We developed tools and skills – The learning curve is not to be underestimated
 We developed experience in moving workload from expensive, proprietary
mainframe and EDW platforms to Hadoop with spectacular results
UNIQUEVALUE
Over three years of experience using Hadoop for enterprise
legacy workload.
Thank You!
For further information
email:
visit:
contact@metascale.com
www.metascale.com

More Related Content

What's hot

Stinger Initiative - Deep Dive
Stinger Initiative - Deep DiveStinger Initiative - Deep Dive
Stinger Initiative - Deep Dive
Hortonworks
 
Tame that Beast
Tame that BeastTame that Beast
Oracle migrations and upgrades
Oracle migrations and upgradesOracle migrations and upgrades
Oracle migrations and upgrades
Durga Gadiraju
 
LLAP: Sub-Second Analytical Queries in Hive
LLAP: Sub-Second Analytical Queries in HiveLLAP: Sub-Second Analytical Queries in Hive
LLAP: Sub-Second Analytical Queries in Hive
DataWorks Summit/Hadoop Summit
 
Hadoop in the Cloud - The what, why and how from the experts
Hadoop in the Cloud - The what, why and how from the expertsHadoop in the Cloud - The what, why and how from the experts
Hadoop in the Cloud - The what, why and how from the experts
DataWorks Summit/Hadoop Summit
 
The Next Generation of Data Processing and Open Source
The Next Generation of Data Processing and Open SourceThe Next Generation of Data Processing and Open Source
The Next Generation of Data Processing and Open Source
DataWorks Summit/Hadoop Summit
 
Big Data in the Cloud - The What, Why and How from the Experts
Big Data in the Cloud - The What, Why and How from the ExpertsBig Data in the Cloud - The What, Why and How from the Experts
Big Data in the Cloud - The What, Why and How from the Experts
DataWorks Summit/Hadoop Summit
 
DEVNET-1166 Open SDN Controller APIs
DEVNET-1166	Open SDN Controller APIsDEVNET-1166	Open SDN Controller APIs
DEVNET-1166 Open SDN Controller APIs
Cisco DevNet
 
Migrating Analytics to the Cloud at Fannie Mae
Migrating Analytics to the Cloud at Fannie MaeMigrating Analytics to the Cloud at Fannie Mae
Migrating Analytics to the Cloud at Fannie Mae
DataWorks Summit
 
What's new in apache hive
What's new in apache hive What's new in apache hive
What's new in apache hive
DataWorks Summit
 
Interactive SQL-on-Hadoop and JethroData
Interactive SQL-on-Hadoop and JethroDataInteractive SQL-on-Hadoop and JethroData
Interactive SQL-on-Hadoop and JethroData
Ofir Manor
 
Operationalizing Data Science Using Cloud Foundry
Operationalizing Data Science Using Cloud FoundryOperationalizing Data Science Using Cloud Foundry
Operationalizing Data Science Using Cloud Foundry
VMware Tanzu
 
SQL Engines for Hadoop - The case for Impala
SQL Engines for Hadoop - The case for ImpalaSQL Engines for Hadoop - The case for Impala
SQL Engines for Hadoop - The case for Impala
markgrover
 
Apache Hadoop 3.0 What's new in YARN and MapReduce
Apache Hadoop 3.0 What's new in YARN and MapReduceApache Hadoop 3.0 What's new in YARN and MapReduce
Apache Hadoop 3.0 What's new in YARN and MapReduce
DataWorks Summit/Hadoop Summit
 
Practice of large Hadoop cluster in China Mobile
Practice of large Hadoop cluster in China MobilePractice of large Hadoop cluster in China Mobile
Practice of large Hadoop cluster in China Mobile
DataWorks Summit
 
Real-time Freight Visibility: How TMW Systems uses NiFi and SAM to create sub...
Real-time Freight Visibility: How TMW Systems uses NiFi and SAM to create sub...Real-time Freight Visibility: How TMW Systems uses NiFi and SAM to create sub...
Real-time Freight Visibility: How TMW Systems uses NiFi and SAM to create sub...
DataWorks Summit
 
Hadoop AWS infrastructure cost evaluation
Hadoop AWS infrastructure cost evaluationHadoop AWS infrastructure cost evaluation
Hadoop AWS infrastructure cost evaluation
mattlieber
 
Hortonworks Technical Workshop - Operational Best Practices Workshop
Hortonworks Technical Workshop - Operational Best Practices WorkshopHortonworks Technical Workshop - Operational Best Practices Workshop
Hortonworks Technical Workshop - Operational Best Practices Workshop
Hortonworks
 
Apache hadoop technology : Beginners
Apache hadoop technology : BeginnersApache hadoop technology : Beginners
Apache hadoop technology : Beginners
Shweta Patnaik
 

What's hot (19)

Stinger Initiative - Deep Dive
Stinger Initiative - Deep DiveStinger Initiative - Deep Dive
Stinger Initiative - Deep Dive
 
Tame that Beast
Tame that BeastTame that Beast
Tame that Beast
 
Oracle migrations and upgrades
Oracle migrations and upgradesOracle migrations and upgrades
Oracle migrations and upgrades
 
LLAP: Sub-Second Analytical Queries in Hive
LLAP: Sub-Second Analytical Queries in HiveLLAP: Sub-Second Analytical Queries in Hive
LLAP: Sub-Second Analytical Queries in Hive
 
Hadoop in the Cloud - The what, why and how from the experts
Hadoop in the Cloud - The what, why and how from the expertsHadoop in the Cloud - The what, why and how from the experts
Hadoop in the Cloud - The what, why and how from the experts
 
The Next Generation of Data Processing and Open Source
The Next Generation of Data Processing and Open SourceThe Next Generation of Data Processing and Open Source
The Next Generation of Data Processing and Open Source
 
Big Data in the Cloud - The What, Why and How from the Experts
Big Data in the Cloud - The What, Why and How from the ExpertsBig Data in the Cloud - The What, Why and How from the Experts
Big Data in the Cloud - The What, Why and How from the Experts
 
DEVNET-1166 Open SDN Controller APIs
DEVNET-1166	Open SDN Controller APIsDEVNET-1166	Open SDN Controller APIs
DEVNET-1166 Open SDN Controller APIs
 
Migrating Analytics to the Cloud at Fannie Mae
Migrating Analytics to the Cloud at Fannie MaeMigrating Analytics to the Cloud at Fannie Mae
Migrating Analytics to the Cloud at Fannie Mae
 
What's new in apache hive
What's new in apache hive What's new in apache hive
What's new in apache hive
 
Interactive SQL-on-Hadoop and JethroData
Interactive SQL-on-Hadoop and JethroDataInteractive SQL-on-Hadoop and JethroData
Interactive SQL-on-Hadoop and JethroData
 
Operationalizing Data Science Using Cloud Foundry
Operationalizing Data Science Using Cloud FoundryOperationalizing Data Science Using Cloud Foundry
Operationalizing Data Science Using Cloud Foundry
 
SQL Engines for Hadoop - The case for Impala
SQL Engines for Hadoop - The case for ImpalaSQL Engines for Hadoop - The case for Impala
SQL Engines for Hadoop - The case for Impala
 
Apache Hadoop 3.0 What's new in YARN and MapReduce
Apache Hadoop 3.0 What's new in YARN and MapReduceApache Hadoop 3.0 What's new in YARN and MapReduce
Apache Hadoop 3.0 What's new in YARN and MapReduce
 
Practice of large Hadoop cluster in China Mobile
Practice of large Hadoop cluster in China MobilePractice of large Hadoop cluster in China Mobile
Practice of large Hadoop cluster in China Mobile
 
Real-time Freight Visibility: How TMW Systems uses NiFi and SAM to create sub...
Real-time Freight Visibility: How TMW Systems uses NiFi and SAM to create sub...Real-time Freight Visibility: How TMW Systems uses NiFi and SAM to create sub...
Real-time Freight Visibility: How TMW Systems uses NiFi and SAM to create sub...
 
Hadoop AWS infrastructure cost evaluation
Hadoop AWS infrastructure cost evaluationHadoop AWS infrastructure cost evaluation
Hadoop AWS infrastructure cost evaluation
 
Hortonworks Technical Workshop - Operational Best Practices Workshop
Hortonworks Technical Workshop - Operational Best Practices WorkshopHortonworks Technical Workshop - Operational Best Practices Workshop
Hortonworks Technical Workshop - Operational Best Practices Workshop
 
Apache hadoop technology : Beginners
Apache hadoop technology : BeginnersApache hadoop technology : Beginners
Apache hadoop technology : Beginners
 

Viewers also liked

Evolution from Apache Hadoop to the Enterprise Data Hub by Cloudera - ArabNet...
Evolution from Apache Hadoop to the Enterprise Data Hub by Cloudera - ArabNet...Evolution from Apache Hadoop to the Enterprise Data Hub by Cloudera - ArabNet...
Evolution from Apache Hadoop to the Enterprise Data Hub by Cloudera - ArabNet...
ArabNet ME
 
Cloudera Federal Forum 2014: The Building Blocks of the Enterprise Data Hub
Cloudera Federal Forum 2014: The Building Blocks of the Enterprise Data HubCloudera Federal Forum 2014: The Building Blocks of the Enterprise Data Hub
Cloudera Federal Forum 2014: The Building Blocks of the Enterprise Data Hub
Cloudera, Inc.
 
Rethink Analytics with an Enterprise Data Hub
Rethink Analytics with an Enterprise Data HubRethink Analytics with an Enterprise Data Hub
Rethink Analytics with an Enterprise Data Hub
Cloudera, Inc.
 
The 3 T's - Using Hadoop to modernize with faster access to data and value
The 3 T's - Using Hadoop to modernize with faster access to data and valueThe 3 T's - Using Hadoop to modernize with faster access to data and value
The 3 T's - Using Hadoop to modernize with faster access to data and value
DataWorks Summit
 
Hadoop in the Enterprise: Legacy Rides the Elephant
Hadoop in the Enterprise: Legacy Rides the ElephantHadoop in the Enterprise: Legacy Rides the Elephant
Hadoop in the Enterprise: Legacy Rides the Elephant
DataWorks Summit
 
MapR Enterprise Data Hub Webinar w/ Mike Ferguson
MapR Enterprise Data Hub Webinar w/ Mike FergusonMapR Enterprise Data Hub Webinar w/ Mike Ferguson
MapR Enterprise Data Hub Webinar w/ Mike Ferguson
MapR Technologies
 
Eliminating the Challenges of Big Data Management Inside Hadoop
Eliminating the Challenges of Big Data Management Inside HadoopEliminating the Challenges of Big Data Management Inside Hadoop
Eliminating the Challenges of Big Data Management Inside Hadoop
Hortonworks
 
Big Data Business Wins: Real-time Inventory Tracking with Hadoop
Big Data Business Wins: Real-time Inventory Tracking with HadoopBig Data Business Wins: Real-time Inventory Tracking with Hadoop
Big Data Business Wins: Real-time Inventory Tracking with HadoopDataWorks Summit
 
Big Data 2.0: Hadoop as part of a Near-Real-Time Integrated Data Era
Big Data 2.0: Hadoop as part of a Near-Real-Time Integrated Data EraBig Data 2.0: Hadoop as part of a Near-Real-Time Integrated Data Era
Big Data 2.0: Hadoop as part of a Near-Real-Time Integrated Data Era
DataWorks Summit
 
Data on the Move: Transitioning from a Legacy Architecture to a Big Data Plat...
Data on the Move: Transitioning from a Legacy Architecture to a Big Data Plat...Data on the Move: Transitioning from a Legacy Architecture to a Big Data Plat...
Data on the Move: Transitioning from a Legacy Architecture to a Big Data Plat...
MapR Technologies
 
Standing Up an Effective Enterprise Data Hub -- Technology and Beyond
Standing Up an Effective Enterprise Data Hub -- Technology and BeyondStanding Up an Effective Enterprise Data Hub -- Technology and Beyond
Standing Up an Effective Enterprise Data Hub -- Technology and Beyond
Cloudera, Inc.
 
The Future of Data Management: The Enterprise Data Hub
The Future of Data Management: The Enterprise Data HubThe Future of Data Management: The Enterprise Data Hub
The Future of Data Management: The Enterprise Data HubCloudera, Inc.
 
Enterprise Data Hub: The Next Big Thing in Big Data
Enterprise Data Hub: The Next Big Thing in Big DataEnterprise Data Hub: The Next Big Thing in Big Data
Enterprise Data Hub: The Next Big Thing in Big Data
Cloudera, Inc.
 
The Future of Data Management: The Enterprise Data Hub
The Future of Data Management: The Enterprise Data HubThe Future of Data Management: The Enterprise Data Hub
The Future of Data Management: The Enterprise Data Hub
Cloudera, Inc.
 
Apache hadoop bigdata-in-banking
Apache hadoop bigdata-in-bankingApache hadoop bigdata-in-banking
Apache hadoop bigdata-in-bankingm_hepburn
 
Apache HBase in the Enterprise Data Hub at Cerner
Apache HBase in the Enterprise Data Hub at CernerApache HBase in the Enterprise Data Hub at Cerner
Apache HBase in the Enterprise Data Hub at Cerner
HBaseCon
 
Big Data and Advanced Analytics
Big Data and Advanced AnalyticsBig Data and Advanced Analytics
Big Data and Advanced Analytics
McKinsey on Marketing & Sales
 
Big Data & Hadoop Tutorial
Big Data & Hadoop TutorialBig Data & Hadoop Tutorial
Big Data & Hadoop Tutorial
Edureka!
 
Customer Journey Analytics and Big Data
Customer Journey Analytics and Big DataCustomer Journey Analytics and Big Data
Customer Journey Analytics and Big Data
McKinsey on Marketing & Sales
 
Big data and Hadoop
Big data and HadoopBig data and Hadoop
Big data and Hadoop
Rahul Agarwal
 

Viewers also liked (20)

Evolution from Apache Hadoop to the Enterprise Data Hub by Cloudera - ArabNet...
Evolution from Apache Hadoop to the Enterprise Data Hub by Cloudera - ArabNet...Evolution from Apache Hadoop to the Enterprise Data Hub by Cloudera - ArabNet...
Evolution from Apache Hadoop to the Enterprise Data Hub by Cloudera - ArabNet...
 
Cloudera Federal Forum 2014: The Building Blocks of the Enterprise Data Hub
Cloudera Federal Forum 2014: The Building Blocks of the Enterprise Data HubCloudera Federal Forum 2014: The Building Blocks of the Enterprise Data Hub
Cloudera Federal Forum 2014: The Building Blocks of the Enterprise Data Hub
 
Rethink Analytics with an Enterprise Data Hub
Rethink Analytics with an Enterprise Data HubRethink Analytics with an Enterprise Data Hub
Rethink Analytics with an Enterprise Data Hub
 
The 3 T's - Using Hadoop to modernize with faster access to data and value
The 3 T's - Using Hadoop to modernize with faster access to data and valueThe 3 T's - Using Hadoop to modernize with faster access to data and value
The 3 T's - Using Hadoop to modernize with faster access to data and value
 
Hadoop in the Enterprise: Legacy Rides the Elephant
Hadoop in the Enterprise: Legacy Rides the ElephantHadoop in the Enterprise: Legacy Rides the Elephant
Hadoop in the Enterprise: Legacy Rides the Elephant
 
MapR Enterprise Data Hub Webinar w/ Mike Ferguson
MapR Enterprise Data Hub Webinar w/ Mike FergusonMapR Enterprise Data Hub Webinar w/ Mike Ferguson
MapR Enterprise Data Hub Webinar w/ Mike Ferguson
 
Eliminating the Challenges of Big Data Management Inside Hadoop
Eliminating the Challenges of Big Data Management Inside HadoopEliminating the Challenges of Big Data Management Inside Hadoop
Eliminating the Challenges of Big Data Management Inside Hadoop
 
Big Data Business Wins: Real-time Inventory Tracking with Hadoop
Big Data Business Wins: Real-time Inventory Tracking with HadoopBig Data Business Wins: Real-time Inventory Tracking with Hadoop
Big Data Business Wins: Real-time Inventory Tracking with Hadoop
 
Big Data 2.0: Hadoop as part of a Near-Real-Time Integrated Data Era
Big Data 2.0: Hadoop as part of a Near-Real-Time Integrated Data EraBig Data 2.0: Hadoop as part of a Near-Real-Time Integrated Data Era
Big Data 2.0: Hadoop as part of a Near-Real-Time Integrated Data Era
 
Data on the Move: Transitioning from a Legacy Architecture to a Big Data Plat...
Data on the Move: Transitioning from a Legacy Architecture to a Big Data Plat...Data on the Move: Transitioning from a Legacy Architecture to a Big Data Plat...
Data on the Move: Transitioning from a Legacy Architecture to a Big Data Plat...
 
Standing Up an Effective Enterprise Data Hub -- Technology and Beyond
Standing Up an Effective Enterprise Data Hub -- Technology and BeyondStanding Up an Effective Enterprise Data Hub -- Technology and Beyond
Standing Up an Effective Enterprise Data Hub -- Technology and Beyond
 
The Future of Data Management: The Enterprise Data Hub
The Future of Data Management: The Enterprise Data HubThe Future of Data Management: The Enterprise Data Hub
The Future of Data Management: The Enterprise Data Hub
 
Enterprise Data Hub: The Next Big Thing in Big Data
Enterprise Data Hub: The Next Big Thing in Big DataEnterprise Data Hub: The Next Big Thing in Big Data
Enterprise Data Hub: The Next Big Thing in Big Data
 
The Future of Data Management: The Enterprise Data Hub
The Future of Data Management: The Enterprise Data HubThe Future of Data Management: The Enterprise Data Hub
The Future of Data Management: The Enterprise Data Hub
 
Apache hadoop bigdata-in-banking
Apache hadoop bigdata-in-bankingApache hadoop bigdata-in-banking
Apache hadoop bigdata-in-banking
 
Apache HBase in the Enterprise Data Hub at Cerner
Apache HBase in the Enterprise Data Hub at CernerApache HBase in the Enterprise Data Hub at Cerner
Apache HBase in the Enterprise Data Hub at Cerner
 
Big Data and Advanced Analytics
Big Data and Advanced AnalyticsBig Data and Advanced Analytics
Big Data and Advanced Analytics
 
Big Data & Hadoop Tutorial
Big Data & Hadoop TutorialBig Data & Hadoop Tutorial
Big Data & Hadoop Tutorial
 
Customer Journey Analytics and Big Data
Customer Journey Analytics and Big DataCustomer Journey Analytics and Big Data
Customer Journey Analytics and Big Data
 
Big data and Hadoop
Big data and HadoopBig data and Hadoop
Big data and Hadoop
 

Similar to Justin Sheppard & Ankur Gupta from Sears Holdings Corporation - Single point of truth: The reality of the enterprise data hub

50 Shades of SQL
50 Shades of SQL50 Shades of SQL
50 Shades of SQL
DataWorks Summit
 
Hadoop and SQL: Delivery Analytics Across the Organization
Hadoop and SQL:  Delivery Analytics Across the OrganizationHadoop and SQL:  Delivery Analytics Across the Organization
Hadoop and SQL: Delivery Analytics Across the Organization
Seeling Cheung
 
Meta scale kognitio hadoop webinar
Meta scale kognitio hadoop webinarMeta scale kognitio hadoop webinar
Meta scale kognitio hadoop webinar
Michael Hiskey
 
SQL Server Konferenz 2014 - SSIS & HDInsight
SQL Server Konferenz 2014 - SSIS & HDInsightSQL Server Konferenz 2014 - SSIS & HDInsight
SQL Server Konferenz 2014 - SSIS & HDInsightTillmann Eitelberg
 
Building Scalable Big Data Infrastructure Using Open Source Software Presenta...
Building Scalable Big Data Infrastructure Using Open Source Software Presenta...Building Scalable Big Data Infrastructure Using Open Source Software Presenta...
Building Scalable Big Data Infrastructure Using Open Source Software Presenta...
ssuserd3a367
 
Hadoop project design and a usecase
Hadoop project design and  a usecaseHadoop project design and  a usecase
Hadoop project design and a usecase
sudhakara st
 
How the Development Bank of Singapore solves on-prem compute capacity challen...
How the Development Bank of Singapore solves on-prem compute capacity challen...How the Development Bank of Singapore solves on-prem compute capacity challen...
How the Development Bank of Singapore solves on-prem compute capacity challen...
Alluxio, Inc.
 
Hadoop ppt1
Hadoop ppt1Hadoop ppt1
Hadoop ppt1
chariorienit
 
Hadoop Infrastructure (Oct. 3rd, 2012)
Hadoop Infrastructure (Oct. 3rd, 2012)Hadoop Infrastructure (Oct. 3rd, 2012)
Hadoop Infrastructure (Oct. 3rd, 2012)John Dougherty
 
5 Things that Make Hadoop a Game Changer
5 Things that Make Hadoop a Game Changer5 Things that Make Hadoop a Game Changer
5 Things that Make Hadoop a Game Changer
Caserta
 
Meta scale kognitio hadoop webinar
Meta scale kognitio hadoop webinarMeta scale kognitio hadoop webinar
Meta scale kognitio hadoop webinarKognitio
 
Enabling big data & AI workloads on the object store at DBS
Enabling big data & AI workloads on the object store at DBS Enabling big data & AI workloads on the object store at DBS
Enabling big data & AI workloads on the object store at DBS
Alluxio, Inc.
 
Engineering Machine Learning Data Pipelines Series: Streaming New Data as It ...
Engineering Machine Learning Data Pipelines Series: Streaming New Data as It ...Engineering Machine Learning Data Pipelines Series: Streaming New Data as It ...
Engineering Machine Learning Data Pipelines Series: Streaming New Data as It ...
Precisely
 
Hadoop Essentials -- The What, Why and How to Meet Agency Objectives
Hadoop Essentials -- The What, Why and How to Meet Agency ObjectivesHadoop Essentials -- The What, Why and How to Meet Agency Objectives
Hadoop Essentials -- The What, Why and How to Meet Agency Objectives
Cloudera, Inc.
 
Gluent Extending Enterprise Applications with Hadoop
Gluent Extending Enterprise Applications with HadoopGluent Extending Enterprise Applications with Hadoop
Gluent Extending Enterprise Applications with Hadoop
gluent.
 
List of Engineering Colleges in Uttarakhand
List of Engineering Colleges in UttarakhandList of Engineering Colleges in Uttarakhand
List of Engineering Colleges in Uttarakhand
Roorkee College of Engineering, Roorkee
 
Hadoop.pptx
Hadoop.pptxHadoop.pptx
Hadoop.pptx
arslanhaneef
 
Hadoop.pptx
Hadoop.pptxHadoop.pptx
Hadoop.pptx
sonukumar379092
 
Retail & CPG
Retail & CPGRetail & CPG
Modernizing Global Shared Data Analytics Platform and our Alluxio Journey
Modernizing Global Shared Data Analytics Platform and our Alluxio JourneyModernizing Global Shared Data Analytics Platform and our Alluxio Journey
Modernizing Global Shared Data Analytics Platform and our Alluxio Journey
Alluxio, Inc.
 

Similar to Justin Sheppard & Ankur Gupta from Sears Holdings Corporation - Single point of truth: The reality of the enterprise data hub (20)

50 Shades of SQL
50 Shades of SQL50 Shades of SQL
50 Shades of SQL
 
Hadoop and SQL: Delivery Analytics Across the Organization
Hadoop and SQL:  Delivery Analytics Across the OrganizationHadoop and SQL:  Delivery Analytics Across the Organization
Hadoop and SQL: Delivery Analytics Across the Organization
 
Meta scale kognitio hadoop webinar
Meta scale kognitio hadoop webinarMeta scale kognitio hadoop webinar
Meta scale kognitio hadoop webinar
 
SQL Server Konferenz 2014 - SSIS & HDInsight
SQL Server Konferenz 2014 - SSIS & HDInsightSQL Server Konferenz 2014 - SSIS & HDInsight
SQL Server Konferenz 2014 - SSIS & HDInsight
 
Building Scalable Big Data Infrastructure Using Open Source Software Presenta...
Building Scalable Big Data Infrastructure Using Open Source Software Presenta...Building Scalable Big Data Infrastructure Using Open Source Software Presenta...
Building Scalable Big Data Infrastructure Using Open Source Software Presenta...
 
Hadoop project design and a usecase
Hadoop project design and  a usecaseHadoop project design and  a usecase
Hadoop project design and a usecase
 
How the Development Bank of Singapore solves on-prem compute capacity challen...
How the Development Bank of Singapore solves on-prem compute capacity challen...How the Development Bank of Singapore solves on-prem compute capacity challen...
How the Development Bank of Singapore solves on-prem compute capacity challen...
 
Hadoop ppt1
Hadoop ppt1Hadoop ppt1
Hadoop ppt1
 
Hadoop Infrastructure (Oct. 3rd, 2012)
Hadoop Infrastructure (Oct. 3rd, 2012)Hadoop Infrastructure (Oct. 3rd, 2012)
Hadoop Infrastructure (Oct. 3rd, 2012)
 
5 Things that Make Hadoop a Game Changer
5 Things that Make Hadoop a Game Changer5 Things that Make Hadoop a Game Changer
5 Things that Make Hadoop a Game Changer
 
Meta scale kognitio hadoop webinar
Meta scale kognitio hadoop webinarMeta scale kognitio hadoop webinar
Meta scale kognitio hadoop webinar
 
Enabling big data & AI workloads on the object store at DBS
Enabling big data & AI workloads on the object store at DBS Enabling big data & AI workloads on the object store at DBS
Enabling big data & AI workloads on the object store at DBS
 
Engineering Machine Learning Data Pipelines Series: Streaming New Data as It ...
Engineering Machine Learning Data Pipelines Series: Streaming New Data as It ...Engineering Machine Learning Data Pipelines Series: Streaming New Data as It ...
Engineering Machine Learning Data Pipelines Series: Streaming New Data as It ...
 
Hadoop Essentials -- The What, Why and How to Meet Agency Objectives
Hadoop Essentials -- The What, Why and How to Meet Agency ObjectivesHadoop Essentials -- The What, Why and How to Meet Agency Objectives
Hadoop Essentials -- The What, Why and How to Meet Agency Objectives
 
Gluent Extending Enterprise Applications with Hadoop
Gluent Extending Enterprise Applications with HadoopGluent Extending Enterprise Applications with Hadoop
Gluent Extending Enterprise Applications with Hadoop
 
List of Engineering Colleges in Uttarakhand
List of Engineering Colleges in UttarakhandList of Engineering Colleges in Uttarakhand
List of Engineering Colleges in Uttarakhand
 
Hadoop.pptx
Hadoop.pptxHadoop.pptx
Hadoop.pptx
 
Hadoop.pptx
Hadoop.pptxHadoop.pptx
Hadoop.pptx
 
Retail & CPG
Retail & CPGRetail & CPG
Retail & CPG
 
Modernizing Global Shared Data Analytics Platform and our Alluxio Journey
Modernizing Global Shared Data Analytics Platform and our Alluxio JourneyModernizing Global Shared Data Analytics Platform and our Alluxio Journey
Modernizing Global Shared Data Analytics Platform and our Alluxio Journey
 

More from Global Business Events

Cio Event
Cio EventCio Event
Ludo Van den Kerckhove , Managing Partner at A-cross Health - The Network Alw...
Ludo Van den Kerckhove , Managing Partner at A-cross Health - The Network Alw...Ludo Van den Kerckhove , Managing Partner at A-cross Health - The Network Alw...
Ludo Van den Kerckhove , Managing Partner at A-cross Health - The Network Alw...
Global Business Events
 
Tim Mann, CIO at NFU Mutual - Digital Transformation Case Studies: how NFUM i...
Tim Mann, CIO at NFU Mutual - Digital Transformation Case Studies: how NFUM i...Tim Mann, CIO at NFU Mutual - Digital Transformation Case Studies: how NFUM i...
Tim Mann, CIO at NFU Mutual - Digital Transformation Case Studies: how NFUM i...
Global Business Events
 
Neil Ward-Dutton, Founder & Research Director at MWD Advisors - Innovating di...
Neil Ward-Dutton, Founder & Research Director at MWD Advisors - Innovating di...Neil Ward-Dutton, Founder & Research Director at MWD Advisors - Innovating di...
Neil Ward-Dutton, Founder & Research Director at MWD Advisors - Innovating di...
Global Business Events
 
Mark Jacot, Assistant Director – IT Service Deliveryat The Open University - ...
Mark Jacot, Assistant Director – IT Service Deliveryat The Open University - ...Mark Jacot, Assistant Director – IT Service Deliveryat The Open University - ...
Mark Jacot, Assistant Director – IT Service Deliveryat The Open University - ...
Global Business Events
 
Gerard O'Hara, Head of IT EMEA at Facebook - How the Facebook IT department i...
Gerard O'Hara, Head of IT EMEA at Facebook - How the Facebook IT department i...Gerard O'Hara, Head of IT EMEA at Facebook - How the Facebook IT department i...
Gerard O'Hara, Head of IT EMEA at Facebook - How the Facebook IT department i...
Global Business Events
 
Hakan Yaren, Managing Director IT at FedEx Express EMEA - IT Modernisation
Hakan Yaren, Managing Director IT at FedEx Express EMEA - IT ModernisationHakan Yaren, Managing Director IT at FedEx Express EMEA - IT Modernisation
Hakan Yaren, Managing Director IT at FedEx Express EMEA - IT Modernisation
Global Business Events
 
Sam De Silva, Partner - Head of IT and Outsourcing Group at Penningtons Manch...
Sam De Silva, Partner - Head of IT and Outsourcing Group at Penningtons Manch...Sam De Silva, Partner - Head of IT and Outsourcing Group at Penningtons Manch...
Sam De Silva, Partner - Head of IT and Outsourcing Group at Penningtons Manch...
Global Business Events
 
Hugo Smith, CTO at Broadbandchoices - Improving the Agility of your Business ...
Hugo Smith, CTO at Broadbandchoices - Improving the Agility of your Business ...Hugo Smith, CTO at Broadbandchoices - Improving the Agility of your Business ...
Hugo Smith, CTO at Broadbandchoices - Improving the Agility of your Business ...
Global Business Events
 
Mark Aikman, IT Director at The North Group - Leading a Complex Bespoke Syste...
Mark Aikman, IT Director at The North Group - Leading a Complex Bespoke Syste...Mark Aikman, IT Director at The North Group - Leading a Complex Bespoke Syste...
Mark Aikman, IT Director at The North Group - Leading a Complex Bespoke Syste...
Global Business Events
 
David Clarke, CITSO at Digital Arena - Security Benchmarking, best practise a...
David Clarke, CITSO at Digital Arena - Security Benchmarking, best practise a...David Clarke, CITSO at Digital Arena - Security Benchmarking, best practise a...
David Clarke, CITSO at Digital Arena - Security Benchmarking, best practise a...
Global Business Events
 
John Prowse, vCISO at BT - Security Anxiety
John Prowse, vCISO at BT - Security AnxietyJohn Prowse, vCISO at BT - Security Anxiety
John Prowse, vCISO at BT - Security Anxiety
Global Business Events
 
Kevin Watkins, Enterprise Security Architect at BAT - BAT’s Managed Security ...
Kevin Watkins, Enterprise Security Architect at BAT - BAT’s Managed Security ...Kevin Watkins, Enterprise Security Architect at BAT - BAT’s Managed Security ...
Kevin Watkins, Enterprise Security Architect at BAT - BAT’s Managed Security ...
Global Business Events
 
Keith Inight, CTO at Atos - Software Defined Everything
Keith Inight, CTO at Atos - Software Defined EverythingKeith Inight, CTO at Atos - Software Defined Everything
Keith Inight, CTO at Atos - Software Defined Everything
Global Business Events
 
David Clarke, CITSO at Vciso - Security, Standards and Swiss Cheese
David Clarke, CITSO at Vciso - Security, Standards and Swiss CheeseDavid Clarke, CITSO at Vciso - Security, Standards and Swiss Cheese
David Clarke, CITSO at Vciso - Security, Standards and Swiss Cheese
Global Business Events
 
Dave Jones, CIO at Cape Plc - Transition of Autonomous regional IT to Providi...
Dave Jones, CIO at Cape Plc - Transition of Autonomous regional IT to Providi...Dave Jones, CIO at Cape Plc - Transition of Autonomous regional IT to Providi...
Dave Jones, CIO at Cape Plc - Transition of Autonomous regional IT to Providi...
Global Business Events
 
Wolfgang Kuhl, CIO at Pharmaserv - Data Centre Planning and Execution - A Sur...
Wolfgang Kuhl, CIO at Pharmaserv - Data Centre Planning and Execution - A Sur...Wolfgang Kuhl, CIO at Pharmaserv - Data Centre Planning and Execution - A Sur...
Wolfgang Kuhl, CIO at Pharmaserv - Data Centre Planning and Execution - A Sur...
Global Business Events
 
Mark Aikman, CIO at The North Group - Leading a Complex Bespoke System Transf...
Mark Aikman, CIO at The North Group - Leading a Complex Bespoke System Transf...Mark Aikman, CIO at The North Group - Leading a Complex Bespoke System Transf...
Mark Aikman, CIO at The North Group - Leading a Complex Bespoke System Transf...
Global Business Events
 
Neil Ward-Dutton, Co-founder and Research Director at MWD Advisors - Digital ...
Neil Ward-Dutton, Co-founder and Research Director at MWD Advisors - Digital ...Neil Ward-Dutton, Co-founder and Research Director at MWD Advisors - Digital ...
Neil Ward-Dutton, Co-founder and Research Director at MWD Advisors - Digital ...
Global Business Events
 
Gordon Tredgold, SVP Global IT at Henkel - Fast Leadership - Accelerating Pro...
Gordon Tredgold, SVP Global IT at Henkel - Fast Leadership - Accelerating Pro...Gordon Tredgold, SVP Global IT at Henkel - Fast Leadership - Accelerating Pro...
Gordon Tredgold, SVP Global IT at Henkel - Fast Leadership - Accelerating Pro...
Global Business Events
 

More from Global Business Events (20)

Cio Event
Cio EventCio Event
Cio Event
 
Ludo Van den Kerckhove , Managing Partner at A-cross Health - The Network Alw...
Ludo Van den Kerckhove , Managing Partner at A-cross Health - The Network Alw...Ludo Van den Kerckhove , Managing Partner at A-cross Health - The Network Alw...
Ludo Van den Kerckhove , Managing Partner at A-cross Health - The Network Alw...
 
Tim Mann, CIO at NFU Mutual - Digital Transformation Case Studies: how NFUM i...
Tim Mann, CIO at NFU Mutual - Digital Transformation Case Studies: how NFUM i...Tim Mann, CIO at NFU Mutual - Digital Transformation Case Studies: how NFUM i...
Tim Mann, CIO at NFU Mutual - Digital Transformation Case Studies: how NFUM i...
 
Neil Ward-Dutton, Founder & Research Director at MWD Advisors - Innovating di...
Neil Ward-Dutton, Founder & Research Director at MWD Advisors - Innovating di...Neil Ward-Dutton, Founder & Research Director at MWD Advisors - Innovating di...
Neil Ward-Dutton, Founder & Research Director at MWD Advisors - Innovating di...
 
Mark Jacot, Assistant Director – IT Service Deliveryat The Open University - ...
Mark Jacot, Assistant Director – IT Service Deliveryat The Open University - ...Mark Jacot, Assistant Director – IT Service Deliveryat The Open University - ...
Mark Jacot, Assistant Director – IT Service Deliveryat The Open University - ...
 
Gerard O'Hara, Head of IT EMEA at Facebook - How the Facebook IT department i...
Gerard O'Hara, Head of IT EMEA at Facebook - How the Facebook IT department i...Gerard O'Hara, Head of IT EMEA at Facebook - How the Facebook IT department i...
Gerard O'Hara, Head of IT EMEA at Facebook - How the Facebook IT department i...
 
Hakan Yaren, Managing Director IT at FedEx Express EMEA - IT Modernisation
Hakan Yaren, Managing Director IT at FedEx Express EMEA - IT ModernisationHakan Yaren, Managing Director IT at FedEx Express EMEA - IT Modernisation
Hakan Yaren, Managing Director IT at FedEx Express EMEA - IT Modernisation
 
Sam De Silva, Partner - Head of IT and Outsourcing Group at Penningtons Manch...
Sam De Silva, Partner - Head of IT and Outsourcing Group at Penningtons Manch...Sam De Silva, Partner - Head of IT and Outsourcing Group at Penningtons Manch...
Sam De Silva, Partner - Head of IT and Outsourcing Group at Penningtons Manch...
 
Hugo Smith, CTO at Broadbandchoices - Improving the Agility of your Business ...
Hugo Smith, CTO at Broadbandchoices - Improving the Agility of your Business ...Hugo Smith, CTO at Broadbandchoices - Improving the Agility of your Business ...
Hugo Smith, CTO at Broadbandchoices - Improving the Agility of your Business ...
 
Mark Aikman, IT Director at The North Group - Leading a Complex Bespoke Syste...
Mark Aikman, IT Director at The North Group - Leading a Complex Bespoke Syste...Mark Aikman, IT Director at The North Group - Leading a Complex Bespoke Syste...
Mark Aikman, IT Director at The North Group - Leading a Complex Bespoke Syste...
 
David Clarke, CITSO at Digital Arena - Security Benchmarking, best practise a...
David Clarke, CITSO at Digital Arena - Security Benchmarking, best practise a...David Clarke, CITSO at Digital Arena - Security Benchmarking, best practise a...
David Clarke, CITSO at Digital Arena - Security Benchmarking, best practise a...
 
John Prowse, vCISO at BT - Security Anxiety
John Prowse, vCISO at BT - Security AnxietyJohn Prowse, vCISO at BT - Security Anxiety
John Prowse, vCISO at BT - Security Anxiety
 
Kevin Watkins, Enterprise Security Architect at BAT - BAT’s Managed Security ...
Kevin Watkins, Enterprise Security Architect at BAT - BAT’s Managed Security ...Kevin Watkins, Enterprise Security Architect at BAT - BAT’s Managed Security ...
Kevin Watkins, Enterprise Security Architect at BAT - BAT’s Managed Security ...
 
Keith Inight, CTO at Atos - Software Defined Everything
Keith Inight, CTO at Atos - Software Defined EverythingKeith Inight, CTO at Atos - Software Defined Everything
Keith Inight, CTO at Atos - Software Defined Everything
 
David Clarke, CITSO at Vciso - Security, Standards and Swiss Cheese
David Clarke, CITSO at Vciso - Security, Standards and Swiss CheeseDavid Clarke, CITSO at Vciso - Security, Standards and Swiss Cheese
David Clarke, CITSO at Vciso - Security, Standards and Swiss Cheese
 
Dave Jones, CIO at Cape Plc - Transition of Autonomous regional IT to Providi...
Dave Jones, CIO at Cape Plc - Transition of Autonomous regional IT to Providi...Dave Jones, CIO at Cape Plc - Transition of Autonomous regional IT to Providi...
Dave Jones, CIO at Cape Plc - Transition of Autonomous regional IT to Providi...
 
Wolfgang Kuhl, CIO at Pharmaserv - Data Centre Planning and Execution - A Sur...
Wolfgang Kuhl, CIO at Pharmaserv - Data Centre Planning and Execution - A Sur...Wolfgang Kuhl, CIO at Pharmaserv - Data Centre Planning and Execution - A Sur...
Wolfgang Kuhl, CIO at Pharmaserv - Data Centre Planning and Execution - A Sur...
 
Mark Aikman, CIO at The North Group - Leading a Complex Bespoke System Transf...
Mark Aikman, CIO at The North Group - Leading a Complex Bespoke System Transf...Mark Aikman, CIO at The North Group - Leading a Complex Bespoke System Transf...
Mark Aikman, CIO at The North Group - Leading a Complex Bespoke System Transf...
 
Neil Ward-Dutton, Co-founder and Research Director at MWD Advisors - Digital ...
Neil Ward-Dutton, Co-founder and Research Director at MWD Advisors - Digital ...Neil Ward-Dutton, Co-founder and Research Director at MWD Advisors - Digital ...
Neil Ward-Dutton, Co-founder and Research Director at MWD Advisors - Digital ...
 
Gordon Tredgold, SVP Global IT at Henkel - Fast Leadership - Accelerating Pro...
Gordon Tredgold, SVP Global IT at Henkel - Fast Leadership - Accelerating Pro...Gordon Tredgold, SVP Global IT at Henkel - Fast Leadership - Accelerating Pro...
Gordon Tredgold, SVP Global IT at Henkel - Fast Leadership - Accelerating Pro...
 

Recently uploaded

Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
UiPathCommunity
 
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Thierry Lestable
 
Essentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with ParametersEssentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with Parameters
Safe Software
 
Assure Contact Center Experiences for Your Customers With ThousandEyes
Assure Contact Center Experiences for Your Customers With ThousandEyesAssure Contact Center Experiences for Your Customers With ThousandEyes
Assure Contact Center Experiences for Your Customers With ThousandEyes
ThousandEyes
 
Bits & Pixels using AI for Good.........
Bits & Pixels using AI for Good.........Bits & Pixels using AI for Good.........
Bits & Pixels using AI for Good.........
Alison B. Lowndes
 
Leading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdfLeading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdf
OnBoard
 
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
Ramesh Iyer
 
A tale of scale & speed: How the US Navy is enabling software delivery from l...
A tale of scale & speed: How the US Navy is enabling software delivery from l...A tale of scale & speed: How the US Navy is enabling software delivery from l...
A tale of scale & speed: How the US Navy is enabling software delivery from l...
sonjaschweigert1
 
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
BookNet Canada
 
Assuring Contact Center Experiences for Your Customers With ThousandEyes
Assuring Contact Center Experiences for Your Customers With ThousandEyesAssuring Contact Center Experiences for Your Customers With ThousandEyes
Assuring Contact Center Experiences for Your Customers With ThousandEyes
ThousandEyes
 
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdfObservability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Paige Cruz
 
Generative AI Deep Dive: Advancing from Proof of Concept to Production
Generative AI Deep Dive: Advancing from Proof of Concept to ProductionGenerative AI Deep Dive: Advancing from Proof of Concept to Production
Generative AI Deep Dive: Advancing from Proof of Concept to Production
Aggregage
 
Elizabeth Buie - Older adults: Are we really designing for our future selves?
Elizabeth Buie - Older adults: Are we really designing for our future selves?Elizabeth Buie - Older adults: Are we really designing for our future selves?
Elizabeth Buie - Older adults: Are we really designing for our future selves?
Nexer Digital
 
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdfFIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance
 
The Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and SalesThe Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and Sales
Laura Byrne
 
Elevating Tactical DDD Patterns Through Object Calisthenics
Elevating Tactical DDD Patterns Through Object CalisthenicsElevating Tactical DDD Patterns Through Object Calisthenics
Elevating Tactical DDD Patterns Through Object Calisthenics
Dorra BARTAGUIZ
 
UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4
DianaGray10
 
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Albert Hoitingh
 
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
DanBrown980551
 
Free Complete Python - A step towards Data Science
Free Complete Python - A step towards Data ScienceFree Complete Python - A step towards Data Science
Free Complete Python - A step towards Data Science
RinaMondal9
 

Recently uploaded (20)

Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
 
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
 
Essentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with ParametersEssentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with Parameters
 
Assure Contact Center Experiences for Your Customers With ThousandEyes
Assure Contact Center Experiences for Your Customers With ThousandEyesAssure Contact Center Experiences for Your Customers With ThousandEyes
Assure Contact Center Experiences for Your Customers With ThousandEyes
 
Bits & Pixels using AI for Good.........
Bits & Pixels using AI for Good.........Bits & Pixels using AI for Good.........
Bits & Pixels using AI for Good.........
 
Leading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdfLeading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdf
 
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
 
A tale of scale & speed: How the US Navy is enabling software delivery from l...
A tale of scale & speed: How the US Navy is enabling software delivery from l...A tale of scale & speed: How the US Navy is enabling software delivery from l...
A tale of scale & speed: How the US Navy is enabling software delivery from l...
 
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
 
Assuring Contact Center Experiences for Your Customers With ThousandEyes
Assuring Contact Center Experiences for Your Customers With ThousandEyesAssuring Contact Center Experiences for Your Customers With ThousandEyes
Assuring Contact Center Experiences for Your Customers With ThousandEyes
 
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdfObservability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
 
Generative AI Deep Dive: Advancing from Proof of Concept to Production
Generative AI Deep Dive: Advancing from Proof of Concept to ProductionGenerative AI Deep Dive: Advancing from Proof of Concept to Production
Generative AI Deep Dive: Advancing from Proof of Concept to Production
 
Elizabeth Buie - Older adults: Are we really designing for our future selves?
Elizabeth Buie - Older adults: Are we really designing for our future selves?Elizabeth Buie - Older adults: Are we really designing for our future selves?
Elizabeth Buie - Older adults: Are we really designing for our future selves?
 
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdfFIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
 
The Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and SalesThe Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and Sales
 
Elevating Tactical DDD Patterns Through Object Calisthenics
Elevating Tactical DDD Patterns Through Object CalisthenicsElevating Tactical DDD Patterns Through Object Calisthenics
Elevating Tactical DDD Patterns Through Object Calisthenics
 
UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4
 
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
 
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
 
Free Complete Python - A step towards Data Science
Free Complete Python - A step towards Data ScienceFree Complete Python - A step towards Data Science
Free Complete Python - A step towards Data Science
 

Justin Sheppard & Ankur Gupta from Sears Holdings Corporation - Single point of truth: The reality of the enterprise data hub

  • 1. 1 Single Point of Truth The Reality of the Enterprise Data Hub Justin Sheppard Ankur Gupta Sears Holdings Corporation
  • 2. 2 • Not meeting production schedules • Multiple copies of data, no single point of truth • ETL complexity, cost of software and cost to manage • Time to setup ETL data sources for projects • Latency in data (up to weeks in some cases) • Enterprise Data Warehouses unable to handle load • Mainframe workload over consuming capacity • IT Budgets not growing – BUT data volumes escalating Where Did We Start?
  • 3. What Is Hadoop? 3 Hadoop Distributed File System (HDFS) File Sharing & Data Protection Across Physical Servers MapReduce Fault Tolerant Distributed Computing Across Physical Servers Flexibility o A single repository for storing processing & analyzing any type of data (structured and complex) o Not bound by a single schema Scalability o Scale-out architecture divides workloads across multiple nodes o Flexible file system eliminates ETL bottlenecks Low Cost o Can be deployed on commodity hardware o Open source platform guards against vendor lock Hadoop is a platform for data storage and processing that is… o Scalable o Fault tolerant o Open source
  • 4. 4 Hadoop IS • Store vast amounts of data • Run queries on huge data sets • Ask questions previously impossible • Archive data but still analyze it • Capture data streams at incredible speeds • Massively reduce data latency • Transform your thinking about ETL Is Not • High-speed SQL database • Simple • Easily connected to legacy systems • A replacement for your current data warehouse • Going to be built or operated by your DBA's • Going to make any sense to your data architects • Going to be possible if do not have Linux skills
  • 5. 5 Use The Right Tool For The Right Job Databases: Hadoop: When to use? • Affordable Storage/Compute • High-performance queries on large data • Complex data • Resilient Auto Scalability When to use? • Transactional, High Speed Analytics • Interactive Reporting (<1sec) • Multi-step Transactions • Numerous Inserts/Updates/Deletes Can be combined
  • 6. Use The Right Tool For The Right Job 6 Hadoop Database
  • 7. Data Hub 7 • Underlying premise as Hadoop adoption continues – source data once, use many. • Over time, as more and more data is sourced, development times will reduce since data sourcing is significantly less than typical.
  • 9. The First Usage in Production Use Case • Interactive presentation layer was required to present item/price/sales data in a highly flexible user interface with rapid response time • Needed to deliver solution within a very short period of time. • Legacy architecture would have required a MicroStrategy solution utilizing 1,000’s of cubes on many expensive servers Approach • Rapid development project initiated to present item/price/sales data in a highly flexible user interface with rapid response time • Built system from the ground up • Migrated all required data to centralized HDFS repository from legacy databases • Developed MapReduce code to process daily data files into 4 primary data tables • Tables extracted to service layer (MySQL/Infobrite) for presentation through the Pricing Portal Results • File preparation completes in minutes each day and ensures portal data is ready very soon after daily sales processing completes (100K records daily) • This was the first production usage of MapReduce and associated technologies – the project initiated in March and was live on May 9 (<10 weeks concept to realization) Technologies Used • Hadoop, Hive, MapReduce, MySql, Infobright, Linux, REST Web Service, Dotnetnuke 9 Learning experience for all parties, successfully demonstrated platform abilities in production environment – but we would NOT do it this way again…
  • 10. Mainframe Migration 10 Step 1 Source 1 Source 2 Step 2 Step 3 Step 4 Step 5 Source 3 Source 4 Output As our experience with Hadoop increased, hypothesis were formed that the technology could aid with SHC’s mainframe migration initiative. Example above represents a simply mainframe process Step 1 Source 1 Source 2 Step 2 Step 3 Step 4 Step 5 Source 3 Source 4 Output Step 4 Step 5 X X Migrated sections of mainframe processing, including data transfer to Hadoop and back, eliminating MIPS and IMPROVING overall cycle time
  • 11. ETL Replacement • A major ongoing system effort in our Marketing department was heavily reliant on DataStage processing for ETL – In the early stages of deployment the ETL platform performed within acceptable limits – As volume increased the system began to have performance issues as the ETL platform degraded – With full rollout imminent, the options were to heavily invest in additional hardware – or – re-work CPU-intensive portions in Hadoop 11 • Experience with mainframe migration evolved to ETL replacement . • SHC successfully demonstrated reducing load on costly ETL software with PiG scripts (and data movement from / to ETL platform as an intermediate step). • AND with improved processing time…
  • 12. ETL Replacement 12 • Section in RED – processing duration had far exceeded SLA. • Using similar approach to mainframe migration, components of the process were migrated to Hadoop (in PiG). • Data movement plus processing on Hadoop was more predictable and efficient, regardless of volume, than prior environment – with no additional investment.
  • 13. The Journey • From Legacy (> 1000 lines) to Ruby / MapReduce (400 lines) – Cryptic code, difficult to support, difficult to train • We tried HIVE (~400 lines - Sql-like abstraction) – Easy to use, easy to experiment and test with – Poor performance, difficult to implement business logic • We evolved to PiG with Java UDF extensions – Compressed, very efficient, easy to code / read (~200 lines) – Demonstrated success in transforming mainframe developers to PiG developers in under 2 weeks • As we progressed, our business partners requested more and more data from the cluster – which required developer time – We are now using Datameer as a business-user reporting and query front-end to the cluster – Developer for Hadoop, runs efficiently, flexible spreadsheet interface with dashboards 13 We are in a much different place now than when we started our Hadoop journey.
  • 14. 14 The LearningHADOOP  We can dramatically reduce batch processing times for mainframe and EDW  We can retain and analyze data at a much more granular level, with longer history  Hadoop must be part of an overall solution and eco-system IMPLEMENTATION  We can reliably meet our production deliverable time-windows by using Hadoop  We can largely eliminate the use of traditional ETL tools  New Tools allow improved user experience on very large data sets  We developed tools and skills – The learning curve is not to be underestimated  We developed experience in moving workload from expensive, proprietary mainframe and EDW platforms to Hadoop with spectacular results UNIQUEVALUE Over three years of experience using Hadoop for enterprise legacy workload.
  • 15. Thank You! For further information email: visit: contact@metascale.com www.metascale.com