
Transforming Data Architecture Complexity at Sears - StampedeCon 2013

At the StampedeCon 2013 Big Data conference in St. Louis, Justin Sheppard discussed Transforming Data Architecture Complexity at Sears. High ETL complexity and costs, data latency and redundancy, and batch window limits are just some of the IT challenges caused by traditional data warehouses. Gain an understanding of big data tools through the use cases and technology that enable Sears to solve the problems of the traditional enterprise data warehouse approach. Learn how Sears uses Hadoop as a data hub to minimize data architecture complexity – resulting in a reduction of time to insight by 30-70% – and discover “quick wins” such as mainframe MIPS reduction.

Transforming Data Architecture Complexity at Sears
Justin Sheppard
Sears Holdings Corporation
  
Where Did We Start?
•  Not meeting production schedules
•  Multiple copies of data, no single point of truth
•  ETL complexity, cost of software and cost to manage
•  Time to set up ETL data sources for projects
•  Latency in data (up to weeks in some cases)
•  Enterprise Data Warehouses unable to handle load
•  Mainframe workload over-consuming capacity
•  IT Budgets not growing – BUT data volumes escalating
  
What Is Hadoop?

Hadoop Distributed File System (HDFS)
File Sharing & Data Protection Across Physical Servers

MapReduce
Fault Tolerant Distributed Computing Across Physical Servers

Flexibility
o  A single repository for storing, processing & analyzing any type of data (structured and complex)
o  Not bound by a single schema

Scalability
o  Scale-out architecture divides workloads across multiple nodes
o  Flexible file system eliminates ETL bottlenecks

Low Cost
o  Can be deployed on commodity hardware
o  Open source platform guards against vendor lock

Hadoop is a platform for data storage and processing that is…
o  Scalable
o  Fault tolerant
o  Open source
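To make the MapReduce idea on this slide concrete, here is a minimal sketch that imitates the map, shuffle, and reduce phases in a single Python process. It does not use any Hadoop API and is not taken from the talk; the record layout (`store_id,amount`) and the names `RECORDS`, `map_record`, `reduce_store`, and `run_job` are assumptions made up for illustration. On a real cluster the framework would run the map and reduce functions in parallel across servers and rerun failed tasks.

```python
from collections import defaultdict
from typing import Iterable, Iterator

# Hypothetical input: one "store_id,amount" record per line.
RECORDS = [
    "store_a,10.00",
    "store_b,5.50",
    "store_a,2.25",
]


def map_record(line: str) -> Iterator[tuple[str, float]]:
    """Map phase: emit one (key, value) pair per input record."""
    store_id, amount = line.split(",")
    yield store_id, float(amount)


def reduce_store(store_id: str, amounts: Iterable[float]) -> tuple[str, float]:
    """Reduce phase: combine all values that share a key."""
    return store_id, sum(amounts)


def run_job(lines, mapper, reducer):
    """Run mapper and reducer locally, grouping by key in between."""
    # Shuffle step: collect every mapped value under its key.
    groups = defaultdict(list)
    for line in lines:
        for key, value in mapper(line):
            groups[key].append(value)
    # Reduce each key's group independently of the others.
    return dict(reducer(key, values) for key, values in sorted(groups.items()))


totals = run_job(RECORDS, map_record, reduce_store)
print(totals)
```

Because each key is reduced independently, the same two functions could in principle be spread over many nodes, which is the scale-out property the slide describes.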
  
Hadoop IS
•  Store vast amounts of data
•  Run queries on huge data sets
•  Ask questions previously impossible
•  Archive data but still analyze it
•  Capture data streams at incredible speeds
•  Massively reduce data latency
•  Transform your thinking about ETL

Hadoop Is Not
•  High-speed SQL database
•  Simple
•  Easily connected to legacy systems
•  A replacement for your current data warehouse
•  Going to be built or operated by your DBAs
•  Going to make any sense to your data architects
•  Going to be possible if you do not have Linux skills
  
Use The Right Tool For The Right Job

Hadoop: When to use?
•  Affordable Storage/Compute
•  High-performance queries on large data
•  Complex data
•  Resilient Auto Scalability

Databases: When to use?
•  Transactional, High Speed Analytics
•  Interactive Reporting (<1sec)
•  Multi-step Transactions
•  Numerous Inserts/Updates/Deletes

Can be combined

Use The Right Tool For The Right Job
[Diagram: Hadoop, Database]
Ad

Recommended

Justin Sheppard & Ankur Gupta from Sears Holdings Corporation - Single point ...
Justin Sheppard & Ankur Gupta from Sears Holdings Corporation - Single point ...Justin Sheppard & Ankur Gupta from Sears Holdings Corporation - Single point ...
Justin Sheppard & Ankur Gupta from Sears Holdings Corporation - Single point ...Global Business Events
 
Combining Machine Learning frameworks with Apache Spark
Combining Machine Learning frameworks with Apache SparkCombining Machine Learning frameworks with Apache Spark
Combining Machine Learning frameworks with Apache SparkDataWorks Summit/Hadoop Summit
 
SQL Server Konferenz 2014 - SSIS & HDInsight
SQL Server Konferenz 2014 - SSIS & HDInsightSQL Server Konferenz 2014 - SSIS & HDInsight
SQL Server Konferenz 2014 - SSIS & HDInsightTillmann Eitelberg
 
Preventative Maintenance of Robots in Automotive Industry
Preventative Maintenance of Robots in Automotive IndustryPreventative Maintenance of Robots in Automotive Industry
Preventative Maintenance of Robots in Automotive IndustryDataWorks Summit/Hadoop Summit
 
Big Data and Hadoop - History, Technical Deep Dive, and Industry Trends
Big Data and Hadoop - History, Technical Deep Dive, and Industry TrendsBig Data and Hadoop - History, Technical Deep Dive, and Industry Trends
Big Data and Hadoop - History, Technical Deep Dive, and Industry TrendsEsther Kundin
 
Real-time Hadoop: The Ideal Messaging System for Hadoop
Real-time Hadoop: The Ideal Messaging System for Hadoop Real-time Hadoop: The Ideal Messaging System for Hadoop
Real-time Hadoop: The Ideal Messaging System for Hadoop DataWorks Summit/Hadoop Summit
 
Hdfs 2016-hadoop-summit-san-jose-v4
Hdfs 2016-hadoop-summit-san-jose-v4Hdfs 2016-hadoop-summit-san-jose-v4
Hdfs 2016-hadoop-summit-san-jose-v4Chris Nauroth
 

More Related Content

What's hot

Apache Hadoop YARN: Present and Future
Apache Hadoop YARN: Present and FutureApache Hadoop YARN: Present and Future
Apache Hadoop YARN: Present and FutureDataWorks Summit
 
Lessons learned from running Spark on Docker
Lessons learned from running Spark on DockerLessons learned from running Spark on Docker
Lessons learned from running Spark on DockerDataWorks Summit
 
Stinger Initiative - Deep Dive
Stinger Initiative - Deep DiveStinger Initiative - Deep Dive
Stinger Initiative - Deep DiveHortonworks
 
Operationalizing Data Science Using Cloud Foundry
Operationalizing Data Science Using Cloud FoundryOperationalizing Data Science Using Cloud Foundry
Operationalizing Data Science Using Cloud FoundryVMware Tanzu
 
SQL Engines for Hadoop - The case for Impala
SQL Engines for Hadoop - The case for ImpalaSQL Engines for Hadoop - The case for Impala
SQL Engines for Hadoop - The case for Impalamarkgrover
 
Non-Stop Hadoop for Hortonworks
Non-Stop Hadoop for Hortonworks Non-Stop Hadoop for Hortonworks
Non-Stop Hadoop for Hortonworks Hortonworks
 
Graphene – Microsoft SCOPE on Tez
Graphene – Microsoft SCOPE on Tez Graphene – Microsoft SCOPE on Tez
Graphene – Microsoft SCOPE on Tez DataWorks Summit
 
Practice of large Hadoop cluster in China Mobile
Practice of large Hadoop cluster in China MobilePractice of large Hadoop cluster in China Mobile
Practice of large Hadoop cluster in China MobileDataWorks Summit
 
Solving Hadoop Replication Challenges with an Active-Active Paxos Algorithm
Solving Hadoop Replication Challenges with an Active-Active Paxos AlgorithmSolving Hadoop Replication Challenges with an Active-Active Paxos Algorithm
Solving Hadoop Replication Challenges with an Active-Active Paxos AlgorithmDataWorks Summit
 
Back to School - St. Louis Hadoop Meetup September 2016
Back to School - St. Louis Hadoop Meetup September 2016Back to School - St. Louis Hadoop Meetup September 2016
Back to School - St. Louis Hadoop Meetup September 2016Adam Doyle
 
Hortonworks Technical Workshop - Operational Best Practices Workshop
Hortonworks Technical Workshop - Operational Best Practices WorkshopHortonworks Technical Workshop - Operational Best Practices Workshop
Hortonworks Technical Workshop - Operational Best Practices WorkshopHortonworks
 
Flexible and Real-Time Stream Processing with Apache Flink
Flexible and Real-Time Stream Processing with Apache FlinkFlexible and Real-Time Stream Processing with Apache Flink
Flexible and Real-Time Stream Processing with Apache FlinkDataWorks Summit
 
Apache Tez : Accelerating Hadoop Query Processing
Apache Tez : Accelerating Hadoop Query ProcessingApache Tez : Accelerating Hadoop Query Processing
Apache Tez : Accelerating Hadoop Query ProcessingBikas Saha
 
Pig on Tez: Low Latency Data Processing with Big Data
Pig on Tez: Low Latency Data Processing with Big DataPig on Tez: Low Latency Data Processing with Big Data
Pig on Tez: Low Latency Data Processing with Big DataDataWorks Summit
 

What's hot (20)

Apache Hadoop YARN: Present and Future
Apache Hadoop YARN: Present and FutureApache Hadoop YARN: Present and Future
Apache Hadoop YARN: Present and Future
 
Lessons learned from running Spark on Docker
Lessons learned from running Spark on DockerLessons learned from running Spark on Docker
Lessons learned from running Spark on Docker
 
Stinger Initiative - Deep Dive
Stinger Initiative - Deep DiveStinger Initiative - Deep Dive
Stinger Initiative - Deep Dive
 
Apache Hadoop 3.0 What's new in YARN and MapReduce
Apache Hadoop 3.0 What's new in YARN and MapReduceApache Hadoop 3.0 What's new in YARN and MapReduce
Apache Hadoop 3.0 What's new in YARN and MapReduce
 
NoSQL Needs SomeSQL
NoSQL Needs SomeSQLNoSQL Needs SomeSQL
NoSQL Needs SomeSQL
 
Hadoop 3 in a Nutshell
Hadoop 3 in a NutshellHadoop 3 in a Nutshell
Hadoop 3 in a Nutshell
 
Operationalizing Data Science Using Cloud Foundry
Operationalizing Data Science Using Cloud FoundryOperationalizing Data Science Using Cloud Foundry
Operationalizing Data Science Using Cloud Foundry
 
SQL Engines for Hadoop - The case for Impala
SQL Engines for Hadoop - The case for ImpalaSQL Engines for Hadoop - The case for Impala
SQL Engines for Hadoop - The case for Impala
 
Non-Stop Hadoop for Hortonworks
Non-Stop Hadoop for Hortonworks Non-Stop Hadoop for Hortonworks
Non-Stop Hadoop for Hortonworks
 
Deep Learning using Spark and DL4J for fun and profit
Deep Learning using Spark and DL4J for fun and profitDeep Learning using Spark and DL4J for fun and profit
Deep Learning using Spark and DL4J for fun and profit
 
Graphene – Microsoft SCOPE on Tez
Graphene – Microsoft SCOPE on Tez Graphene – Microsoft SCOPE on Tez
Graphene – Microsoft SCOPE on Tez
 
Practice of large Hadoop cluster in China Mobile
Practice of large Hadoop cluster in China MobilePractice of large Hadoop cluster in China Mobile
Practice of large Hadoop cluster in China Mobile
 
Solving Hadoop Replication Challenges with an Active-Active Paxos Algorithm
Solving Hadoop Replication Challenges with an Active-Active Paxos AlgorithmSolving Hadoop Replication Challenges with an Active-Active Paxos Algorithm
Solving Hadoop Replication Challenges with an Active-Active Paxos Algorithm
 
YARN Federation
YARN Federation YARN Federation
YARN Federation
 
Back to School - St. Louis Hadoop Meetup September 2016
Back to School - St. Louis Hadoop Meetup September 2016Back to School - St. Louis Hadoop Meetup September 2016
Back to School - St. Louis Hadoop Meetup September 2016
 
Time-oriented event search. A new level of scale
Time-oriented event search. A new level of scale Time-oriented event search. A new level of scale
Time-oriented event search. A new level of scale
 
Hortonworks Technical Workshop - Operational Best Practices Workshop
Hortonworks Technical Workshop - Operational Best Practices WorkshopHortonworks Technical Workshop - Operational Best Practices Workshop
Hortonworks Technical Workshop - Operational Best Practices Workshop
 
Flexible and Real-Time Stream Processing with Apache Flink
Flexible and Real-Time Stream Processing with Apache FlinkFlexible and Real-Time Stream Processing with Apache Flink
Flexible and Real-Time Stream Processing with Apache Flink
 
Apache Tez : Accelerating Hadoop Query Processing
Apache Tez : Accelerating Hadoop Query ProcessingApache Tez : Accelerating Hadoop Query Processing
Apache Tez : Accelerating Hadoop Query Processing
 
Pig on Tez: Low Latency Data Processing with Big Data
Pig on Tez: Low Latency Data Processing with Big DataPig on Tez: Low Latency Data Processing with Big Data
Pig on Tez: Low Latency Data Processing with Big Data
 

Viewers also liked

Sears Holdings Corp.
Sears Holdings Corp.Sears Holdings Corp.
Sears Holdings Corp.msg14
 
BlogWell San Francisco Case Study: Sears Holdings Corporation, presented by J...
BlogWell San Francisco Case Study: Sears Holdings Corporation, presented by J...BlogWell San Francisco Case Study: Sears Holdings Corporation, presented by J...
BlogWell San Francisco Case Study: Sears Holdings Corporation, presented by J...SocialMedia.org
 
Best practices in outsourcing : The case of Sears Holdings
Best practices in outsourcing : The case of Sears HoldingsBest practices in outsourcing : The case of Sears Holdings
Best practices in outsourcing : The case of Sears HoldingsAlok Kumar
 
The 3 T's - Using Hadoop to modernize with faster access to data and value
The 3 T's - Using Hadoop to modernize with faster access to data and valueThe 3 T's - Using Hadoop to modernize with faster access to data and value
The 3 T's - Using Hadoop to modernize with faster access to data and valueDataWorks Summit
 
Hadoop in the Enterprise: Legacy Rides the Elephant
Hadoop in the Enterprise: Legacy Rides the ElephantHadoop in the Enterprise: Legacy Rides the Elephant
Hadoop in the Enterprise: Legacy Rides the ElephantDataWorks Summit
 
Sears Holdings Corporation
Sears Holdings CorporationSears Holdings Corporation
Sears Holdings CorporationSam Hudson
 
Organization And Management Kmart
Organization And Management KmartOrganization And Management Kmart
Organization And Management Kmartguest634b8da
 
Eliminating the Challenges of Big Data Management Inside Hadoop
Eliminating the Challenges of Big Data Management Inside HadoopEliminating the Challenges of Big Data Management Inside Hadoop
Eliminating the Challenges of Big Data Management Inside HadoopHortonworks
 
Big Data Business Wins: Real-time Inventory Tracking with Hadoop
Big Data Business Wins: Real-time Inventory Tracking with HadoopBig Data Business Wins: Real-time Inventory Tracking with Hadoop
Big Data Business Wins: Real-time Inventory Tracking with HadoopDataWorks Summit
 
Complement Your Existing Data Warehouse with Big Data & Hadoop
Complement Your Existing Data Warehouse with Big Data & HadoopComplement Your Existing Data Warehouse with Big Data & Hadoop
Complement Your Existing Data Warehouse with Big Data & HadoopDatameer
 
Strategy analysis Target vs. Kmart
Strategy analysis Target vs. KmartStrategy analysis Target vs. Kmart
Strategy analysis Target vs. KmartDan Saguy
 
E-Business transformation-Sears Case Study
E-Business transformation-Sears Case StudyE-Business transformation-Sears Case Study
E-Business transformation-Sears Case StudyDanny D. Kosasih
 
Big Data 2.0: Hadoop as part of a Near-Real-Time Integrated Data Era
Big Data 2.0: Hadoop as part of a Near-Real-Time Integrated Data EraBig Data 2.0: Hadoop as part of a Near-Real-Time Integrated Data Era
Big Data 2.0: Hadoop as part of a Near-Real-Time Integrated Data EraDataWorks Summit
 
Sears Hometown Store Overview
Sears Hometown Store OverviewSears Hometown Store Overview
Sears Hometown Store Overviewctodd001
 
Strategy recommendation for Sears
Strategy recommendation for SearsStrategy recommendation for Sears
Strategy recommendation for SearsDev Anumolu
 
Big Data for the Retail Business I Swan Insights I Solvay Business School
Big Data for the Retail Business I Swan Insights I Solvay Business SchoolBig Data for the Retail Business I Swan Insights I Solvay Business School
Big Data for the Retail Business I Swan Insights I Solvay Business SchoolLaurent Kinet
 

Viewers also liked (20)

Kmart
KmartKmart
Kmart
 
Sears Holdings Corp.
Sears Holdings Corp.Sears Holdings Corp.
Sears Holdings Corp.
 
BlogWell San Francisco Case Study: Sears Holdings Corporation, presented by J...
BlogWell San Francisco Case Study: Sears Holdings Corporation, presented by J...BlogWell San Francisco Case Study: Sears Holdings Corporation, presented by J...
BlogWell San Francisco Case Study: Sears Holdings Corporation, presented by J...
 
Best practices in outsourcing : The case of Sears Holdings
Best practices in outsourcing : The case of Sears HoldingsBest practices in outsourcing : The case of Sears Holdings
Best practices in outsourcing : The case of Sears Holdings
 
The 3 T's - Using Hadoop to modernize with faster access to data and value
The 3 T's - Using Hadoop to modernize with faster access to data and valueThe 3 T's - Using Hadoop to modernize with faster access to data and value
The 3 T's - Using Hadoop to modernize with faster access to data and value
 
Hadoop in the Enterprise: Legacy Rides the Elephant
Hadoop in the Enterprise: Legacy Rides the ElephantHadoop in the Enterprise: Legacy Rides the Elephant
Hadoop in the Enterprise: Legacy Rides the Elephant
 
Sears Holdings Corporation
Sears Holdings CorporationSears Holdings Corporation
Sears Holdings Corporation
 
Kmart pp2
Kmart pp2Kmart pp2
Kmart pp2
 
Organization And Management Kmart
Organization And Management KmartOrganization And Management Kmart
Organization And Management Kmart
 
Eliminating the Challenges of Big Data Management Inside Hadoop
Eliminating the Challenges of Big Data Management Inside HadoopEliminating the Challenges of Big Data Management Inside Hadoop
Eliminating the Challenges of Big Data Management Inside Hadoop
 
Big Data Business Wins: Real-time Inventory Tracking with Hadoop
Big Data Business Wins: Real-time Inventory Tracking with HadoopBig Data Business Wins: Real-time Inventory Tracking with Hadoop
Big Data Business Wins: Real-time Inventory Tracking with Hadoop
 
Complement Your Existing Data Warehouse with Big Data & Hadoop
Complement Your Existing Data Warehouse with Big Data & HadoopComplement Your Existing Data Warehouse with Big Data & Hadoop
Complement Your Existing Data Warehouse with Big Data & Hadoop
 
Strategy analysis Target vs. Kmart
Strategy analysis Target vs. KmartStrategy analysis Target vs. Kmart
Strategy analysis Target vs. Kmart
 
E-Business transformation-Sears Case Study
E-Business transformation-Sears Case StudyE-Business transformation-Sears Case Study
E-Business transformation-Sears Case Study
 
Big Data 2.0: Hadoop as part of a Near-Real-Time Integrated Data Era
Big Data 2.0: Hadoop as part of a Near-Real-Time Integrated Data EraBig Data 2.0: Hadoop as part of a Near-Real-Time Integrated Data Era
Big Data 2.0: Hadoop as part of a Near-Real-Time Integrated Data Era
 
Sears Hometown Store Overview
Sears Hometown Store OverviewSears Hometown Store Overview
Sears Hometown Store Overview
 
Sears Final Project
Sears Final ProjectSears Final Project
Sears Final Project
 
Case Study: Sears
Case Study: SearsCase Study: Sears
Case Study: Sears
 
Strategy recommendation for Sears
Strategy recommendation for SearsStrategy recommendation for Sears
Strategy recommendation for Sears
 
Big Data for the Retail Business I Swan Insights I Solvay Business School
Big Data for the Retail Business I Swan Insights I Solvay Business SchoolBig Data for the Retail Business I Swan Insights I Solvay Business School
Big Data for the Retail Business I Swan Insights I Solvay Business School
 

Similar to Transforming Data Architecture Complexity at Sears - StampedeCon 2013

Meta scale kognitio hadoop webinar
Meta scale kognitio hadoop webinarMeta scale kognitio hadoop webinar
Meta scale kognitio hadoop webinarMichael Hiskey
 
Rapid Cluster Computing with Apache Spark 2016
Rapid Cluster Computing with Apache Spark 2016Rapid Cluster Computing with Apache Spark 2016
Rapid Cluster Computing with Apache Spark 2016Zohar Elkayam
 
Gluent Extending Enterprise Applications with Hadoop
Gluent Extending Enterprise Applications with HadoopGluent Extending Enterprise Applications with Hadoop
Gluent Extending Enterprise Applications with Hadoopgluent.
 
5 Things that Make Hadoop a Game Changer
5 Things that Make Hadoop a Game Changer5 Things that Make Hadoop a Game Changer
5 Things that Make Hadoop a Game ChangerCaserta
 
Arun Rathinasabapathy, Senior Software Engineer, LexisNexis at MLconf ATL 2016
Arun Rathinasabapathy, Senior Software Engineer, LexisNexis at MLconf ATL 2016Arun Rathinasabapathy, Senior Software Engineer, LexisNexis at MLconf ATL 2016
Arun Rathinasabapathy, Senior Software Engineer, LexisNexis at MLconf ATL 2016MLconf
 
Hadoop and SQL: Delivery Analytics Across the Organization
Hadoop and SQL:  Delivery Analytics Across the OrganizationHadoop and SQL:  Delivery Analytics Across the Organization
Hadoop and SQL: Delivery Analytics Across the OrganizationSeeling Cheung
 
How the Development Bank of Singapore solves on-prem compute capacity challen...
How the Development Bank of Singapore solves on-prem compute capacity challen...How the Development Bank of Singapore solves on-prem compute capacity challen...
How the Development Bank of Singapore solves on-prem compute capacity challen...Alluxio, Inc.
 
Building Scalable Big Data Infrastructure Using Open Source Software Presenta...
Building Scalable Big Data Infrastructure Using Open Source Software Presenta...Building Scalable Big Data Infrastructure Using Open Source Software Presenta...
Building Scalable Big Data Infrastructure Using Open Source Software Presenta...ssuserd3a367
 
Simple, Modular and Extensible Big Data Platform Concept
Simple, Modular and Extensible Big Data Platform ConceptSimple, Modular and Extensible Big Data Platform Concept
Simple, Modular and Extensible Big Data Platform ConceptSatish Mohan
 
Engineering Machine Learning Data Pipelines Series: Streaming New Data as It ...
Engineering Machine Learning Data Pipelines Series: Streaming New Data as It ...Engineering Machine Learning Data Pipelines Series: Streaming New Data as It ...
Engineering Machine Learning Data Pipelines Series: Streaming New Data as It ...Precisely
 
Scaling ETL with Hadoop - Avoiding Failure
Scaling ETL with Hadoop - Avoiding FailureScaling ETL with Hadoop - Avoiding Failure
Scaling ETL with Hadoop - Avoiding FailureGwen (Chen) Shapira
 
Impala use case @ edge
Impala use case @ edgeImpala use case @ edge
Impala use case @ edgeRam Kedem
 
Meta scale kognitio hadoop webinar
Meta scale kognitio hadoop webinarMeta scale kognitio hadoop webinar
Meta scale kognitio hadoop webinarKognitio
 
Enabling big data & AI workloads on the object store at DBS
Enabling big data & AI workloads on the object store at DBS Enabling big data & AI workloads on the object store at DBS
Enabling big data & AI workloads on the object store at DBS Alluxio, Inc.
 
Deliver Best-in-Class HPC Cloud Solutions Without Losing Your Mind
Deliver Best-in-Class HPC Cloud Solutions Without Losing Your MindDeliver Best-in-Class HPC Cloud Solutions Without Losing Your Mind
Deliver Best-in-Class HPC Cloud Solutions Without Losing Your MindAvere Systems
 
Introduction to Impala
Introduction to ImpalaIntroduction to Impala
Introduction to Impalamarkgrover
 
Foxvalley bigdata
Foxvalley bigdataFoxvalley bigdata
Foxvalley bigdataTom Rogers
 
Technologies for Data Analytics Platform
Technologies for Data Analytics PlatformTechnologies for Data Analytics Platform
Technologies for Data Analytics PlatformN Masahiro
 

Similar to Transforming Data Architecture Complexity at Sears - StampedeCon 2013 (20)

50 Shades of SQL
50 Shades of SQL50 Shades of SQL
50 Shades of SQL
 
Meta scale kognitio hadoop webinar
Meta scale kognitio hadoop webinarMeta scale kognitio hadoop webinar
Meta scale kognitio hadoop webinar
 
Rapid Cluster Computing with Apache Spark 2016
Rapid Cluster Computing with Apache Spark 2016Rapid Cluster Computing with Apache Spark 2016
Rapid Cluster Computing with Apache Spark 2016
 
Gluent Extending Enterprise Applications with Hadoop
Gluent Extending Enterprise Applications with HadoopGluent Extending Enterprise Applications with Hadoop
Gluent Extending Enterprise Applications with Hadoop
 
5 Things that Make Hadoop a Game Changer
5 Things that Make Hadoop a Game Changer5 Things that Make Hadoop a Game Changer
5 Things that Make Hadoop a Game Changer
 
Arun Rathinasabapathy, Senior Software Engineer, LexisNexis at MLconf ATL 2016
Arun Rathinasabapathy, Senior Software Engineer, LexisNexis at MLconf ATL 2016Arun Rathinasabapathy, Senior Software Engineer, LexisNexis at MLconf ATL 2016
Arun Rathinasabapathy, Senior Software Engineer, LexisNexis at MLconf ATL 2016
 
Hadoop and SQL: Delivery Analytics Across the Organization
Hadoop and SQL:  Delivery Analytics Across the OrganizationHadoop and SQL:  Delivery Analytics Across the Organization
Hadoop and SQL: Delivery Analytics Across the Organization
 
How the Development Bank of Singapore solves on-prem compute capacity challen...
How the Development Bank of Singapore solves on-prem compute capacity challen...How the Development Bank of Singapore solves on-prem compute capacity challen...
How the Development Bank of Singapore solves on-prem compute capacity challen...
 
Building Scalable Big Data Infrastructure Using Open Source Software Presenta...
Building Scalable Big Data Infrastructure Using Open Source Software Presenta...Building Scalable Big Data Infrastructure Using Open Source Software Presenta...
Building Scalable Big Data Infrastructure Using Open Source Software Presenta...
 
Simple, Modular and Extensible Big Data Platform Concept
Simple, Modular and Extensible Big Data Platform ConceptSimple, Modular and Extensible Big Data Platform Concept
Simple, Modular and Extensible Big Data Platform Concept
 
Engineering Machine Learning Data Pipelines Series: Streaming New Data as It ...
Engineering Machine Learning Data Pipelines Series: Streaming New Data as It ...Engineering Machine Learning Data Pipelines Series: Streaming New Data as It ...
Engineering Machine Learning Data Pipelines Series: Streaming New Data as It ...
 
Scaling ETL with Hadoop - Avoiding Failure
Scaling ETL with Hadoop - Avoiding FailureScaling ETL with Hadoop - Avoiding Failure
Scaling ETL with Hadoop - Avoiding Failure
 
Impala use case @ edge
Impala use case @ edgeImpala use case @ edge
Impala use case @ edge
 
Meta scale kognitio hadoop webinar
Meta scale kognitio hadoop webinarMeta scale kognitio hadoop webinar
Meta scale kognitio hadoop webinar
 
Enabling big data & AI workloads on the object store at DBS
Enabling big data & AI workloads on the object store at DBS Enabling big data & AI workloads on the object store at DBS
Enabling big data & AI workloads on the object store at DBS
 
Deliver Best-in-Class HPC Cloud Solutions Without Losing Your Mind
Deliver Best-in-Class HPC Cloud Solutions Without Losing Your MindDeliver Best-in-Class HPC Cloud Solutions Without Losing Your Mind
Deliver Best-in-Class HPC Cloud Solutions Without Losing Your Mind
 
Hadoop ppt1
Hadoop ppt1Hadoop ppt1
Hadoop ppt1
 
Introduction to Impala
Introduction to ImpalaIntroduction to Impala
Introduction to Impala
 
Foxvalley bigdata
Foxvalley bigdataFoxvalley bigdata
Foxvalley bigdata
 
Technologies for Data Analytics Platform
Technologies for Data Analytics PlatformTechnologies for Data Analytics Platform
Technologies for Data Analytics Platform
 

More from StampedeCon

Why Should We Trust You-Interpretability of Deep Neural Networks - StampedeCo...
Why Should We Trust You-Interpretability of Deep Neural Networks - StampedeCo...Why Should We Trust You-Interpretability of Deep Neural Networks - StampedeCo...
Why Should We Trust You-Interpretability of Deep Neural Networks - StampedeCo...StampedeCon
 
The Search for a New Visual Search Beyond Language - StampedeCon AI Summit 2017
The Search for a New Visual Search Beyond Language - StampedeCon AI Summit 2017The Search for a New Visual Search Beyond Language - StampedeCon AI Summit 2017
The Search for a New Visual Search Beyond Language - StampedeCon AI Summit 2017StampedeCon
 
Predicting Outcomes When Your Outcomes are Graphs - StampedeCon AI Summit 2017
Predicting Outcomes When Your Outcomes are Graphs - StampedeCon AI Summit 2017Predicting Outcomes When Your Outcomes are Graphs - StampedeCon AI Summit 2017
Predicting Outcomes When Your Outcomes are Graphs - StampedeCon AI Summit 2017StampedeCon
 
Novel Semi-supervised Probabilistic ML Approach to SNP Variant Calling - Stam...
Novel Semi-supervised Probabilistic ML Approach to SNP Variant Calling - Stam...Novel Semi-supervised Probabilistic ML Approach to SNP Variant Calling - Stam...
Novel Semi-supervised Probabilistic ML Approach to SNP Variant Calling - Stam...StampedeCon
 
How to Talk about AI to Non-analaysts - Stampedecon AI Summit 2017
How to Talk about AI to Non-analaysts - Stampedecon AI Summit 2017How to Talk about AI to Non-analaysts - Stampedecon AI Summit 2017
How to Talk about AI to Non-analaysts - Stampedecon AI Summit 2017StampedeCon
 
Getting Started with Keras and TensorFlow - StampedeCon AI Summit 2017
Getting Started with Keras and TensorFlow - StampedeCon AI Summit 2017Getting Started with Keras and TensorFlow - StampedeCon AI Summit 2017
Getting Started with Keras and TensorFlow - StampedeCon AI Summit 2017StampedeCon
 
Foundations of Machine Learning - StampedeCon AI Summit 2017
Foundations of Machine Learning - StampedeCon AI Summit 2017Foundations of Machine Learning - StampedeCon AI Summit 2017
Foundations of Machine Learning - StampedeCon AI Summit 2017StampedeCon
 
Don't Start from Scratch: Transfer Learning for Novel Computer Vision Problem...
Don't Start from Scratch: Transfer Learning for Novel Computer Vision Problem...Don't Start from Scratch: Transfer Learning for Novel Computer Vision Problem...
Don't Start from Scratch: Transfer Learning for Novel Computer Vision Problem...StampedeCon
 
Bringing the Whole Elephant Into View Can Cognitive Systems Bring Real Soluti...
Transforming Data Architecture Complexity at Sears - StampedeCon 2013

  • 1. Transforming Data Architecture Complexity at Sears
      Justin Sheppard, Sears Holdings Corporation
  • 2. Where Did We Start?
      • Not meeting production schedules
      • Multiple copies of data, no single point of truth
      • ETL complexity, cost of software and cost to manage
      • Time to set up ETL data sources for projects
      • Latency in data (up to weeks in some cases)
      • Enterprise Data Warehouses unable to handle load
      • Mainframe workload over-consuming capacity
      • IT budgets not growing, BUT data volumes escalating
  • 3. What Is Hadoop?
      Hadoop is a platform for data storage and processing that is…
      • Scalable
      • Fault tolerant
      • Open source
      Hadoop Distributed File System (HDFS)
      • File sharing & data protection across physical servers
      MapReduce
      • Fault-tolerant distributed computing across physical servers
      Flexibility
      • A single repository for storing, processing & analyzing any type of data (structured and complex)
      • Not bound by a single schema
      Scalability
      • Scale-out architecture divides workloads across multiple nodes
      • Flexible file system eliminates ETL bottlenecks
      Low Cost
      • Can be deployed on commodity hardware
      • Open source platform guards against vendor lock
  • 4. Hadoop
      IS
      • Store vast amounts of data
      • Run queries on huge data sets
      • Ask questions previously impossible
      • Archive data but still analyze it
      • Capture data streams at incredible speeds
      • Massively reduce data latency
      • Transform your thinking about ETL
      Is Not
      • High-speed SQL database
      • Simple
      • Easily connected to legacy systems
      • A replacement for your current data warehouse
      • Going to be built or operated by your DBAs
      • Going to make any sense to your data architects
      • Going to be possible if you do not have Linux skills
  • 5. Use The Right Tool For The Right Job
      Hadoop: when to use?
      • Affordable storage/compute
      • High-performance queries on large data
      • Complex data
      • Resilient auto scalability
      Databases: when to use?
      • Transactional, high-speed analytics
      • Interactive reporting (<1 sec)
      • Multi-step transactions
      • Numerous inserts/updates/deletes
      Can be combined
  • 6. Use The Right Tool For The Right Job (diagram: Hadoop, Database)
  • 7. Data Hub
      • Underlying premise as Hadoop adoption continues: source data once, use many times.
      • Over time, as more and more data is sourced, development times will fall, since the data sourcing effort is significantly less than typical.
  • 8. Some Examples
      Use cases at Sears Holdings
  • 9. The First Usage in Production
      Use Case
      • An interactive presentation layer was required to present item/price/sales data in a highly flexible user interface with rapid response time
      • The solution had to be delivered within a very short period of time
      • The legacy architecture would have required a MicroStrategy solution utilizing 1,000s of cubes on many expensive servers
      Approach
      • Rapid development project initiated to present item/price/sales data in a highly flexible user interface with rapid response time
      • Built the system from the ground up
      • Migrated all required data to a centralized HDFS repository from legacy databases
      • Developed MapReduce code to process daily data files into 4 primary data tables
      • Tables extracted to a service layer (MySQL/Infobright) for presentation through the Pricing Portal
      Results
      • File preparation completes in minutes each day and ensures portal data is ready very soon after daily sales processing completes (100K records daily)
      • This was the first production usage of MapReduce and associated technologies; the project was initiated in March and went live on May 9 (<10 weeks from concept to realization)
      Technologies Used
      • Hadoop, Hive, MapReduce, MySQL, Infobright, Linux, REST web service, DotNetNuke
      Learning experience for all parties; successfully demonstrated platform abilities in a production environment, but we would NOT do it this way again…
  • 10. Mainframe Migration
      (Diagram: Sources 1-4 feed Steps 1-5, which produce the Output. A second version of the diagram repeats Steps 4 and 5 and marks two items with an X.)
      • As our experience with Hadoop increased, hypotheses were formed that the technology could aid SHC's mainframe migration initiative.
      • The diagram represents a simple mainframe process.
      • Migrated sections of mainframe processing, including data transfer to Hadoop and back, eliminating MIPS and IMPROVING overall cycle time
  • 11. ETL Replacement
      • A major ongoing system effort in our Marketing department was heavily reliant on DataStage processing for ETL
          – In the early stages of deployment, the ETL platform performed within acceptable limits
          – As volume increased, the system began to have performance issues as the ETL platform degraded
          – With full rollout imminent, the options were to invest heavily in additional hardware or to re-work the CPU-intensive portions in Hadoop
      • Experience with mainframe migration evolved into ETL replacement.
      • SHC successfully demonstrated reducing load on costly ETL software with Pig scripts (with data movement from / to the ETL platform as an intermediate step).
      • AND with improved processing time…
  • 12. The Journey
      • From legacy (> 1000 lines) to Ruby / MapReduce (400 lines)
          – Cryptic code, difficult to support, difficult to train
      • We tried Hive (~400 lines, SQL-like abstraction)
          – Easy to use, easy to experiment and test with
          – Poor performance, difficult to implement business logic
      • We evolved to Pig with Java UDF extensions
          – Compressed, very efficient, easy to code / read (~200 lines)
          – Demonstrated success in transforming mainframe developers into Pig developers in under 2 weeks
      • As we progressed, our business partners requested more and more data from the cluster, which required developer time
          – We are now using Datameer as a business-user reporting and query front-end to the cluster
          – Developed for Hadoop, runs efficiently, flexible spreadsheet interface with dashboards
      We are in a much different place now than when we started our Hadoop journey.
  • 13. The Learning
      HADOOP
      ✓ We can dramatically reduce batch processing times for mainframe and EDW
      ✓ We can retain and analyze data at a much more granular level, with longer history
      ✓ Hadoop must be part of an overall solution and eco-system
      IMPLEMENTATION
      ✓ We can reliably meet our production deliverable time-windows by using Hadoop
      ✓ We can largely eliminate the use of traditional ETL tools
      ✓ New tools allow improved user experience on very large data sets
      ✓ We developed tools and skills; the learning curve is not to be underestimated
      ✓ We developed experience in moving workload from expensive, proprietary mainframe and EDW platforms to Hadoop with spectacular results
      UNIQUE VALUE
      Over three years of experience using Hadoop for enterprise legacy workload.
  • 14. Thank You!
      For further information, email contact@metascale.com or visit www.metascale.com
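
Slide 9 mentions MapReduce code that turns daily data files into summary tables, but the deck does not show any of it. As a rough illustration of the map / shuffle / reduce pattern only, here is a minimal single-process sketch in plain Python. It is not the Sears implementation and does not use Hadoop; the record layout (item_id, units_sold, unit_price) and every function name are assumptions made up for this example.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple

# Hypothetical record layout for one line of a daily sales file:
# (item_id, units_sold, unit_price). Not taken from the slides.
SaleRecord = Tuple[str, int, float]
MappedValue = Tuple[int, float]


def map_sale(record: SaleRecord) -> Iterator[Tuple[str, MappedValue]]:
    """Map step: emit one (key, value) pair per input record."""
    item_id, units, unit_price = record
    yield item_id, (units, units * unit_price)


def shuffle(pairs: Iterable[Tuple[str, MappedValue]]) -> Dict[str, List[MappedValue]]:
    """Group mapped values by key, the job a framework does between map and reduce."""
    grouped: Dict[str, List[MappedValue]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_sales(item_id: str, values: List[MappedValue]) -> Tuple[str, int, float]:
    """Reduce step: collapse all values for one item into a single summary row."""
    total_units = sum(units for units, _ in values)
    total_revenue = sum(revenue for _, revenue in values)
    return item_id, total_units, round(total_revenue, 2)


def run_job(records: Iterable[SaleRecord]) -> List[Tuple[str, int, float]]:
    """Run map, shuffle and reduce in one process and return rows sorted by item."""
    mapped = (pair for record in records for pair in map_sale(record))
    return sorted(reduce_sales(key, vals) for key, vals in shuffle(mapped).items())


if __name__ == "__main__":
    daily_file = [("A100", 2, 9.99), ("B200", 1, 24.50), ("A100", 3, 9.99)]
    for row in run_job(daily_file):
        print(row)
```

On a real cluster the map and reduce functions would run in parallel across nodes and the framework would handle the grouping step; the sketch keeps everything in memory so the data flow is easy to follow.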