SlideShare a Scribd company logo
Addressing Big Data Challenges: The Hadoop Way 
Presented by: Atul Dambalkar
Agenda Big Data Challenges Big Data Analytics Industry Trends Hadoop as a Solution Real Life Solution Studies 
•Case Study I - Retail Industry 
•Case Study II - Online Advertising Industry How Xoriant can help? Q & A
Big Data Challenges 
THE FOUR V’s OF BIG DATA 
Source: IBM
Traditional Approach & Its Limitations 
Data Warehousing Vendors (ETL) 
Costs - High Initial Setup, 
Maintenance, Subscription 
or Licensing Fees 
No support for 
unstructured data 
Multiple copies of data in different formats 
Data latency and 
bottlenecks 
No support for ad-hoc query 
Note: The Logos are proprietary of the individual companies
Big Data and Analytics - Trends 
Enterprise Data Hub or Data Lake (Hadoop with HDFS) 
Commodity Hardware 
No multiple data copies 
Fault-Tolerant storage for Raw data As-Is 
Current Limitations - Write/Append only, No Delete or Update 
Unified Data Access 
Multiple Data Processing Paradigms 
In-memory processing 
In-memory, 
Real-time Stream processing 
Analytics based on Distributed SQL Processing 
Falling hardware prices 
Batch mode processing for data size more than Hundreds of TBs 
In-memory processing for data size less than hundreds of TBs 
ETL Trends 
Data 
Processing 
Trends 
Architecture 
Trends 
Open Source Software
Hadoop Proposition 
Open Source Ecosystem 
No data loss through replicated storage (HDFS) 
Runs on commodity hardware 
•Map-Reduce 
•Script based (Pig Latin) 
•SQL like - HiveQL, Apache Drill, Presto (Facebook) 
•Impala (Cloudera), HAWQ (Pivotal) 
•In-memory processing (Apache Spark) 
Multiple data analysis/processing paradigms
Hadoop Data Flow
Apache - Hadoop Ecosystem
Case Study - 1 
Benefits 
Retail Industry 
Problem Scenario 
Personalize marketing campaigns, coupons, offers, marking down inventories 
Improving customer loyalty – leads to sales and profitability 
Competition from other retailers 
ETL based analysis tasks - taking lot of time – up to 6 weeks 
Software systems (Oracle, Greenplum, SAS, Teradata) 
Mainframe based expensive hardware systems 
Hadoop based Solution 
Data stored into HDFS with replication 
300 Hadoop nodes with 2PB data 
Data processing time down to 1 week and even daily 
Mainframe cost savings 
No software licensing costs 
Limitless data storage with HDFS 
No multiple data copies 
Low cost
Case Study - 2 
Benefits 
Attribution computation time down to 45 minutes 
Capable of processing up to 300GB data for each computation 
Manageable data storage with HDFS 
Low cost 
Online Advertising Industry (Attribution Computation) 
Problem Scenario 
Growing Ads Impression and conversion events 
Longer attribution computation time (6 to 8 hours for each computation run). Advertisers needed quick results 
Unable to process more than 150GB data within each computation 
IBM Netezza based solution along with Oracle 
Expensive hardware and software costs 
Hadoop based Solution 
Data stored into HDFS with replication 
Initially used HiveQL then moved to Cloudera Impala (MPP architecture based Distributed SQL Engine)
Xoriant Big Data Practice - Overview Understands technological needs and organizational challenges faced with respect to Big Data Understands rapidly evolving Big Data technology space Can help bridge the gaps with Big Data capabilities Brings Big Data and NoSQL technology expertise
Xoriant – Big Data Center of Excellence 
Email: bigdata@xoriant.com 
For FREE consultation, please contact us on the above mentioned email address. 
Do you have any Questions?

More Related Content

What's hot

Everest Group FIT matrix for Robotic Process Automation (rpa) technology
Everest Group FIT matrix for Robotic Process Automation (rpa) technologyEverest Group FIT matrix for Robotic Process Automation (rpa) technology
Everest Group FIT matrix for Robotic Process Automation (rpa) technology
UiPath
 
RpA introduction
RpA introductionRpA introduction
RpA introduction
Chandrasekhar Telkapalli
 
Gateways to Power BI, Connect PowerBI.com to your On-Prem Data
Gateways to Power BI, Connect PowerBI.com to your On-Prem DataGateways to Power BI, Connect PowerBI.com to your On-Prem Data
Gateways to Power BI, Connect PowerBI.com to your On-Prem Data
Jean-Pierre Riehl
 
Capgemini Consulting RPA: the next revolution of Corporate Functions
Capgemini Consulting RPA: the next revolution of Corporate FunctionsCapgemini Consulting RPA: the next revolution of Corporate Functions
Capgemini Consulting RPA: the next revolution of Corporate Functions
UiPath
 
NoSql
NoSqlNoSql
RPA.pptx
RPA.pptxRPA.pptx
The Business Case for Robotic Process Automation (RPA)
The Business Case for Robotic Process Automation (RPA)The Business Case for Robotic Process Automation (RPA)
The Business Case for Robotic Process Automation (RPA)
Joe Tawfik
 
Rpa ai automation webinar by new, cfgi, ui path 11 82018
Rpa ai automation webinar by new, cfgi, ui path 11 82018Rpa ai automation webinar by new, cfgi, ui path 11 82018
Rpa ai automation webinar by new, cfgi, ui path 11 82018
Bob Fitzpatrick
 
Intelligent Document Processing
Intelligent Document ProcessingIntelligent Document Processing
Intelligent Document Processing
Infrrd
 
Data Warehouse Tutorial For Beginners | Data Warehouse Concepts | Data Wareho...
Data Warehouse Tutorial For Beginners | Data Warehouse Concepts | Data Wareho...Data Warehouse Tutorial For Beginners | Data Warehouse Concepts | Data Wareho...
Data Warehouse Tutorial For Beginners | Data Warehouse Concepts | Data Wareho...
Edureka!
 
What is data engineering?
What is data engineering?What is data engineering?
What is data engineering?
yongdam kim
 
HyperAutomation (3).pptx
HyperAutomation (3).pptxHyperAutomation (3).pptx
HyperAutomation (3).pptx
MeetPBarasara
 
Apache-Flink-What-How-Why-Who-Where-by-Slim-Baltagi
Apache-Flink-What-How-Why-Who-Where-by-Slim-BaltagiApache-Flink-What-How-Why-Who-Where-by-Slim-Baltagi
Apache-Flink-What-How-Why-Who-Where-by-Slim-Baltagi
Slim Baltagi
 
Data Infra and Data Access in Nubank
Data Infra and Data Access in NubankData Infra and Data Access in Nubank
Data Infra and Data Access in Nubank
André de Lannoy Tavares
 
Data warehousing - Dr. Radhika Kotecha
Data warehousing - Dr. Radhika KotechaData warehousing - Dr. Radhika Kotecha
Data warehousing - Dr. Radhika Kotecha
Radhika Kotecha
 
Data Warehousing - in the real world
Data Warehousing - in the real worldData Warehousing - in the real world
Data Warehousing - in the real world
ukc4
 
Oracle Analytics Cloud
Oracle Analytics CloudOracle Analytics Cloud
Oracle Analytics Cloud
Joseph Alaimo Jr
 
Data lake
Data lakeData lake
Data lake
GHAZOUANI WAEL
 
Data Platform Architecture Principles and Evaluation Criteria
Data Platform Architecture Principles and Evaluation CriteriaData Platform Architecture Principles and Evaluation Criteria
Data Platform Architecture Principles and Evaluation Criteria
ScyllaDB
 
Introduction To UiPath | RPA Tutorial For Beginners | RPA Training using Uipa...
Introduction To UiPath | RPA Tutorial For Beginners | RPA Training using Uipa...Introduction To UiPath | RPA Tutorial For Beginners | RPA Training using Uipa...
Introduction To UiPath | RPA Tutorial For Beginners | RPA Training using Uipa...
Edureka!
 

What's hot (20)

Everest Group FIT matrix for Robotic Process Automation (rpa) technology
Everest Group FIT matrix for Robotic Process Automation (rpa) technologyEverest Group FIT matrix for Robotic Process Automation (rpa) technology
Everest Group FIT matrix for Robotic Process Automation (rpa) technology
 
RpA introduction
RpA introductionRpA introduction
RpA introduction
 
Gateways to Power BI, Connect PowerBI.com to your On-Prem Data
Gateways to Power BI, Connect PowerBI.com to your On-Prem DataGateways to Power BI, Connect PowerBI.com to your On-Prem Data
Gateways to Power BI, Connect PowerBI.com to your On-Prem Data
 
Capgemini Consulting RPA: the next revolution of Corporate Functions
Capgemini Consulting RPA: the next revolution of Corporate FunctionsCapgemini Consulting RPA: the next revolution of Corporate Functions
Capgemini Consulting RPA: the next revolution of Corporate Functions
 
NoSql
NoSqlNoSql
NoSql
 
RPA.pptx
RPA.pptxRPA.pptx
RPA.pptx
 
The Business Case for Robotic Process Automation (RPA)
The Business Case for Robotic Process Automation (RPA)The Business Case for Robotic Process Automation (RPA)
The Business Case for Robotic Process Automation (RPA)
 
Rpa ai automation webinar by new, cfgi, ui path 11 82018
Rpa ai automation webinar by new, cfgi, ui path 11 82018Rpa ai automation webinar by new, cfgi, ui path 11 82018
Rpa ai automation webinar by new, cfgi, ui path 11 82018
 
Intelligent Document Processing
Intelligent Document ProcessingIntelligent Document Processing
Intelligent Document Processing
 
Data Warehouse Tutorial For Beginners | Data Warehouse Concepts | Data Wareho...
Data Warehouse Tutorial For Beginners | Data Warehouse Concepts | Data Wareho...Data Warehouse Tutorial For Beginners | Data Warehouse Concepts | Data Wareho...
Data Warehouse Tutorial For Beginners | Data Warehouse Concepts | Data Wareho...
 
What is data engineering?
What is data engineering?What is data engineering?
What is data engineering?
 
HyperAutomation (3).pptx
HyperAutomation (3).pptxHyperAutomation (3).pptx
HyperAutomation (3).pptx
 
Apache-Flink-What-How-Why-Who-Where-by-Slim-Baltagi
Apache-Flink-What-How-Why-Who-Where-by-Slim-BaltagiApache-Flink-What-How-Why-Who-Where-by-Slim-Baltagi
Apache-Flink-What-How-Why-Who-Where-by-Slim-Baltagi
 
Data Infra and Data Access in Nubank
Data Infra and Data Access in NubankData Infra and Data Access in Nubank
Data Infra and Data Access in Nubank
 
Data warehousing - Dr. Radhika Kotecha
Data warehousing - Dr. Radhika KotechaData warehousing - Dr. Radhika Kotecha
Data warehousing - Dr. Radhika Kotecha
 
Data Warehousing - in the real world
Data Warehousing - in the real worldData Warehousing - in the real world
Data Warehousing - in the real world
 
Oracle Analytics Cloud
Oracle Analytics CloudOracle Analytics Cloud
Oracle Analytics Cloud
 
Data lake
Data lakeData lake
Data lake
 
Data Platform Architecture Principles and Evaluation Criteria
Data Platform Architecture Principles and Evaluation CriteriaData Platform Architecture Principles and Evaluation Criteria
Data Platform Architecture Principles and Evaluation Criteria
 
Introduction To UiPath | RPA Tutorial For Beginners | RPA Training using Uipa...
Introduction To UiPath | RPA Tutorial For Beginners | RPA Training using Uipa...Introduction To UiPath | RPA Tutorial For Beginners | RPA Training using Uipa...
Introduction To UiPath | RPA Tutorial For Beginners | RPA Training using Uipa...
 

Similar to Addressing Big Data Challenges - The Hadoop Way

Deutsche Telekom on Big Data
Deutsche Telekom on Big DataDeutsche Telekom on Big Data
Deutsche Telekom on Big Data
DataWorks Summit
 
Hadoop and the Data Warehouse: Point/Counter Point
Hadoop and the Data Warehouse: Point/Counter PointHadoop and the Data Warehouse: Point/Counter Point
Hadoop and the Data Warehouse: Point/Counter Point
Inside Analysis
 
Exploring the Wider World of Big Data
Exploring the Wider World of Big DataExploring the Wider World of Big Data
Exploring the Wider World of Big Data
NetApp
 
Retail & CPG
Retail & CPGRetail & CPG
Hadoop: Extending your Data Warehouse
Hadoop: Extending your Data WarehouseHadoop: Extending your Data Warehouse
Hadoop: Extending your Data Warehouse
Cloudera, Inc.
 
Exploring the Wider World of Big Data- Vasalis Kapsalis
Exploring the Wider World of Big Data- Vasalis KapsalisExploring the Wider World of Big Data- Vasalis Kapsalis
Exploring the Wider World of Big Data- Vasalis Kapsalis
NetAppUK
 
Customer value analysis of big data products
Customer value analysis of big data productsCustomer value analysis of big data products
Customer value analysis of big data products
Vikas Sardana
 
Accelerating Big Data Analytics
Accelerating Big Data AnalyticsAccelerating Big Data Analytics
Accelerating Big Data Analytics
Attunity
 
Big Data Practice_Planning_steps_RK
Big Data Practice_Planning_steps_RKBig Data Practice_Planning_steps_RK
Big Data Practice_Planning_steps_RKRajesh Jayarman
 
When Databases Meet Big data and Hadoop - Uni of Tromso Online Lecture
When Databases Meet Big data and Hadoop - Uni of Tromso Online LectureWhen Databases Meet Big data and Hadoop - Uni of Tromso Online Lecture
When Databases Meet Big data and Hadoop - Uni of Tromso Online Lecture
Irfan Elahi
 
Hadoop - An Introduction
Hadoop - An IntroductionHadoop - An Introduction
Hadoop - An Introduction
Shankar R
 
Introduction To Big Data & Hadoop
Introduction To Big Data & HadoopIntroduction To Big Data & Hadoop
Introduction To Big Data & Hadoop
Blackvard
 
FOSS Sea 2014_DataWarehouse & BigData_Владимир Слободянюк ( Luxoft)
FOSS Sea 2014_DataWarehouse & BigData_Владимир Слободянюк ( Luxoft)FOSS Sea 2014_DataWarehouse & BigData_Владимир Слободянюк ( Luxoft)
FOSS Sea 2014_DataWarehouse & BigData_Владимир Слободянюк ( Luxoft)
GeeksLab Odessa
 
The Practice of Big Data - The Hadoop ecosystem explained with usage scenarios
The Practice of Big Data - The Hadoop ecosystem explained with usage scenariosThe Practice of Big Data - The Hadoop ecosystem explained with usage scenarios
The Practice of Big Data - The Hadoop ecosystem explained with usage scenarios
kcmallu
 
Hitachi Data Systems Hadoop Solution
Hitachi Data Systems Hadoop SolutionHitachi Data Systems Hadoop Solution
Hitachi Data Systems Hadoop Solution
Hitachi Vantara
 
Creating a Next-Generation Big Data Architecture
Creating a Next-Generation Big Data ArchitectureCreating a Next-Generation Big Data Architecture
Creating a Next-Generation Big Data Architecture
Perficient, Inc.
 
Creatinganext generationbigdataarchitecture-141204150317-conversion-gate02
Creatinganext generationbigdataarchitecture-141204150317-conversion-gate02Creatinganext generationbigdataarchitecture-141204150317-conversion-gate02
Creatinganext generationbigdataarchitecture-141204150317-conversion-gate02
email2jl
 
Hadoop & Data Warehouse
Hadoop & Data Warehouse Hadoop & Data Warehouse
Hadoop & Data Warehouse
Mohit Srivastava
 
Stratebi Big Data
Stratebi Big DataStratebi Big Data
Stratebi Big Data
Stratebi
 

Similar to Addressing Big Data Challenges - The Hadoop Way (20)

Deutsche Telekom on Big Data
Deutsche Telekom on Big DataDeutsche Telekom on Big Data
Deutsche Telekom on Big Data
 
Hadoop and the Data Warehouse: Point/Counter Point
Hadoop and the Data Warehouse: Point/Counter PointHadoop and the Data Warehouse: Point/Counter Point
Hadoop and the Data Warehouse: Point/Counter Point
 
Exploring the Wider World of Big Data
Exploring the Wider World of Big DataExploring the Wider World of Big Data
Exploring the Wider World of Big Data
 
Retail & CPG
Retail & CPGRetail & CPG
Retail & CPG
 
Hadoop: Extending your Data Warehouse
Hadoop: Extending your Data WarehouseHadoop: Extending your Data Warehouse
Hadoop: Extending your Data Warehouse
 
Exploring the Wider World of Big Data- Vasalis Kapsalis
Exploring the Wider World of Big Data- Vasalis KapsalisExploring the Wider World of Big Data- Vasalis Kapsalis
Exploring the Wider World of Big Data- Vasalis Kapsalis
 
Customer value analysis of big data products
Customer value analysis of big data productsCustomer value analysis of big data products
Customer value analysis of big data products
 
Accelerating Big Data Analytics
Accelerating Big Data AnalyticsAccelerating Big Data Analytics
Accelerating Big Data Analytics
 
Big Data Practice_Planning_steps_RK
Big Data Practice_Planning_steps_RKBig Data Practice_Planning_steps_RK
Big Data Practice_Planning_steps_RK
 
When Databases Meet Big data and Hadoop - Uni of Tromso Online Lecture
When Databases Meet Big data and Hadoop - Uni of Tromso Online LectureWhen Databases Meet Big data and Hadoop - Uni of Tromso Online Lecture
When Databases Meet Big data and Hadoop - Uni of Tromso Online Lecture
 
Beyond TCO
Beyond TCOBeyond TCO
Beyond TCO
 
Hadoop - An Introduction
Hadoop - An IntroductionHadoop - An Introduction
Hadoop - An Introduction
 
Introduction To Big Data & Hadoop
Introduction To Big Data & HadoopIntroduction To Big Data & Hadoop
Introduction To Big Data & Hadoop
 
FOSS Sea 2014_DataWarehouse & BigData_Владимир Слободянюк ( Luxoft)
FOSS Sea 2014_DataWarehouse & BigData_Владимир Слободянюк ( Luxoft)FOSS Sea 2014_DataWarehouse & BigData_Владимир Слободянюк ( Luxoft)
FOSS Sea 2014_DataWarehouse & BigData_Владимир Слободянюк ( Luxoft)
 
The Practice of Big Data - The Hadoop ecosystem explained with usage scenarios
The Practice of Big Data - The Hadoop ecosystem explained with usage scenariosThe Practice of Big Data - The Hadoop ecosystem explained with usage scenarios
The Practice of Big Data - The Hadoop ecosystem explained with usage scenarios
 
Hitachi Data Systems Hadoop Solution
Hitachi Data Systems Hadoop SolutionHitachi Data Systems Hadoop Solution
Hitachi Data Systems Hadoop Solution
 
Creating a Next-Generation Big Data Architecture
Creating a Next-Generation Big Data ArchitectureCreating a Next-Generation Big Data Architecture
Creating a Next-Generation Big Data Architecture
 
Creatinganext generationbigdataarchitecture-141204150317-conversion-gate02
Creatinganext generationbigdataarchitecture-141204150317-conversion-gate02Creatinganext generationbigdataarchitecture-141204150317-conversion-gate02
Creatinganext generationbigdataarchitecture-141204150317-conversion-gate02
 
Hadoop & Data Warehouse
Hadoop & Data Warehouse Hadoop & Data Warehouse
Hadoop & Data Warehouse
 
Stratebi Big Data
Stratebi Big DataStratebi Big Data
Stratebi Big Data
 

More from Xoriant Corporation

Webinar: Unlocking the potential of io t data
Webinar: Unlocking the potential of io t dataWebinar: Unlocking the potential of io t data
Webinar: Unlocking the potential of io t data
Xoriant Corporation
 
Xoriant - Financial services expertise
Xoriant - Financial services expertiseXoriant - Financial services expertise
Xoriant - Financial services expertise
Xoriant Corporation
 
Xoriant Smartphone apps accelerator
Xoriant Smartphone apps acceleratorXoriant Smartphone apps accelerator
Xoriant Smartphone apps accelerator
Xoriant Corporation
 
Mobile porting and testing - Xoriant
Mobile porting and testing - Xoriant Mobile porting and testing - Xoriant
Mobile porting and testing - Xoriant
Xoriant Corporation
 
SEP Webinar –HTML5: The GenX Technology for building scalable and high perfor...
SEP Webinar –HTML5: The GenX Technology for building scalable and high perfor...SEP Webinar –HTML5: The GenX Technology for building scalable and high perfor...
SEP Webinar –HTML5: The GenX Technology for building scalable and high perfor...
Xoriant Corporation
 
Staying the Course
Staying the CourseStaying the Course
Staying the Course
Xoriant Corporation
 
Product Engineering Outsourcing: Looking beyond Cost Savings
Product Engineering Outsourcing: Looking beyond Cost SavingsProduct Engineering Outsourcing: Looking beyond Cost Savings
Product Engineering Outsourcing: Looking beyond Cost Savings
Xoriant Corporation
 
Growth by Partnerships for ISVs in the financial software products markets
Growth by Partnerships for ISVs in the financial software products marketsGrowth by Partnerships for ISVs in the financial software products markets
Growth by Partnerships for ISVs in the financial software products markets
Xoriant Corporation
 
Product Engineering - Distributed Agile
Product Engineering - Distributed AgileProduct Engineering - Distributed Agile
Product Engineering - Distributed AgileXoriant Corporation
 
The Xoriant Whitepaper: Last Mile Soa Implementation
The Xoriant Whitepaper: Last Mile Soa ImplementationThe Xoriant Whitepaper: Last Mile Soa Implementation
The Xoriant Whitepaper: Last Mile Soa Implementation
Xoriant Corporation
 

More from Xoriant Corporation (11)

Webinar: Unlocking the potential of io t data
Webinar: Unlocking the potential of io t dataWebinar: Unlocking the potential of io t data
Webinar: Unlocking the potential of io t data
 
Xoriant - Financial services expertise
Xoriant - Financial services expertiseXoriant - Financial services expertise
Xoriant - Financial services expertise
 
Xoriant Smartphone apps accelerator
Xoriant Smartphone apps acceleratorXoriant Smartphone apps accelerator
Xoriant Smartphone apps accelerator
 
Mobile porting and testing - Xoriant
Mobile porting and testing - Xoriant Mobile porting and testing - Xoriant
Mobile porting and testing - Xoriant
 
SEP Webinar –HTML5: The GenX Technology for building scalable and high perfor...
SEP Webinar –HTML5: The GenX Technology for building scalable and high perfor...SEP Webinar –HTML5: The GenX Technology for building scalable and high perfor...
SEP Webinar –HTML5: The GenX Technology for building scalable and high perfor...
 
Staying the Course
Staying the CourseStaying the Course
Staying the Course
 
Product Engineering Outsourcing: Looking beyond Cost Savings
Product Engineering Outsourcing: Looking beyond Cost SavingsProduct Engineering Outsourcing: Looking beyond Cost Savings
Product Engineering Outsourcing: Looking beyond Cost Savings
 
Growth by Partnerships for ISVs in the financial software products markets
Growth by Partnerships for ISVs in the financial software products marketsGrowth by Partnerships for ISVs in the financial software products markets
Growth by Partnerships for ISVs in the financial software products markets
 
Product Engineering - Distributed Agile
Product Engineering - Distributed AgileProduct Engineering - Distributed Agile
Product Engineering - Distributed Agile
 
The Xoriant Whitepaper: Last Mile Soa Implementation
The Xoriant Whitepaper: Last Mile Soa ImplementationThe Xoriant Whitepaper: Last Mile Soa Implementation
The Xoriant Whitepaper: Last Mile Soa Implementation
 
Offering For Tech Companies
Offering For Tech CompaniesOffering For Tech Companies
Offering For Tech Companies
 

Recently uploaded

Learn SQL from basic queries to Advance queries
Learn SQL from basic queries to Advance queriesLearn SQL from basic queries to Advance queries
Learn SQL from basic queries to Advance queries
manishkhaire30
 
一比一原版(Coventry毕业证书)考文垂大学毕业证如何办理
一比一原版(Coventry毕业证书)考文垂大学毕业证如何办理一比一原版(Coventry毕业证书)考文垂大学毕业证如何办理
一比一原版(Coventry毕业证书)考文垂大学毕业证如何办理
74nqk8xf
 
一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理
一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理
一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理
ahzuo
 
原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样
原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样
原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样
u86oixdj
 
Everything you wanted to know about LIHTC
Everything you wanted to know about LIHTCEverything you wanted to know about LIHTC
Everything you wanted to know about LIHTC
Roger Valdez
 
一比一原版(Dalhousie毕业证书)达尔豪斯大学毕业证如何办理
一比一原版(Dalhousie毕业证书)达尔豪斯大学毕业证如何办理一比一原版(Dalhousie毕业证书)达尔豪斯大学毕业证如何办理
一比一原版(Dalhousie毕业证书)达尔豪斯大学毕业证如何办理
mzpolocfi
 
【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】
【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】
【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】
NABLAS株式会社
 
一比一原版(UniSA毕业证书)南澳大学毕业证如何办理
一比一原版(UniSA毕业证书)南澳大学毕业证如何办理一比一原版(UniSA毕业证书)南澳大学毕业证如何办理
一比一原版(UniSA毕业证书)南澳大学毕业证如何办理
slg6lamcq
 
Ch03-Managing the Object-Oriented Information Systems Project a.pdf
Ch03-Managing the Object-Oriented Information Systems Project a.pdfCh03-Managing the Object-Oriented Information Systems Project a.pdf
Ch03-Managing the Object-Oriented Information Systems Project a.pdf
haila53
 
Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23...
Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23...Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23...
Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23...
John Andrews
 
Nanandann Nilekani's ppt On India's .pdf
Nanandann Nilekani's ppt On India's .pdfNanandann Nilekani's ppt On India's .pdf
Nanandann Nilekani's ppt On India's .pdf
eddie19851
 
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
Timothy Spann
 
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...
Subhajit Sahu
 
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
axoqas
 
一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理
一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理
一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理
slg6lamcq
 
Data_and_Analytics_Essentials_Architect_an_Analytics_Platform.pptx
Data_and_Analytics_Essentials_Architect_an_Analytics_Platform.pptxData_and_Analytics_Essentials_Architect_an_Analytics_Platform.pptx
Data_and_Analytics_Essentials_Architect_an_Analytics_Platform.pptx
AnirbanRoy608946
 
一比一原版(爱大毕业证书)爱丁堡大学毕业证如何办理
一比一原版(爱大毕业证书)爱丁堡大学毕业证如何办理一比一原版(爱大毕业证书)爱丁堡大学毕业证如何办理
一比一原版(爱大毕业证书)爱丁堡大学毕业证如何办理
g4dpvqap0
 
一比一原版(CBU毕业证)卡普顿大学毕业证如何办理
一比一原版(CBU毕业证)卡普顿大学毕业证如何办理一比一原版(CBU毕业证)卡普顿大学毕业证如何办理
一比一原版(CBU毕业证)卡普顿大学毕业证如何办理
ahzuo
 
一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理
一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理
一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理
oz8q3jxlp
 
Criminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdfCriminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdf
Criminal IP
 

Recently uploaded (20)

Learn SQL from basic queries to Advance queries
Learn SQL from basic queries to Advance queriesLearn SQL from basic queries to Advance queries
Learn SQL from basic queries to Advance queries
 
一比一原版(Coventry毕业证书)考文垂大学毕业证如何办理
一比一原版(Coventry毕业证书)考文垂大学毕业证如何办理一比一原版(Coventry毕业证书)考文垂大学毕业证如何办理
一比一原版(Coventry毕业证书)考文垂大学毕业证如何办理
 
一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理
一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理
一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理
 
原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样
原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样
原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样
 
Everything you wanted to know about LIHTC
Everything you wanted to know about LIHTCEverything you wanted to know about LIHTC
Everything you wanted to know about LIHTC
 
一比一原版(Dalhousie毕业证书)达尔豪斯大学毕业证如何办理
一比一原版(Dalhousie毕业证书)达尔豪斯大学毕业证如何办理一比一原版(Dalhousie毕业证书)达尔豪斯大学毕业证如何办理
一比一原版(Dalhousie毕业证书)达尔豪斯大学毕业证如何办理
 
【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】
【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】
【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】
 
一比一原版(UniSA毕业证书)南澳大学毕业证如何办理
一比一原版(UniSA毕业证书)南澳大学毕业证如何办理一比一原版(UniSA毕业证书)南澳大学毕业证如何办理
一比一原版(UniSA毕业证书)南澳大学毕业证如何办理
 
Ch03-Managing the Object-Oriented Information Systems Project a.pdf
Ch03-Managing the Object-Oriented Information Systems Project a.pdfCh03-Managing the Object-Oriented Information Systems Project a.pdf
Ch03-Managing the Object-Oriented Information Systems Project a.pdf
 
Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23...
Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23...Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23...
Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23...
 
Nanandann Nilekani's ppt On India's .pdf
Nanandann Nilekani's ppt On India's .pdfNanandann Nilekani's ppt On India's .pdf
Nanandann Nilekani's ppt On India's .pdf
 
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
 
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...
 
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
 
一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理
一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理
一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理
 
Data_and_Analytics_Essentials_Architect_an_Analytics_Platform.pptx
Data_and_Analytics_Essentials_Architect_an_Analytics_Platform.pptxData_and_Analytics_Essentials_Architect_an_Analytics_Platform.pptx
Data_and_Analytics_Essentials_Architect_an_Analytics_Platform.pptx
 
一比一原版(爱大毕业证书)爱丁堡大学毕业证如何办理
一比一原版(爱大毕业证书)爱丁堡大学毕业证如何办理一比一原版(爱大毕业证书)爱丁堡大学毕业证如何办理
一比一原版(爱大毕业证书)爱丁堡大学毕业证如何办理
 
一比一原版(CBU毕业证)卡普顿大学毕业证如何办理
一比一原版(CBU毕业证)卡普顿大学毕业证如何办理一比一原版(CBU毕业证)卡普顿大学毕业证如何办理
一比一原版(CBU毕业证)卡普顿大学毕业证如何办理
 
一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理
一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理
一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理
 
Criminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdfCriminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdf
 

Addressing Big Data Challenges - The Hadoop Way

  • 1. Addressing Big Data Challenges: The Hadoop Way Presented by: Atul Dambalkar
  • 2. Agenda Big Data Challenges Big Data Analytics Industry Trends Hadoop as a Solution Real Life Solution Studies •Case Study I - Retail Industry •Case Study II - Online Advertising Industry How Xoriant can help? Q & A
  • 3. Big Data Challenges THE FOUR V’s OF BIG DATA Source: IBM
  • 4. Traditional Approach & Its Limitations Data Warehousing Vendors (ETL) Costs - High Initial Setup, Maintenance, Subscription or Licensing Fees No support for unstructured data Multiple copies of data in different formats Data latency and bottlenecks No support for ad-hoc query Note: The Logos are proprietary of the individual companies
  • 5. Big Data and Analytics - Trends Enterprise Data Hub or Data Lake (Hadoop with HDFS) Commodity Hardware No multiple data copies Fault-Tolerant storage for Raw data As-Is Current Limitations - Write/Append only, No Delete or Update Unified Data Access Multiple Data Processing Paradigms In-memory processing In-memory, Real-time Stream processing Analytics based on Distributed SQL Processing Falling hardware prices Batch mode processing for data size more than Hundreds of TBs In-memory processing for data size less than hundreds of TBs ETL Trends Data Processing Trends Architecture Trends Open Source Software
  • 6. Hadoop Proposition Open Source Ecosystem No data loss through replicated storage (HDFS) Runs on commodity hardware •Map-Reduce •Script based (Pig Latin) •SQL like - HiveQL, Apache Drill, Presto (Facebook) •Impala (Cloudera), HAWQ (Pivotal) •In-memory processing (Apache Spark) Multiple data analysis/processing paradigms
  • 8. Apache - Hadoop Ecosystem
  • 9. Case Study - 1 Benefits Retail Industry Problem Scenario Personalize marketing campaigns, coupons, offers, marking down inventories Improving customer loyalty – leads to sales and profitability Competition from other retailers ETL based analysis tasks - taking lot of time – up to 6 weeks Software systems (Oracle, Greenplum, SAS, Teradata) Mainframe based expensive hardware systems Hadoop based Solution Data stored into HDFS with replication 300 Hadoop nodes with 2PB data Data processing time down to 1 week and even daily Mainframe cost savings No software licensing costs Limitless data storage with HDFS No multiple data copies Low cost
  • 10. Case Study - 2 Benefits Attribution computation time down to 45 minutes Capable of processing up to 300GB data for each computation Manageable data storage with HDFS Low cost Online Advertising Industry (Attribution Computation) Problem Scenario Growing Ads Impression and conversion events Longer attribution computation time (6 to 8 hours for each computation run). Advertisers needed quick results Unable to process more than 150GB data within each computation IBM Netezza based solution along with Oracle Expensive hardware and software costs Hadoop based Solution Data stored into HDFS with replication Initially used HiveQL then moved to Cloudera Impala (MPP architecture based Distributed SQL Engine)
  • 11. Xoriant Big Data Practice - Overview Understands technological needs and organizational challenges faced with respect to Big Data Understands rapidly evolving Big Data technology space Can help bridge the gaps with Big Data capabilities Brings Big Data and NoSQL technology expertise
  • 12. Xoriant – Big Data Center of Excellence Email: bigdata@xoriant.com For FREE consultation, please contact us on the above mentioned email address. Do you have any Questions?