Powerful Analytics Apps Fueled by Hadoop for High Performance and Scalability
Amit Rustagi
Mar 21st, 2013


A new era of cross-commerce is beginning. eBay
Marketplace provides a powerful online platform
for the sale of goods and services by a passionate
community of individuals and small businesses. It
has a full suite of analytics applications (tracking,
experimentation, and personalization) to understand
traffic from channels such as mobile, local, and
social. This presentation highlights the architectural
components and design of high-performance, concurrent,
and distributed applications built using Hadoop.

Published in: Technology

Powerful Analytics Apps Fueled by Hadoop

  1. Powerful Analytics Apps Fueled by Hadoop for High Performance and Scalability. Amit Rustagi, Mar 21st, 2013
  2. Agenda
      • About eBay
      • Analytics Apps at eBay
      • Hadoop at eBay
  3. ABOUT EBAY
  4. $67 billion in merchandise sold in 2012
  5. The world’s largest online marketplace – where practically anyone can trade practically anything at anytime.
  6. eBay Marketplaces
      • 102+ million active buyers and sellers worldwide
      • 350+ million items in more than 50,000 categories
      • 3+ billion page views each day
      • 300+ million queries each day to our search engine
  7. Huge Opportunity: Taking the “e” out of commerce
  8. ANALYTICS APPS AT EBAY
  9. Data is gold. Diagram labels, in extraction order: Query, Clicks, Logs, Buyers, Performance, Crawled, Sellers, data, User, Images, History, Items / Feedback, Products
  10. Goals for Analytics Apps
      • Scalability
      • Flexibility
      • Cost Effectiveness
  11. Analytics Apps Lifecycle. Diagram labels, in extraction order: BUILD KPIs, Implement, Define Goals, Personalization, Experimentation, Experiment Analysis, Web Analytics, Collect Data
  12. HADOOP AT EBAY
  13. Why Hadoop?
      • Scalability
          • Simply scales by adding nodes
          • Local processing to avoid network bottlenecks
      • Flexibility
          • All kinds of data (blobs, documents, records, etc.)
          • In all forms (structured, semi-structured, unstructured)
          • Store anything then later analyze what you need
      • Efficiency
          • Cost efficiency (<$1k/1TB) on commodity hardware
          • Unified storage, metadata, security (no duplication or synchronization)
  14. A brief history of time
      • 2007: Single digit nodes
      • 2009: Search; 10s nodes; CDH2
      • 2010: Shared cluster; Wilma (0.20); 100s nodes; 1000s+ core; PB
      • 2011: Shared clusters; 1000s node; 10,000+ core; 10s PB
      • 2012: Shared clusters; 1000s node; 10,000+ core; 10s PB
      • 2013: Shared clusters; 4k+ node; 40,000+ core; 50s PB
  15. Hadoop Data Platform at eBay
      • Tools: Data Catalog, ETL Monitor, Metadata Mgmt, User Mgmt
      • Clients: Java, Pig, Mobius, Scala, Hive, Cascading
      • Data Ingest: Extract, Transform, Load, Validate
      • Data Access: Java POJO, Hive UDF, Pig UDF
      • Metadata: Metastore, Type System, API Service
      • Hadoop: Behavioral, Transactional, Inventory
  16. Analytics Ecosystem
  17. Hadoop in Web Analytics. Diagram labels, in extraction order: Map Reduce Stage 1, Map Reduce Stage 2, GENERATION, Behavioral, Reporting, Sessions, Metrics, METRICS, MySQL, Pig, Enriched Event, Adhoc Metrics
  18. Hadoop in Personalization. Diagram labels, in extraction order: Personalization DB, PDS, Data Importer / Exporter, Files, Files, Data Transfer, Statistical Analysis
  19. Questions?
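
The web analytics slide (17) names two MapReduce stages along with the labels Behavioral, Sessions, and Metrics. One plausible reading is that stage 1 turns behavioral events into sessions and stage 2 turns sessions into metrics. The sketch below illustrates that reading in plain Python, with a small in-memory stand-in for a MapReduce pass. It is not eBay's implementation: the event layout `(user, timestamp, page)`, the 30-minute inactivity gap, the metric names, and every function name are assumptions made for illustration.

```python
from collections import defaultdict

# Assumed inactivity timeout that closes a session; not taken from the slides.
SESSION_GAP_SECONDS = 30 * 60


def map_reduce(records, mapper, reducer):
    """Run one in-memory map -> shuffle -> reduce pass (a toy stand-in for Hadoop)."""
    shuffled = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            shuffled[key].append(value)
    output = []
    for key in sorted(shuffled):
        output.extend(reducer(key, shuffled[key]))
    return output


# Stage 1 (assumed): behavioral events -> sessions, keyed by user.
def event_mapper(event):
    user, timestamp, page = event
    yield user, (timestamp, page)


def session_reducer(user, hits):
    hits = sorted(hits)  # order each user's hits by timestamp
    session = [hits[0]]
    for prev, cur in zip(hits, hits[1:]):
        if cur[0] - prev[0] > SESSION_GAP_SECONDS:
            yield (user, session)  # gap too long: close the current session
            session = []
        session.append(cur)
    yield (user, session)


# Stage 2 (assumed): sessions -> aggregate metrics.
def session_mapper(session):
    _user, hits = session
    yield "sessions", 1
    yield "page_views", len(hits)


def sum_reducer(metric, values):
    yield metric, sum(values)


def run_pipeline(events):
    """Chain the two stages and return the metrics as a dict."""
    sessions = map_reduce(events, event_mapper, session_reducer)
    return dict(map_reduce(sessions, session_mapper, sum_reducer))
```

For example, calling `run_pipeline` on the events `("alice", 0, "home")`, `("alice", 600, "search")`, `("alice", 4000, "item")`, and `("bob", 100, "home")` yields 3 sessions and 4 page views, because the 3,400-second gap before alice's third hit exceeds the assumed timeout. On a real cluster each stage would be a separate job reading the previous stage's output files, which is what lets the work scale by adding nodes.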
