Powerful Analytics Apps Fueled by Hadoop
Cross-commerce is entering a new era. eBay Marketplace
provides a powerful online platform for the sale of goods
and services by a passionate community of individuals and
small businesses. It has a full suite of analytics
applications (tracking, experimentation, and personalization)
for understanding traffic from various channels, such as
mobile, local, and social. This presentation highlights the
architectural components and design of high-performance,
concurrent, and distributed applications built using Hadoop.

Published in: Technology
Transcript

  • 1. Powerful Analytics Apps Fueled by Hadoop for High Performance and Scalability. Amit Rustagi, Mar 21st, 2013
  • 2. Agenda: About eBay; Analytics Apps at eBay; Hadoop at eBay
  • 3. ABOUT EBAY
  • 4. $67 billion in merchandise sold in 2012
  • 5. The world’s largest online marketplace – where practically anyone can trade practically anything at any time.
  • 6. eBay Marketplaces: 102+ million active buyers and sellers worldwide; 350+ million items in more than 50,000 categories; 3+ billion page views each day; 300+ million queries each day to our search engine
  • 7. Huge Opportunity: Taking the “e” out of commerce
  • 8. ANALYTICS APPS AT EBAY
  • 9. Data is gold (diagram labels, in extraction order): Query Clicks Logs Buyers Performance Crawled Sellers data User Images History Items / Feedback Products
  • 10. Goals for Analytics Apps: Scalability, Flexibility, Cost Effectiveness
  • 11. Analytics Apps Lifecycle (diagram labels): Build; KPIs; Implement; Define Goals; Personalization; Experimentation; Experiment Analysis; Web Analytics; Collect Data
  • 12. HADOOP AT EBAY
  • 13. Why Hadoop?
      • Scalability: simply scales by adding nodes; local processing to avoid network bottlenecks
      • Flexibility: all kinds of data (blobs, documents, records, etc.); in all forms (structured, semi-structured, unstructured); store anything, then later analyze what you need
      • Efficiency: cost efficiency (<$1k/1TB) on commodity hardware; unified storage, metadata, security (no duplication or synchronization)
  • 14. A brief history of time (timeline; the flattened 2009 and 2010 columns are regrouped on a best-effort basis)
      • 2007: Single digit nodes
      • 2009: Search; 10s- nodes; CDH2
      • 2010: Shared cluster; Wilma (0.20); 100s nodes; 1000s+ core; PB
      • 2011: Shared clusters; 1000s node; 10,000+ core; 10s PB
      • 2012: Shared clusters; 1000s node; 10,000+ core; 10s PB
      • 2013: Shared clusters; 4k+ node; 40,000+ core; 50s PB
  • 15. Hadoop Data Platform at eBay (diagram layers)
      • Tools: Data Catalog, ETL, Monitor, Metadata Mgmt, User Mgmt
      • Clients: Java, Pig, Mobius, Scala, Hive, Cascading
      • Data Ingest: Extract, Transform, Load, Validate
      • Data Access: Java POJO, Hive UDF, Pig UDF
      • Metadata: Metastore, Type System, API, Service
      • Hadoop: Behavioral, Transactional, Inventory
  • 16. Analytics Ecosystem
  • 17. Hadoop in Web Analytics (diagram labels): Metrics Generation; Map Reduce Stage 1; Map Reduce Stage 2; Behavioral; Sessions; Reporting Metrics; MySQL; Pig; Enriched Event; Adhoc Metrics
  • 18. Hadoop in Personalization (diagram labels): Personalization DB; PDS; Data Importer / Exporter; Files; Files; Data Transfer; Statistical Analysis
  • 19. Questions?
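
The web analytics slide (17) shows behavioral data passing through two MapReduce stages, with sessions in between and metrics at the end. The sketch below illustrates that general pattern in plain Python and is not eBay's implementation. Everything specific in it is an assumption made for illustration: the `run_map_reduce` helper (an in-memory stand-in for a Hadoop job), the event fields `user`, `ts`, and `page`, the 30-minute inactivity timeout, and the metric names.

```python
from collections import defaultdict

# Assumed inactivity timeout that closes a session (not taken from the slides).
SESSION_GAP_SECONDS = 30 * 60


def run_map_reduce(records, mapper, reducer):
    """In-memory stand-in for one MapReduce job: map, group by key, reduce."""
    grouped = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            grouped[key].append(value)
    output = []
    for key in sorted(grouped):
        output.extend(reducer(key, grouped[key]))
    return output


# Stage 1: raw behavioral events -> sessions (hypothetical event schema).
def map_event_by_user(event):
    yield event["user"], (event["ts"], event["page"])


def reduce_to_sessions(user, hits):
    session = []
    for ts, page in sorted(hits):
        # A gap longer than the timeout closes the current session.
        if session and ts - session[-1][0] > SESSION_GAP_SECONDS:
            yield {"user": user, "pages": [p for _, p in session]}
            session = []
        session.append((ts, page))
    if session:
        yield {"user": user, "pages": [p for _, p in session]}


# Stage 2: sessions -> aggregate metrics (hypothetical metric names).
def map_session_to_metrics(session):
    yield "sessions", 1
    yield "page_views", len(session["pages"])
    if len(session["pages"]) == 1:
        yield "bounces", 1


def reduce_sum(metric, values):
    yield metric, sum(values)


events = [
    {"user": "u1", "ts": 0, "page": "home"},
    {"user": "u1", "ts": 60, "page": "search"},
    {"user": "u1", "ts": 3660, "page": "item"},  # one hour later: new session
    {"user": "u2", "ts": 10, "page": "home"},
]

sessions = run_map_reduce(events, map_event_by_user, reduce_to_sessions)
metrics = dict(run_map_reduce(sessions, map_session_to_metrics, reduce_sum))
print(metrics)
```

Keeping sessions as a separate intermediate output means other jobs can reuse them without reprocessing the raw events, which may be why the slide draws sessions as their own box between the two stages.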
