Big Data @ Zulily
By Echo Li, Data Engineer,
eli@zulily.com
Data Services, BI and Big Data Analytics, Zulily

Big data at zulily

1,359 views

Presentation by Echo Li of zulily's BI Data Services team, given at the January Women Who Code event hosted by zulily.

Published in: Data & Analytics

  1. Big Data @ Zulily
     By Echo Li, Data Engineer, eli@zulily.com
     Data Services, BI and Big Data Analytics, Zulily
  2. Where we are
     Powerful and Flexible
     BIDS Data Platform
     Areas: CUSTOMER INTERACTION POINTS, WEBSTORE, MEMBER ENGAGEMENT, EVENT MANAGEMENT, VENDOR MANAGEMENT, SUPPLY CHAIN, ERP & BACK OFFICE
     Systems (in slide order): Site, Mobile, Orders & Payments, Content Mgmt., CRM, Relevancy (personalization), Messaging, Offers & Promotions, Item Master, Catalog & Event Workflow Mgmt., Planning, Portal, EDI / Data Exchange, Purchase Orders, Workflow & Tools, Order Mgmt., Fulfillment Mgmt., Transportation, Warehouse & Inventory Mgmt., Financial (SAP Enterprise), Business Intelligence, HRIS, Warehouse Automation
     Initiatives:
     • Capacity & Scale
     • Data driven decision making – Data for Everyone
     • Better customer experience through Personalization & Targeting
  3. How We Do It
     …powered by Hortonworks Data Platform & Google Cloud
     • Tableau (Visualization & Reporting)
     • Data Services (ZATA API)
     • Google BigQuery
     • Big Data Platform - Google Compute Engine
     • Hortonworks Data Platform 2.1 on Google Cloud: HDFS, YARN, HIVE/TEZ, AMBARI
     • Google Cloud Storage
     • Platform Tools (zulily Build): ZuSync (ETL), ZuScheduler (Scheduling), ZuMon (Data Monitoring)
     • Customer Data Mart, Merch DataMart, Supply Chain DataMart, Clickstream/Web Analytics
  4. Data Processing Pipeline & Analytics
     2014 zulily Proprietary and Confidential
     Sources:
     • Operational Systems
     • External APIs (Google, FB, Yahoo, Bing etc)
     Hadoop Processing in Cloud:
     • Real Time
     • ZuSync
     • Landing Zone (LZ)
     • Staging (stg)
     • Atomic Data Store (ADS)
     • Aggregated Dataset
     • Tier 1 ETLWF
     • Tier 2 ETLWF
     • Cust ADS
     • Order ADS
     • Clickstream
     • Big Query Tables
  5. Our Journey…
     Data Platform V1.0
     Technology Stack:
     • SQL Server
     Challenges:
     • Scale & only supported structured relational data
     Advantages:
     • Simple
     • All data in same data store
     • Makes it easy for visualization, analytics and reporting
     Data Platform V2.0
     Technology Stack:
     • SQL Server, Apache Hadoop
     Challenges:
     • Lack of single data store
     • Unable to mash up data across structured and unstructured data
     • Difficult to scale visualization with large scale data
     Advantages:
     • Ability to process unstructured data at scale
     • Tableau allows us to have single visualization layer on top of all data
     Modern Data Platform V3.0
     Technology Stack:
     • Hadoop, Google Cloud Platform, Big Query
     Challenges:
     • New Pricing Model which is good and bad
     • Requires new data processing methodology (especially for structured data)
     Advantages:
     • Supports Scale, high Speed
     • Single Data Platform for structured and unstructured data
     • Enables scenarios which were difficult to achieve in V1.0 or V2.0
     • Enterprise Hadoop capabilities enable management, monitoring and workflow definition which are critical
  6. Use Cases…
     Use Case #1: Site & Event Funnel Analysis
     Diagram labels (in slide order): Google Cloud Storage, Hadoop/GCE, Web Servers, zulily data API, BigQuery, Funnel Analysis, ZATA (DATA API), Reporting & Analysis (Powered by Tableau)
     Benefits:
     • Increase Revenue
     • Improve marketing strategy and targeting
     • Improve business decisions
  7. Use Case #3: Supply Chain Visibility
     Diagram labels (in slide order): Hadoop/GCE, zulily Sync, Others, Carriers, Google Cloud Storage, Order Visibility, BigQuery, In Transit, Shipment, PO, zulily SCS, PO, Shipment, EDI, Flat File, Vendor Data Exch.
     Benefits:
     • End to end order visibility
     • Manage by exception
     • Reduce shipping costs
  8. As our Journey Continues… we need more talent!
     Please check out our career page: http://www.zulily.com/careers
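
The tiered flow on the Data Processing Pipeline & Analytics slide (Landing Zone, Staging, Atomic Data Store, Aggregated Dataset) can be illustrated with a small Python sketch. The deck does not say how ZuSync or the ETL workflows are implemented, so everything here is an assumption made for illustration: the record fields, the function names, the cleaning rules, and the choice of "total per customer" as the aggregate are all hypothetical.

```python
from collections import defaultdict

# Hypothetical raw rows as they might arrive in the landing zone (LZ).
# Field names and values are invented for illustration.
LANDING_ZONE = [
    {"order_id": "1001", "customer": " alice ", "amount": "25.00"},
    {"order_id": "1002", "customer": "bob", "amount": "10.50"},
    {"order_id": "1002", "customer": "bob", "amount": "10.50"},   # duplicate delivery
    {"order_id": "1003", "customer": "alice", "amount": "oops"},  # malformed amount
]


def to_staging(rows):
    """LZ -> staging: normalize types and drop rows that fail to parse."""
    staged = []
    for row in rows:
        try:
            staged.append({
                "order_id": int(row["order_id"]),
                "customer": row["customer"].strip().lower(),
                "amount": float(row["amount"]),
            })
        except (KeyError, ValueError):
            continue  # a real pipeline would likely route these to an error table
    return staged


def to_atomic_store(staged):
    """Staging -> atomic data store: keep one row per order_id (last write wins)."""
    atomic = {}
    for row in staged:
        atomic[row["order_id"]] = row
    return list(atomic.values())


def to_aggregate(atomic):
    """Atomic data store -> aggregated dataset: total order amount per customer."""
    totals = defaultdict(float)
    for row in atomic:
        totals[row["customer"]] += row["amount"]
    return dict(totals)


def run_pipeline(rows):
    """Chain the three hypothetical stages end to end."""
    return to_aggregate(to_atomic_store(to_staging(rows)))
```

Calling `run_pipeline(LANDING_ZONE)` drops the malformed row in staging and collapses the duplicate delivery in the atomic store before totals are computed, which is the main reason to keep the stages separate.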
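
The funnel analysis named in Use Case #1 boils down to counting how many users reach each step of an ordered sequence and how many continue to the next one. The sketch below shows that calculation in plain Python. The step names, the event tuples, and the function names are hypothetical; the deck runs this kind of analysis on BigQuery and does not show its queries or schema.

```python
# Hypothetical funnel steps, in the order a shopper is expected to hit them.
FUNNEL_STEPS = ["visit", "view_event", "add_to_cart", "purchase"]

# Hypothetical clickstream events as (user_id, step) pairs, already in time order.
EVENTS = [
    ("u1", "visit"), ("u1", "view_event"), ("u1", "add_to_cart"), ("u1", "purchase"),
    ("u2", "visit"), ("u2", "view_event"),
    ("u3", "visit"),
    ("u4", "visit"), ("u4", "view_event"), ("u4", "add_to_cart"),
]


def funnel_counts(events, steps):
    """Count users who reached each step after completing all earlier steps."""
    progress = {}  # user_id -> index of the next step the user must hit
    for user, step in events:
        nxt = progress.get(user, 0)
        if nxt < len(steps) and step == steps[nxt]:
            progress[user] = nxt + 1
    return [sum(1 for reached in progress.values() if reached > i)
            for i in range(len(steps))]


def conversion_rates(counts):
    """Step-to-step conversion: users at a step divided by users at the previous step."""
    return [round(curr / prev, 2) if prev else 0.0
            for prev, curr in zip(counts, counts[1:])]
```

With the sample events, `funnel_counts(EVENTS, FUNNEL_STEPS)` gives `[4, 3, 2, 1]` and `conversion_rates` on that result gives `[0.75, 0.67, 0.5]`, so the largest relative drop is at the final step.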

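"Manage by exception", listed as a benefit of Use Case #3, means surfacing only the orders that need attention instead of reviewing every shipment. A minimal Python sketch of that idea follows. The order fields, status values, dates, and the lateness rule are assumptions for illustration only; the deck does not describe zulily's actual order schema or exception criteria.

```python
from datetime import date

# Hypothetical order records; fields and values are illustrative only.
ORDERS = [
    {"po": "PO-1", "status": "in_transit", "promised": date(2015, 1, 10)},
    {"po": "PO-2", "status": "delivered", "promised": date(2015, 1, 5)},
    {"po": "PO-3", "status": "in_transit", "promised": date(2015, 1, 3)},
    {"po": "PO-4", "status": "not_shipped", "promised": date(2015, 1, 4)},
]


def exceptions(orders, today):
    """Return POs that are past their promised date and not yet delivered."""
    return [order["po"] for order in orders
            if order["status"] != "delivered" and order["promised"] < today]
```

For example, `exceptions(ORDERS, date(2015, 1, 6))` returns `["PO-3", "PO-4"]`: the delivered order and the one still within its promised date are left out of the report.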