PAYPAL - BEHAVIORAL TRACKING ON HADOOP

ANIL MADAN
DIRECTOR OF ENGINEERING, MARKETING & ANALYTICS
PAYPAL'S VISION

Delivering the future of money today…
An essential part of our customers' financial and business lives, enabling secure commerce anywhere, anytime, any way

110 million active accounts, 190 markets, 25 currencies
BEHAVIORAL TRACKING VISION

Understand our customers' behavior and experience anytime, anywhere, any way to drive desirable outcomes for our customers and for PayPal.

•  Ensure privacy, security and trust for our customers
•  Ensure instrumentation standardization across channels
•  Enable self-service analytics for our product and marketing teams
TRACKING PLATFORM OVERVIEW

•  Channels: Direct/Home Page, Transaction Emails, Email Marketing, Display Advertising, Search Engine Marketing
•  Metadata: Tracking Metadata Tool, Taxonomy, Tag Catalog
•  Tracking Servers: Tracking Event Service, Tracking Validation Service
•  Real Time Systems: Marketing, Segmentation, Experimentation
•  Big Data: Reporting/Visualization, Digital Metrics, Attribution
METADATA - ENTITY MODEL

[Entity diagram: LAYOUT, PAGE, ELEMENTS, LINK, COMPONENTS]
METADATA - EVENT MODEL

•  Tracking Event
        •  Impression Event
                •  Component Impression Event
                •  Page Impression Event
                        •  Client Page Impression Event
                        •  Server Page Impression Event
                •  Ad Impression Event
        •  Reaction Event
                •  Click Event
                •  Click-Through Event
                •  Mouse-over Event
        •  Conversion Event
•  Entry Event and Exit Event appear at the lowest level of the diagram, alongside the client and server page impression events
ATTRIBUTION MODEL

Channel                Impression (Client)   Impression (Server)   Click   Open
Direct                 ✓                     ✓                     -       -
Organic Search         ✓                     -                     -       -
Paid Search            -                     -                     ✓       -
Display Offers         -                     ✓                     ✓       -
Onsite Offers          -                     ✓                     ✓       -
Transactional Emails   -                     -                     ✓       ✓
Marketing Emails       -                     -                     ✓       ✓
LOGICAL ARCHITECTURE

•  Onsite Channels: Web Tracking JS, Mobile Instrumentation API
•  Marketing Channels: Search Engine Marketing Instrumentation, Social Marketing, Email Marketing, Onsite Marketing, Display Advertising
•  Metadata: Metadata Tool, Tracking Metadata Service, Metadata Repository, Tag Catalog
•  Tracking: Tracking Servers, Tracking Service, Tracking Collector, Tracking Batch
•  Messaging: Active MQ Producer, Active MQ, Active MQ Consumer (two instances)
•  Storage and preparation: NAS Filer (two instances), Aggregation/Compression
•  Message Delivery Services: Segmentation Service, Marketing Offers Service
•  Hadoop Cluster: Customer Intelligence, Operational Metrics, Behavioral Intelligence, Reporting, Sessionization, Identity Mapping, Bot Flagging
DATA INGEST PIPELINE

PRE-PROCESS
•  Raw Event files (Gzip Text) → Map Reduce: Validate/Dedup Events → Deduped Event (Gzip block compressed SequenceFile)
•  Deduped Event → Map Reduce: Join Client & Server Events → Enriched Event (Gzip block compressed SequenceFile)

SESSIONIZATION
•  Enriched Event → Map Reduce: Sessionization → CHAIN REDUCER (Mapper: Geo Lookup, using Geo Data → Mapper: Bot Flagging, using Bot Data/Rules) → Sessions

METRICS GENERATION
•  Sessions → Map Reduce: Stage 1 → Map Reduce: Stage 2 → Behavioral Metrics → Reporting MySQL
•  Enriched Event → Pig → Adhoc Metrics
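
The SESSIONIZATION stage runs Geo Lookup and Bot Flagging as mappers chained after the sessionization reducer. Hadoop's ChainReducer supports this pattern: mappers run on the reducer's output inside the same reduce task, so no extra job is needed. Below is a minimal plain-Python sketch of that composition. It does not use Hadoop, and the geo table, the bot rule, and every field name are hypothetical; the deck only names "Geo Data" and "Bot Data/Rules".

```python
# Hypothetical lookup data, invented for illustration only.
GEO_BY_IP_PREFIX = {"10.1.": "ca, usa", "10.2.": "wb, in"}
BOT_USER_AGENT_MARKERS = ("bot", "crawler", "spider")


def geo_lookup(sessions):
    """Chained mapper 1: attach a geo label from an assumed IP-prefix table."""
    for s in sessions:
        ip = str(s.get("ip", ""))
        geo = next((g for p, g in GEO_BY_IP_PREFIX.items() if ip.startswith(p)),
                   "unknown")
        yield {**s, "geo": geo}


def bot_flagging(sessions):
    """Chained mapper 2: flag sessions whose user agent matches an assumed rule."""
    for s in sessions:
        ua = str(s.get("user_agent", "")).lower()
        yield {**s, "bot": any(marker in ua for marker in BOT_USER_AGENT_MARKERS)}


def chain(reducer_output, stages):
    """Feed the reducer's records through each mapper in order, as a stream."""
    stream = iter(reducer_output)
    for stage in stages:
        stream = stage(stream)
    return list(stream)


# Stand-in for the records emitted by the sessionization reducer.
reduced = [
    {"visitor_id": "V1", "session_id": "S1", "ip": "10.1.0.7",
     "user_agent": "Mozilla/4.0 MSIE"},
    {"visitor_id": "V9", "session_id": "S9", "ip": "10.3.0.1",
     "user_agent": "ExampleBot/1.0"},
]
sessions = chain(reduced, [geo_lookup, bot_flagging])
for s in sessions:
    print(s["visitor_id"], s["session_id"], s["geo"], s["bot"])
```

Because each stage is a generator, records flow through both mappers one at a time, which mirrors how chained mappers avoid writing an intermediate dataset.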
SESSIONIZATION

Events

Visitor ID   Session ID   Timestamp          Event Payload
V1           S1           2012-05-24 05:12   E1
V2           S2           2012-05-24 05:14   E2
V1           S1           2012-05-24 05:15   E3
V1           S1           2012-05-24 05:20   E4
V2           S2           2012-05-24 05:21   E5
V1           S3           2012-05-24 07:25   E6
V1           S3           2012-05-24 07:26   E7

VisitContainer

Visitor ID   Session ID   Payload (session attributes)                                Payload (events)
V1           S1           ie, winnt, {flash, quicktime}, {ca, usa}, 480 secs, …       E1, E3, E4
V2           S2           ff, winxp, {acrobat, mediaplayer}, {wb, in}, 420 secs, …    E2, E5
V1           S3           sf, macos, {quicktime, java}, {on, ca}, 60 secs             E6, E7

•  Chronologically sort events using secondary sort
        •  SortComparator on visitorid, sessionid and timestamp
        •  Partitioner & Grouping comparator on visitorid and sessionid
•  Normalize data and store it against the session record
        •  Browser, os, plugins, geo-location, duration, bot-flag etc.
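
The secondary sort described above can be simulated in plain Python, using the seven events from this slide. This is a sketch of the idea, not the Hadoop job itself: `sorted` stands in for the SortComparator, `groupby` for the Grouping comparator, and the dictionary layout of the container is an assumption. In a real job the Partitioner would additionally route every event of one (visitorid, sessionid) pair to the same reducer.

```python
from datetime import datetime
from itertools import groupby

# Events from the slide: (visitor_id, session_id, timestamp, payload).
EVENTS = [
    ("V1", "S1", "2012-05-24 05:12", "E1"),
    ("V2", "S2", "2012-05-24 05:14", "E2"),
    ("V1", "S1", "2012-05-24 05:15", "E3"),
    ("V1", "S1", "2012-05-24 05:20", "E4"),
    ("V2", "S2", "2012-05-24 05:21", "E5"),
    ("V1", "S3", "2012-05-24 07:25", "E6"),
    ("V1", "S3", "2012-05-24 07:26", "E7"),
]


def parse(ts):
    return datetime.strptime(ts, "%Y-%m-%d %H:%M")


def sessionize(events):
    # Sort comparator: order by visitor, then session, then timestamp.
    ordered = sorted(events, key=lambda e: (e[0], e[1], parse(e[2])))
    containers = []
    # Grouping comparator: one group (one reduce call) per (visitor, session).
    for (visitor, session), group in groupby(ordered, key=lambda e: (e[0], e[1])):
        rows = list(group)  # already in chronological order
        duration = parse(rows[-1][2]) - parse(rows[0][2])
        containers.append({
            "visitor_id": visitor,
            "session_id": session,
            "duration_secs": int(duration.total_seconds()),
            "events": [r[3] for r in rows],
        })
    return containers


containers = sessionize(EVENTS)
for c in containers:
    print(c)
```

The computed durations (480, 60 and 420 seconds for S1, S3 and S2) match the values in the VisitContainer payloads, since each is the gap between the first and last event of the session.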
DIMENSIONS & METRICS

•  Dimensions: Page, PageFlow, Country, CountryRegion, Plugins, VisitDepth, VisitDuration, VisitByHour, SearchEngine, OS, Browser
•  Metrics: Visitors, Sessions, Bounce Rate, Page Views
•  Time Period: Hourly, Daily, Weekly, Monthly
METRICS GENERATION

STAGE 1: Compute sessions sorted by visitor, dimension (browser)

Mapper Input                          Mapper Output
Visitor ID   Session ID   Browser     Key (visitorid, browser)   Value (#sessions)
V1           S1           IE          V1,IE                      1
V1           S2           IE          V1,IE                      1
V2           S3           FF          V2,FF                      1
V3           S4           IE          V3,IE                      1
V4           S5           FF          V4,FF                      1

Reducer Output
Key (visitorid, browser)   Value (#sessions)
V1,IE                      2
V2,FF                      1
V3,IE                      1
V4,FF                      1

STAGE 2: Compute metrics by dimension

Mapper Input                                   Mapper Output
Key (visitorid, browser)   Value (#sessions)   Key (browser)   Value (#sessions, #visitors)
V1,IE                      2                   IE              2,1
V2,FF                      1                   FF              1,1
V3,IE                      1                   IE              1,1
V4,FF                      1                   FF              1,1

Reducer Output
Key (browser)   Value (#sessions, #visitors)
IE              3,2
FF              2,2
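
The two stages can be traced in plain Python with the five Stage 1 input rows from this slide. This is a sketch of the logic only, not the MapReduce jobs: the function names are illustrative, and it assumes one input row per distinct session. For those five rows it computes 3 sessions and 2 visitors for IE, and 2 sessions and 2 visitors for FF.

```python
from collections import defaultdict

# Stage 1 mapper input from the slide: (visitor_id, session_id, browser).
ROWS = [
    ("V1", "S1", "IE"),
    ("V1", "S2", "IE"),
    ("V2", "S3", "FF"),
    ("V3", "S4", "IE"),
    ("V4", "S5", "FF"),
]


def stage1(rows):
    """Count sessions per (visitor, browser)."""
    counts = defaultdict(int)
    for visitor, _session, browser in rows:
        # Map: emit ((visitor, browser), 1). Reduce: sum the ones per key.
        counts[(visitor, browser)] += 1
    return dict(counts)


def stage2(stage1_output):
    """Roll up to (#sessions, #visitors) per browser."""
    totals = defaultdict(lambda: [0, 0])
    for (_visitor, browser), n_sessions in stage1_output.items():
        # Map: emit (browser, (n_sessions, 1)). Each stage 1 key is one
        # distinct visitor for that browser, so it contributes 1 visitor.
        totals[browser][0] += n_sessions
        totals[browser][1] += 1
    return {browser: tuple(pair) for browser, pair in totals.items()}


per_visitor = stage1(ROWS)
per_browser = stage2(per_visitor)
print(per_visitor)
print(per_browser)
```

Splitting the work this way means the distinct-visitor count in the second stage is a plain sum, because the first stage has already collapsed each visitor to a single record per dimension value.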
PIG – ADHOC QUERIES
/* EventLoader - custom loader ; Exposes correct data-types using metadata for each field*/

grunt> data = LOAD '/paypal/event' USING
>> com.paypal.EventLoader(
>> 'visitor_id, session_id, page_name, event_type, event_timestamp');

grunt> describe data;
data: {visitor_id: chararray, session_id: chararray, page_name: chararray,
event_type: chararray, event_timestamp: long }

grunt> events = FILTER data BY event_timestamp >= 1337583600000L and
event_timestamp < 1337587200000L;

grunt> grouped = group events by (page_name, event_type) parallel 20;
grunt> result = foreach grouped {
>>      visitors = distinct events.visitor_id;
>>      sessions = distinct events.session_id;
>>      generate group, COUNT(visitors), COUNT(sessions), COUNT(events);
>> };

grunt> dump result;
((My Account Overview, im), 117875L,119343L,230216L)
((mktg:xsell:merchant::home-inside, im), 462L,466L,655L)
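
For readers less familiar with Pig, the query above filters events to a one-hour window (the two bounds differ by 3,600,000 ms) and then, per (page_name, event_type) group, counts distinct visitors, distinct sessions and total events. A plain-Python sketch of the same computation follows; the sample rows are hypothetical and only the window bounds come from the slide.

```python
from collections import defaultdict

START_MS = 1337583600000  # inclusive lower bound from the FILTER
END_MS = 1337587200000    # exclusive upper bound from the FILTER

# Hypothetical rows: (visitor_id, session_id, page_name, event_type, event_timestamp).
EVENTS = [
    ("V1", "S1", "My Account Overview", "im", 1337583600000),
    ("V1", "S1", "My Account Overview", "im", 1337583660000),
    ("V1", "S2", "My Account Overview", "im", 1337585000000),
    ("V2", "S3", "My Account Overview", "im", 1337586000000),
    ("V2", "S3", "My Account Overview", "cl", 1337586100000),
    ("V3", "S4", "My Account Overview", "im", 1337587200000),  # outside the window
]


def page_metrics(events, start_ms, end_ms):
    groups = defaultdict(lambda: {"visitors": set(), "sessions": set(), "events": 0})
    for visitor, session, page, event_type, ts in events:
        if start_ms <= ts < end_ms:                # FILTER ... BY event_timestamp
            g = groups[(page, event_type)]         # GROUP ... BY (page_name, event_type)
            g["visitors"].add(visitor)             # DISTINCT events.visitor_id
            g["sessions"].add(session)             # DISTINCT events.session_id
            g["events"] += 1                       # COUNT(events)
    return {key: (len(g["visitors"]), len(g["sessions"]), g["events"])
            for key, g in groups.items()}


result = page_metrics(EVENTS, START_MS, END_MS)
for key, counts in sorted(result.items()):
    print(key, counts)
```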
PIG – ADHOC QUERIES
/* VisitContainerLoader custom loader - Tuple ( Tuple, Bag (Tuple) )*/

grunt> data = LOAD '/paypal/visitcontainer'
>> USING com.paypal.VisitContainerLoader(
>> '{"visit":["visitor_id","session_id","session_start", "session_end", "browser_type"],
"events":["page_name", "event_type"]}');

grunt> describe data;
data: {visit: (visitor_id: chararray, session_id: chararray, session_start: long, session_end:
long, browser_type: chararray),
        events: {event: (page_name: chararray, event_type: chararray)}}

grunt> flattened = foreach data generate FLATTEN(visit), FLATTEN(events);
grunt> impression = FILTER flattened BY event_type == 'im' and session_start >=
1339045200000L and session_end < 1339063200000L;
grunt> grouped = group impression by (page_name, browser_type) parallel 20;
grunt> result = foreach grouped {
>> visitors = distinct impression.visitor_id;
>> sessions = distinct impression.session_id;
>> generate group, COUNT(visitors), COUNT(sessions), COUNT(impression);
>> };

grunt> dump result;
((Account History:Request Money Details, chrome), 522L,528L,726L)
((Account History:Request Money Details, msie), 706L,716L,967L)
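
The key step in this query is `FLATTEN(visit), FLATTEN(events)`: it turns each VisitContainer (one visit tuple plus a bag of event tuples) into one flat row per event, and a container with an empty bag produces no rows. A small plain-Python sketch of that behavior and of the subsequent filter; the sample containers are hypothetical, while the field order and window bounds follow the slide.

```python
# Hypothetical containers: (visit tuple, bag of event tuples).
# visit = (visitor_id, session_id, session_start, session_end, browser_type)
# event = (page_name, event_type)
CONTAINERS = [
    (("V1", "S1", 1339045200000, 1339045680000, "chrome"),
     [("Account History:Request Money Details", "im"), ("Send Money", "cl")]),
    (("V2", "S2", 1339050000000, 1339050420000, "msie"),
     [("Account History:Request Money Details", "im")]),
    (("V3", "S3", 1339060000000, 1339070000000, "msie"),
     []),  # empty bag: contributes no flattened rows
]


def flatten(containers):
    """One output row per (visit, event) pair: visit fields, then event fields."""
    for visit, events in containers:
        for event in events:
            yield visit + event


def impressions_in_window(rows, start_ms, end_ms):
    """Keep impression events whose session lies inside the window."""
    return [r for r in rows
            if r[6] == "im" and r[2] >= start_ms and r[3] < end_ms]


flattened = list(flatten(CONTAINERS))
impression = impressions_in_window(flattened, 1339045200000, 1339063200000)
for row in impression:
    print(row)
```

Storing events nested under the visit keeps session attributes such as browser_type in one place; flattening is only paid for by queries that need per-event rows.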
REPORTING




THANK YOU


We Are Hiring!
•  San Jose
•  Boston
•  Bangalore
•  Shanghai

More Related Content

What's hot

Data Engineering 101
Data Engineering 101Data Engineering 101
Data Engineering 101
DaeMyung Kang
 
Api clarity webinar
Api clarity webinarApi clarity webinar
Api clarity webinar
LibbySchulze
 
Digital Banking - Industry Trends for Customer Service
Digital Banking - Industry Trends for Customer ServiceDigital Banking - Industry Trends for Customer Service
Digital Banking - Industry Trends for Customer Service
Gianluca Ferranti
 
Using Big Data to Driving Big Engagement
Using Big Data to Driving Big EngagementUsing Big Data to Driving Big Engagement
Using Big Data to Driving Big Engagement
Amazon Web Services
 
APIdays Open Banking & Fintech: Workshop - Financial Services Use Cases for APIs
APIdays Open Banking & Fintech: Workshop - Financial Services Use Cases for APIsAPIdays Open Banking & Fintech: Workshop - Financial Services Use Cases for APIs
APIdays Open Banking & Fintech: Workshop - Financial Services Use Cases for APIs
Jeremy Brown
 
S3をDB利用 ショッピングセンター向けポイントシステム概要
S3をDB利用 ショッピングセンター向けポイントシステム概要S3をDB利用 ショッピングセンター向けポイントシステム概要
S3をDB利用 ショッピングセンター向けポイントシステム概要
一成 田部井
 
Trends and practical applications of AI/ML in Fin Tech industry - Milos Kosan...
Trends and practical applications of AI/ML in Fin Tech industry - Milos Kosan...Trends and practical applications of AI/ML in Fin Tech industry - Milos Kosan...
Trends and practical applications of AI/ML in Fin Tech industry - Milos Kosan...
Institute of Contemporary Sciences
 
Blockchain and banking
Blockchain and bankingBlockchain and banking
Blockchain and banking
Anisha Sachit
 
Indian Wealthtech – A $60 Bn opportunity by FY25
Indian Wealthtech – A $60 Bn opportunity by FY25Indian Wealthtech – A $60 Bn opportunity by FY25
Indian Wealthtech – A $60 Bn opportunity by FY25
RedSeer
 
Digital Lending in India
Digital Lending in IndiaDigital Lending in India
Digital Lending in India
Sam Ghosh
 
LeanIX GraphQL Lessons Learned - CodeTalks 2017
LeanIX GraphQL Lessons Learned - CodeTalks 2017LeanIX GraphQL Lessons Learned - CodeTalks 2017
LeanIX GraphQL Lessons Learned - CodeTalks 2017
LeanIX GmbH
 
The Journey to Digital Transformation with Touch Bank
The Journey to Digital Transformation with Touch BankThe Journey to Digital Transformation with Touch Bank
The Journey to Digital Transformation with Touch Bank
Backbase
 
Benchmark Slide Deck
Benchmark Slide Deck Benchmark Slide Deck
Benchmark Slide Deck
Eric Santos
 
슬라이드쉐어
슬라이드쉐어슬라이드쉐어
슬라이드쉐어
sungminlee
 
Banking as a Service (download)
Banking as a Service (download)Banking as a Service (download)
Banking as a Service (download)
Chris Skinner
 
Going Digital: The Banking Transformation Road Map
Going Digital: The Banking Transformation Road MapGoing Digital: The Banking Transformation Road Map
Going Digital: The Banking Transformation Road Map
Semalytix
 
Fintech Vietnam Startups Report
Fintech Vietnam Startups ReportFintech Vietnam Startups Report
Fintech Vietnam Startups Report
Christian König
 
서비스 기획자를 위한 데이터분석 시작하기
서비스 기획자를 위한 데이터분석 시작하기서비스 기획자를 위한 데이터분석 시작하기
서비스 기획자를 위한 데이터분석 시작하기
승화 양
 
Webcast: Deep-Dive Apigee Edge Microgateway
Webcast: Deep-Dive Apigee Edge MicrogatewayWebcast: Deep-Dive Apigee Edge Microgateway
Webcast: Deep-Dive Apigee Edge Microgateway
Apigee | Google Cloud
 
Digital redefinition of banking banking transformation
Digital redefinition of banking   banking transformationDigital redefinition of banking   banking transformation
Digital redefinition of banking banking transformation
Draup
 

What's hot (20)

Data Engineering 101
Data Engineering 101Data Engineering 101
Data Engineering 101
 
Api clarity webinar
Api clarity webinarApi clarity webinar
Api clarity webinar
 
Digital Banking - Industry Trends for Customer Service
Digital Banking - Industry Trends for Customer ServiceDigital Banking - Industry Trends for Customer Service
Digital Banking - Industry Trends for Customer Service
 
Using Big Data to Driving Big Engagement
Using Big Data to Driving Big EngagementUsing Big Data to Driving Big Engagement
Using Big Data to Driving Big Engagement
 
APIdays Open Banking & Fintech: Workshop - Financial Services Use Cases for APIs
APIdays Open Banking & Fintech: Workshop - Financial Services Use Cases for APIsAPIdays Open Banking & Fintech: Workshop - Financial Services Use Cases for APIs
APIdays Open Banking & Fintech: Workshop - Financial Services Use Cases for APIs
 
S3をDB利用 ショッピングセンター向けポイントシステム概要
S3をDB利用 ショッピングセンター向けポイントシステム概要S3をDB利用 ショッピングセンター向けポイントシステム概要
S3をDB利用 ショッピングセンター向けポイントシステム概要
 
Trends and practical applications of AI/ML in Fin Tech industry - Milos Kosan...
Trends and practical applications of AI/ML in Fin Tech industry - Milos Kosan...Trends and practical applications of AI/ML in Fin Tech industry - Milos Kosan...
Trends and practical applications of AI/ML in Fin Tech industry - Milos Kosan...
 
Blockchain and banking
Blockchain and bankingBlockchain and banking
Blockchain and banking
 
Indian Wealthtech – A $60 Bn opportunity by FY25
Indian Wealthtech – A $60 Bn opportunity by FY25Indian Wealthtech – A $60 Bn opportunity by FY25
Indian Wealthtech – A $60 Bn opportunity by FY25
 
Digital Lending in India
Digital Lending in IndiaDigital Lending in India
Digital Lending in India
 
LeanIX GraphQL Lessons Learned - CodeTalks 2017
LeanIX GraphQL Lessons Learned - CodeTalks 2017LeanIX GraphQL Lessons Learned - CodeTalks 2017
LeanIX GraphQL Lessons Learned - CodeTalks 2017
 
The Journey to Digital Transformation with Touch Bank
The Journey to Digital Transformation with Touch BankThe Journey to Digital Transformation with Touch Bank
The Journey to Digital Transformation with Touch Bank
 
Benchmark Slide Deck
Benchmark Slide Deck Benchmark Slide Deck
Benchmark Slide Deck
 
슬라이드쉐어
슬라이드쉐어슬라이드쉐어
슬라이드쉐어
 
Banking as a Service (download)
Banking as a Service (download)Banking as a Service (download)
Banking as a Service (download)
 
Going Digital: The Banking Transformation Road Map
Going Digital: The Banking Transformation Road MapGoing Digital: The Banking Transformation Road Map
Going Digital: The Banking Transformation Road Map
 
Fintech Vietnam Startups Report
Fintech Vietnam Startups ReportFintech Vietnam Startups Report
Fintech Vietnam Startups Report
 
서비스 기획자를 위한 데이터분석 시작하기
서비스 기획자를 위한 데이터분석 시작하기서비스 기획자를 위한 데이터분석 시작하기
서비스 기획자를 위한 데이터분석 시작하기
 
Webcast: Deep-Dive Apigee Edge Microgateway
Webcast: Deep-Dive Apigee Edge MicrogatewayWebcast: Deep-Dive Apigee Edge Microgateway
Webcast: Deep-Dive Apigee Edge Microgateway
 
Digital redefinition of banking banking transformation
Digital redefinition of banking   banking transformationDigital redefinition of banking   banking transformation
Digital redefinition of banking banking transformation
 

Viewers also liked

Big data – can it deliver speed and accuracy v1
Big data – can it deliver speed and accuracy v1Big data – can it deliver speed and accuracy v1
Big data – can it deliver speed and accuracy v1GurinderG
 
Big Data: It's More Than Volume, Paypal
Big Data: It's More Than Volume, PaypalBig Data: It's More Than Volume, Paypal
Big Data: It's More Than Volume, Paypal
Innovation Enterprise
 
EAP - Accelerating behavorial analytics at PayPal using Hadoop
EAP - Accelerating behavorial analytics at PayPal using HadoopEAP - Accelerating behavorial analytics at PayPal using Hadoop
EAP - Accelerating behavorial analytics at PayPal using Hadoop
DataWorks Summit
 
Big- Data and Risk Management - Ido Lustig, PayPal
Big- Data and Risk Management - Ido Lustig, PayPalBig- Data and Risk Management - Ido Lustig, PayPal
Big- Data and Risk Management - Ido Lustig, PayPal
Codemotion Tel Aviv
 
Importance of connecting CRM with ERP
Importance of connecting CRM with ERPImportance of connecting CRM with ERP
Importance of connecting CRM with ERP
APPSeCONNECT
 
Paypal Platform: Evolving for simplicity and reach - IBM Silicon Valley Lab
Paypal Platform: Evolving for simplicity and reach - IBM Silicon Valley LabPaypal Platform: Evolving for simplicity and reach - IBM Silicon Valley Lab
Paypal Platform: Evolving for simplicity and reach - IBM Silicon Valley LabDeepak Nadig
 
H2O World - Data Science w/ Big Data in a Corporate Environment - Nachum Shacham
H2O World - Data Science w/ Big Data in a Corporate Environment - Nachum ShachamH2O World - Data Science w/ Big Data in a Corporate Environment - Nachum Shacham
H2O World - Data Science w/ Big Data in a Corporate Environment - Nachum Shacham
Sri Ambati
 
Cloud Integration Services on SAP HANA Cloud Platform
Cloud Integration Services on SAP HANA Cloud PlatformCloud Integration Services on SAP HANA Cloud Platform
Cloud Integration Services on SAP HANA Cloud Platform
Michael Hill
 
Innovating to Real-Time using SAP BusinessObjects & SAP HANA
Innovating to Real-Time using SAP BusinessObjects & SAP HANAInnovating to Real-Time using SAP BusinessObjects & SAP HANA
Innovating to Real-Time using SAP BusinessObjects & SAP HANA
Kurt J. Bilafer
 
Self-service BI for SAP and HANA – Dream or Reality?
Self-service BI for SAP and HANA – Dream or Reality?Self-service BI for SAP and HANA – Dream or Reality?
Self-service BI for SAP and HANA – Dream or Reality?
Ocean9, Inc.
 
Baan
BaanBaan
PayPal Real Time Analytics
PayPal  Real Time AnalyticsPayPal  Real Time Analytics
PayPal Real Time Analytics
Anil Madan
 
Cio forum s4hana
Cio forum s4hanaCio forum s4hana
Cio forum s4hana
Ajay Kumar Uppal
 
SAP C4C overview
SAP C4C overviewSAP C4C overview
SAP C4C overview
Ripunjay Rathaur
 
PayPal Big Data and MySQL Cluster
PayPal Big Data and MySQL ClusterPayPal Big Data and MySQL Cluster
PayPal Big Data and MySQL Cluster
Mat Keep
 
Hadoop Graph Processing with Apache Giraph
Hadoop Graph Processing with Apache GiraphHadoop Graph Processing with Apache Giraph
Hadoop Graph Processing with Apache Giraph
DataWorks Summit
 
Big data, Behavioral Change and IOT Architecture
Big data, Behavioral Change and IOT ArchitectureBig data, Behavioral Change and IOT Architecture
Big data, Behavioral Change and IOT Architecture
Yves Caseau
 
Software Consultancy (CRM-ERP-EPM-SCM-SOCIAL CLOUD)
Software Consultancy (CRM-ERP-EPM-SCM-SOCIAL CLOUD)Software Consultancy (CRM-ERP-EPM-SCM-SOCIAL CLOUD)
Software Consultancy (CRM-ERP-EPM-SCM-SOCIAL CLOUD)
Ibrahim Younis
 
eCommerce and ePayments markets in Russia : trends , analytics , perspect...
eCommerce and  ePayments markets in  Russia :  trends ,  analytics , perspect...eCommerce and  ePayments markets in  Russia :  trends ,  analytics , perspect...
eCommerce and ePayments markets in Russia : trends , analytics , perspect...
Data Insight
 
Sap hybris overview
Sap hybris overviewSap hybris overview
Sap hybris overview
krishna arjun
 

Viewers also liked (20)

Big data – can it deliver speed and accuracy v1
Big data – can it deliver speed and accuracy v1Big data – can it deliver speed and accuracy v1
Big data – can it deliver speed and accuracy v1
 
Big Data: It's More Than Volume, Paypal
Big Data: It's More Than Volume, PaypalBig Data: It's More Than Volume, Paypal
Big Data: It's More Than Volume, Paypal
 
EAP - Accelerating behavorial analytics at PayPal using Hadoop
EAP - Accelerating behavorial analytics at PayPal using HadoopEAP - Accelerating behavorial analytics at PayPal using Hadoop
EAP - Accelerating behavorial analytics at PayPal using Hadoop
 
Big- Data and Risk Management - Ido Lustig, PayPal
Big- Data and Risk Management - Ido Lustig, PayPalBig- Data and Risk Management - Ido Lustig, PayPal
Big- Data and Risk Management - Ido Lustig, PayPal
 
Importance of connecting CRM with ERP
Importance of connecting CRM with ERPImportance of connecting CRM with ERP
Importance of connecting CRM with ERP
 
Paypal Platform: Evolving for simplicity and reach - IBM Silicon Valley Lab
Paypal Platform: Evolving for simplicity and reach - IBM Silicon Valley LabPaypal Platform: Evolving for simplicity and reach - IBM Silicon Valley Lab
Paypal Platform: Evolving for simplicity and reach - IBM Silicon Valley Lab
 
H2O World - Data Science w/ Big Data in a Corporate Environment - Nachum Shacham
H2O World - Data Science w/ Big Data in a Corporate Environment - Nachum ShachamH2O World - Data Science w/ Big Data in a Corporate Environment - Nachum Shacham
H2O World - Data Science w/ Big Data in a Corporate Environment - Nachum Shacham
 
Cloud Integration Services on SAP HANA Cloud Platform
Cloud Integration Services on SAP HANA Cloud PlatformCloud Integration Services on SAP HANA Cloud Platform
Cloud Integration Services on SAP HANA Cloud Platform
 
Innovating to Real-Time using SAP BusinessObjects & SAP HANA
Innovating to Real-Time using SAP BusinessObjects & SAP HANAInnovating to Real-Time using SAP BusinessObjects & SAP HANA
Innovating to Real-Time using SAP BusinessObjects & SAP HANA
 
Self-service BI for SAP and HANA – Dream or Reality?
Self-service BI for SAP and HANA – Dream or Reality?Self-service BI for SAP and HANA – Dream or Reality?
Self-service BI for SAP and HANA – Dream or Reality?
 
Baan
BaanBaan
Baan
 
PayPal Real Time Analytics
PayPal  Real Time AnalyticsPayPal  Real Time Analytics
PayPal Real Time Analytics
 
Cio forum s4hana
Cio forum s4hanaCio forum s4hana
Cio forum s4hana
 
SAP C4C overview
SAP C4C overviewSAP C4C overview
SAP C4C overview
 
PayPal Big Data and MySQL Cluster
PayPal Big Data and MySQL ClusterPayPal Big Data and MySQL Cluster
PayPal Big Data and MySQL Cluster
 
Hadoop Graph Processing with Apache Giraph
Hadoop Graph Processing with Apache GiraphHadoop Graph Processing with Apache Giraph
Hadoop Graph Processing with Apache Giraph
 
Big data, Behavioral Change and IOT Architecture
Big data, Behavioral Change and IOT ArchitectureBig data, Behavioral Change and IOT Architecture
Big data, Behavioral Change and IOT Architecture
 
Software Consultancy (CRM-ERP-EPM-SCM-SOCIAL CLOUD)
Software Consultancy (CRM-ERP-EPM-SCM-SOCIAL CLOUD)Software Consultancy (CRM-ERP-EPM-SCM-SOCIAL CLOUD)
Software Consultancy (CRM-ERP-EPM-SCM-SOCIAL CLOUD)
 
eCommerce and ePayments markets in Russia : trends , analytics , perspect...
eCommerce and  ePayments markets in  Russia :  trends ,  analytics , perspect...eCommerce and  ePayments markets in  Russia :  trends ,  analytics , perspect...
eCommerce and ePayments markets in Russia : trends , analytics , perspect...
 
Sap hybris overview
Sap hybris overviewSap hybris overview
Sap hybris overview
 


PayPal Behavioral Analytics on Hadoop

  • 1. PAYPAL - BEHAVIORAL TRACKING ON HADOOP. ANIL MADAN, DIRECTOR OF ENGINEERING, MARKETING & ANALYTICS
  • 2. PAYPAL'S VISION
      Delivering the future of money today…
      An essential part of our customer's financial and business lives, enabling secure commerce anywhere, anytime, any way
      110 million active accounts, 190 markets, 25 currencies
  • 3. BEHAVIORAL TRACKING VISION
      Understand our customer’s behavior and experience anytime, anywhere, any way to drive desirable outcomes for our customers and for PayPal.
      Ensure privacy, security and trust for our customers
      Ensure instrumentation standardization across channels
      Enable self-service analytics for our product and marketing teams
  • 4. TRACKING PLATFORM OVERVIEW
      Direct/Home Page, Transaction Emails, Email Marketing, Display Advertising, Search Engine Marketing
      Metadata: Tracking Metadata Tool, Taxonomy, Tag Catalog
      Tracking Servers: Tracking Event Service, Tracking Validation Service
      Real Time Systems: Marketing Segmentation, Experimentation
      Big Data
      Reporting/Visualization: Digital Metrics, Attribution
  • 5. METADATA - ENTITY MODEL LAYOUT PAGE ELEMENTS LINK COMPONENTS
  • 6. METADATA - EVENT MODEL
      Tracking Event: Impression Event, Reaction Event, Conversion Event
      Next level: Component Impression Event, Page Impression Event, Ad Impression Event, Click Event, Click-Through Event, Mouse-over Event
      Lowest level: Client Page Impression Event, Server Page Impression Event, Entry Event, Exit Event
  • 7. ATTRIBUTION MODEL
      Column headers: Channel, Impression, Click, Open; sub-headers: Client, Server
      Direct ✓ ✓
      Organic Search ✓
      Paid Search ✓
      Display Offers ✓ ✓
      Onsite Offers ✓ ✓
      Transactional Emails ✓ ✓
      Marketing Emails ✓ ✓
  • 8. LOGICAL ARCHITECTURE (diagram)
      Onsite Channels: Mobile, Web
      Marketing Channels: Search Engine Marketing, Display Advertising, Social Marketing, Email Marketing
      Other diagram labels: Onsite Instrumentation, Tracking JS, API, Marketing Instrumentation, Message Delivery Services, Tracking Metadata Tool, Tracking Servers, Marketing Segmentation Service, Offers Service, Tracking Metadata Service, Tracking Service, Active MQ Producer, Active MQ Consumer, Tracking Collector, Hadoop Cluster, NAS Filer, Customer Intelligence, Operational Metrics, Metadata Repository, Tag Catalog, Behavioral Intelligence, Reporting, Aggregation/Compression, Sessionization, Identity Mapping, Bot Flagging, Batch
  • 9. DATA INGEST PIPELINE
      PRE-PROCESS: Raw Event (Gzip Text, two inputs) -> Map Reduce: Validate/Dedup -> Deduped Events (Gzip block compressed SequenceFile) -> Map Reduce: Event Join, Client & Server Events -> Enriched Event (Gzip block compressed SequenceFile)
      SESSIONIZATION: Map Reduce with CHAIN REDUCER: Sessionization -> Mapper: Geo Lookup (Geo Data) -> Mapper: Bot Flagging (Bot Data/Rules) -> Enriched Sessions
      METRICS GENERATION: Enriched Sessions -> Map Reduce Stage 1 -> Map Reduce Stage 2 -> Behavioral Metrics -> Reporting (MySQL)
      Pig: Enriched Event -> Adhoc Metrics
  • 10. SESSIONIZATION
      Events
        Visitor ID  Session ID  Timestamp         Event Payload
        V1          S1          2012-05-24 05:12  E1
        V2          S2          2012-05-24 05:14  E2
        V1          S1          2012-05-24 05:15  E3
        V1          S1          2012-05-24 05:20  E4
        V2          S2          2012-05-24 05:21  E5
        V1          S3          2012-05-24 07:25  E6
        V1          S3          2012-05-24 07:26  E7
      VisitContainer
        Visitor ID  Session ID  Session Payload                                             Events
        V1          S1          ie, winnt, {flash, quicktime}, {ca, usa}, 480 secs, ...     E1, E3, E4
        V2          S2          ff, winxp, {acrobat, mediaplayer}, {wb, in}, 420 secs, ...  E2, E5
        V1          S3          sf, macos, {quicktime, java}, {on, ca}, 60 secs             E6, E7
      • Chronologically sort events using secondary sort
        • SortComparator on visitorid, sessionid and timestamp
        • Partitioner & Grouping comparator on visitorid and sessionid
      • Normalize data and store it against the session record
        • Browser, os, plugins, geo-location, duration, bot-flag etc.
  • 11. DIMENSIONS & METRICS
      Dimension: Page, PageFlow, Country, CountryRegion, Plugins, VisitDepth, VisitDuration, VisitByHour, SearchEngine, OS, Browser
      Metrics: Visitors, Sessions, Bounce Rate, Page Views
      Time Period: Hourly, Daily, Weekly, Monthly
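  • 12. METRICS GENERATION
      STAGE 1: Compute sessions sorted by visitor, dimension (browser)
        Mapper Input (Visitor ID, Session ID, Browser):
          V1  S1  IE
          V1  S2  IE
          V2  S3  FF
          V3  S4  IE
          V4  S5  FF
        Mapper Output, key (visitorid, browser), value (#sessions):
          V1,IE  1
          V1,IE  1
          V2,FF  1
          V3,IE  1
          V4,FF  1
        Reducer Output, key (visitorid, browser), value (#sessions):
          V1,IE  2
          V2,FF  1
          V3,IE  1
          V4,FF  1
      STAGE 2: Compute metrics by dimension
        Mapper Input, key (visitorid, browser), value (#sessions):
          V1,IE  2
          V2,FF  1
          V3,IE  1
          V4,FF  1
        Mapper Output, key (browser), value (#sessions, #visitors):
          IE  2,1
          IE  1,1
          FF  1,1
          IE  1,1
        Reducer Output, key (browser), value (#sessions, #visitors):
          IE  4,3
          FF  1,1
  • 13. PIG - ADHOC QUERIES
      /* EventLoader - custom loader; exposes correct data types using metadata for each field */
      grunt> data = LOAD '/paypal/event' USING
      >>   com.paypal.EventLoader(
      >>   'visitor_id, session_id, page_name, event_type, event_timestamp');
      grunt> describe data;
      data: {visitor_id: chararray, session_id: chararray, page_name: chararray, event_type: chararray, event_timestamp: long}
      grunt> events = FILTER data BY event_timestamp >= 1337583600000L and event_timestamp < 1337587200000L;
      grunt> grouped = group events by (page_name, event_type) parallel 20;
      grunt> result = foreach grouped {
      >>   visitors = distinct events.visitor_id;
      >>   sessions = distinct events.session_id;
      >>   generate group, COUNT(visitors), COUNT(sessions), COUNT(events);
      >> };
      grunt> dump result;
      ((My Account Overview, im), 117875L,119343L,230216L)
      ((mktg:xsell:merchant::home-inside, im), 462L,466L,655L)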
  • 13. PIG – ADHOC QUERIES
      /* EventLoader - custom loader; exposes correct data types using metadata for each field */
      grunt> data = LOAD '/paypal/event' USING
      >>     com.paypal.EventLoader(
      >>     'visitor_id, session_id, page_name, event_type, event_timestamp');
      grunt> describe data;
      data: {visitor_id: chararray, session_id: chararray, page_name: chararray, event_type: chararray, event_timestamp: long}
      grunt> events = FILTER data BY event_timestamp >= 1337583600000L and event_timestamp < 1337587200000L;
      grunt> grouped = group events by (page_name, event_type) parallel 20;
      grunt> result = foreach grouped {
      >>     visitors = distinct events.visitor_id;
      >>     sessions = distinct events.session_id;
      >>     generate group, COUNT(visitors), COUNT(sessions), COUNT(events);
      >> };
      grunt> dump result;
      ((My Account Overview, im), 117875L,119343L,230216L)
      ((mktg:xsell:merchant::home-inside, im), 462L,466L,655L)
  • 14. PIG – ADHOC QUERIES
      /* VisitContainerLoader custom loader - Tuple ( Tuple, Bag (Tuple) ) */
      grunt> data = LOAD '/paypal/visitcontainer'
      >>     USING com.paypal.VisitContainerLoader(
      >>     '{"visit":["visitor_id","session_id","session_start", "session_end", "browser_type"], "events":["page_name", "event_type"]}');
      grunt> describe data;
      data: {visit: (visitor_id: chararray, session_id: chararray, session_start: long, session_end: long, browser_type: chararray), events: {event: (page_name: chararray, event_type: chararray)}}
      grunt> flattened = foreach data generate FLATTEN(visit), FLATTEN(events);
      grunt> impression = FILTER flattened BY event_type == 'im' and session_start >= 1339045200000L and session_end < 1339063200000L;
      grunt> grouped = group impression by (page_name, browser_type) parallel 20;
      grunt> result = foreach grouped {
      >>     visitors = distinct impression.visitor_id;
      >>     sessions = distinct impression.session_id;
      >>     generate group, COUNT(visitors), COUNT(sessions), COUNT(impression);
      >> };
      grunt> dump result;
      ((Account History:Request Money Details, chrome), 522L,528L,726L)
      ((Account History:Request Money Details, msie), 706L,716L,967L)
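For readers less familiar with Pig, the flatten, filter, group and distinct-count steps of the visit container query can be expressed as a plain Python function. This is an illustrative equivalent only: the function name is made up, and each container is assumed to be a `(visit, events)` pair of dictionaries shaped like the schema printed by `describe data`.

```python
from collections import defaultdict


def impressions_by_page_and_browser(containers, start_ms, end_ms):
    """Return {(page_name, browser_type): (#visitors, #sessions, #impressions)}.

    `containers` is an iterable of (visit, events) pairs, where `visit` is a dict
    with visitor_id, session_id, session_start, session_end and browser_type, and
    `events` is a list of dicts with page_name and event_type.
    """
    visitors = defaultdict(set)
    sessions = defaultdict(set)
    impressions = defaultdict(int)
    for visit, events in containers:
        # Same window test as the FILTER: session_start >= start and session_end < end.
        if not (visit["session_start"] >= start_ms and visit["session_end"] < end_ms):
            continue
        for event in events:  # FLATTEN(events): one row per event
            if event["event_type"] != "im":
                continue
            key = (event["page_name"], visit["browser_type"])  # GROUP BY key
            visitors[key].add(visit["visitor_id"])   # DISTINCT visitor_id
            sessions[key].add(visit["session_id"])   # DISTINCT session_id
            impressions[key] += 1                    # COUNT(impression)
    return {
        key: (len(visitors[key]), len(sessions[key]), impressions[key])
        for key in impressions
    }
```

Keeping the session attributes on the container and the events in a nested bag means session-level filters such as the time window are evaluated once per visit rather than once per event.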
  • 15. REPORTING
  • 16. THANK YOU We Are Hiring! •  San Jose •  Boston •  Bangalore •  Shanghai
  • 17. Sessions will resume at 4:30pm