Audience Intel presentation 2014
David Mitchell
Audience Intel is a web service provided by IXI Services, Equifax
1.
Audience Intel
J. David Mitchell
IXI Services, Equifax Inc.
7927 Jones Branch Drive, Suite 400 | McLean, VA 22102
© 2012 Equifax Inc.
2.
Overview or Outline
1. IXI Audience Intel application: High-level overview
   1. A graph data store: Neo4j
   2. A key-value data store: Redis
   3. How we query and filter Redis
2. Using ZeroMQ to build a compute cluster
   1. Design patterns
   2. How it works
3.
The Business Need
Audience Intel
4.
Background
Audience Intel (AI) will help our customers monitor and glean insights from their online marketing campaigns.
- AI will track click-and-conversion counts for our customers.
- AI will profile the click-and-conversion counts using IXI segments:
  - WealthComplete Total Investable Assets
  - WealthComplete Deposits
  - Financial Cohorts
  - Income360
  - Discretionary Spending
  - Ability to Pay
  - Economic Spectrum
  - Aggregated FICO scores
5.
Background: Use cases
The end-user application will query a persistent data store to answer their business questions.
- Mr. Jones from Razorfish would like to see clicks for all offers, all publishers and all creatives for June 1.
- Mr. Jones from Razorfish would like to see ATP (Ability to Pay) for clicks for all offers, all publishers and all creatives for June 1.
- Mr. Jones from Razorfish would like to see ATP for clicks for offer 1, goal 1, all publishers and creative 79 for June 1.
A partner will place an IXI empty gif on their page (or in their ad), so that we will get an entry in our Web server logs.
    GET /digi/23CE7C3A-FAE93B9DB863/a.gif?partner=0244&offer=1&goal=1&result=1&source=1&creativeid=1
Parsers will parse the log files for a time slot (e.g., a one-hour time slot) and do counts for each partner and campaign.
- Clicks and conversions
- IXI product lookup, e.g., based on the IP address or cookie (zip+4)
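The parse-and-count step described on this slide can be sketched as follows. This is a minimal illustration in Python, not the deck's actual parser: it assumes each log entry has been reduced to a `GET <url>` string, and the function names `parse_request_line` and `count_hits` are hypothetical.

```python
from collections import Counter
from urllib.parse import urlsplit, parse_qs

# Dimensions read from the query string of the tracking-pixel request.
DIMENSIONS = ("partner", "offer", "goal", "result", "source", "creativeid")

def parse_request_line(line):
    """Return a tuple of dimension values for one 'GET <url>' entry, or None."""
    parts = line.split()
    if len(parts) < 2 or parts[0] != "GET":
        return None
    query = parse_qs(urlsplit(parts[1]).query)
    try:
        return tuple(query[name][0] for name in DIMENSIONS)
    except KeyError:
        return None  # a required dimension is missing; skip the entry

def count_hits(lines):
    """Count entries per unique combination of dimension values."""
    counts = Counter()
    for line in lines:
        key = parse_request_line(line)
        if key is not None:
            counts[key] += 1
    return counts

pixel = "GET /digi/23CE7C3A-FAE93B9DB863/a.gif?partner=0244"
sample = [
    pixel + "&offer=1&goal=1&result=1&source=1&creativeid=1",
    pixel + "&offer=1&goal=1&result=1&source=1&creativeid=1",
    pixel + "&offer=2&goal=1&result=0&source=1&creativeid=1",
]
counts = count_hits(sample)
```

Each distinct combination of dimensions becomes one counter, which is the shape a key-value counter store needs.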
6.
Background
Query views or clicks or conversions of an audience
- By time: hour, day, week, month or whole campaign
- By partner (client or campaign)
- By offer
- By goal
- By result: achieved or not
- By source: publishers
- By creative
Glean insights from IXI products
  WealthComplete Deposits    8
  FinancialCohorts           61
  Income360                  11
  Discretionary Spending     9
  EconomicCohorts            71
  Ability to Pay (ATP)       4
  FICO scores                6
  Economic Spectrum          17
                             210
7.
Background: Examples of IXI products

WealthComplete Deposits (AB)
  Tiers   WealthComplete Deposits   # HHs         %
  1       $250K+                    5,754,096     4.78%
  2       $100K - $250K             12,536,735    10.42%
  3       $50K - $100K              13,288,411    11.04%
  4       $25K - $50K               14,722,193    12.23%
  5       $10K - $25K               18,336,190    15.24%
  6       $2.5K - $10K              20,246,716    16.82%
  7       $0.01 - $2.5K             34,993,949    29.08%
  8       $0                        467,498       0.39%
  Total                             120,345,788   100.00%

Ability to Pay (ATP) (ED)
  Tiers                        Data Labels                          # HHs         %
  1, 2, 3, 4, 5, 6, 7          Highest Ability to Pay: Top 20%      22,249,393    18.49%
  8, 9, 10, 11, 12             High Ability to Pay                  32,916,824    27.36%
  13, 14, 15, 16, 17           Moderate Ability to Pay              40,012,323    33.25%
  18, 19, 20, 21, 22, 23, 24   Lowest Ability to Pay: Bottom 20%    25,144,689    20.90%
  Total                                                             120,323,229   100.00%
8.
Background
Rough sketch (or mockup) of the UI for AI.
9.
Platform Architecture
Audience Intel
10.
Technology: High-Level overview
- Logs: Queuing component (Kafka)
  - Producer: Stream the log files into Kafka.
  - Consumer: Parse the log files in Kafka and do counts.
- Summarize: Key-value storage for the counters
  - Scalable with fast lookups.
- Metadata storage: Partner, client, campaign
  - Schema-less
  - Expresses relationships between entities easily
  - Fast lookups.
- Filtering API
- A UI
11.
Technology: High-Level overview
- Queuing component: Kafka
  - Apache Kafka is a distributed publish-subscribe messaging system written in Scala and Java.
  - incubator.apache.org/kafka/
- Data storage for the counters: Redis
  - Advanced key-value store or data structure server written in C.
  - redis.io
- Metadata storage: Neo4j
  - Java Graph database (neo4j.org)
  - High availability cluster option
- Filtering API: PHP with ZeroMQ
12.
RDBMS Design
Audience Intel
13.
RDBMS Design
14.
RDBMS Design
Use cases
- Mr. Jones from Razorfish would like to see clicks for all offers, all sources and all creatives for Dec. 29.
- Mr. Jones from Razorfish would like to see ATP for clicks for all offers, all sources and all creatives for Dec. 29.
- Mr. Jones from Razorfish would like to see ATP for clicks for offer 1, goal 1, all sources and creative 79 for Dec. 29.
First case: Get sum of clicks day 2012-12-29, partner 1, goal 1
    SELECT SUM(count) AS sum
    FROM ai201212
    WHERE partnerId = 1 AND day = 29 AND goal = 1;
Second use case: Get ATP day 2012-12-29, partner 1, goal 1
    SELECT ti.productTierId, SUM(ti.tierHH) AS tierHH
    FROM ai201212 ai
    INNER JOIN tiers201212 ti ON ai.aiId = ti.aiId
    WHERE ai.partnerId = 1 AND ai.day = 29 AND ai.goal = 1 AND ti.productId = 1
    GROUP BY ti.productTierId;
15.
RDBMS Design
Third use case: Get ATP day 2012-12-29, partner 1, goal 1, creativeId 79
    SELECT ti.productTierId,
           SUM(ti.tierHH) AS tierHH,
           SUM(ai.count) AS sum,
           (SUM(ti.tierHH) / SUM(ai.count)) * 100 AS percent
    FROM ai201212 ai
    INNER JOIN tiers201212 ti ON ai.aiId = ti.aiId
    WHERE ai.partnerId = 1 AND ai.day = 29 AND ai.goal = 1
      AND ti.productId = 1 AND ai.creativeId = 79
    GROUP BY ti.productTierId;
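To see the first two queries run end to end, here is a small sketch using Python's built-in `sqlite3` module. The table and column names follow the queries on the slides, with a single `ai201212` table assumed for the December 2012 counts. The column types and the sample rows are assumptions made purely for illustration, since the deck does not show the schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: only the columns the slide queries reference.
conn.executescript("""
    CREATE TABLE ai201212 (
        aiId INTEGER PRIMARY KEY, partnerId INTEGER, day INTEGER,
        goal INTEGER, creativeId INTEGER, count INTEGER
    );
    CREATE TABLE tiers201212 (
        aiId INTEGER, productId INTEGER, productTierId INTEGER, tierHH INTEGER
    );
    INSERT INTO ai201212 VALUES
        (1, 1, 29, 1, 79, 10), (2, 1, 29, 1, 80, 5), (3, 1, 28, 1, 79, 7);
    INSERT INTO tiers201212 VALUES
        (1, 1, 1, 6), (1, 1, 2, 4), (2, 1, 1, 5), (3, 1, 1, 7);
""")

# First use case: total clicks for partner 1, goal 1 on day 29.
total_clicks = conn.execute(
    "SELECT SUM(count) AS sum FROM ai201212 "
    "WHERE partnerId = ? AND day = ? AND goal = ?",
    (1, 29, 1),
).fetchone()[0]

# Second use case: household counts per product tier for the same filter.
atp_by_tier = conn.execute(
    """
    SELECT ti.productTierId, SUM(ti.tierHH) AS tierHH
    FROM ai201212 ai
    INNER JOIN tiers201212 ti ON ai.aiId = ti.aiId
    WHERE ai.partnerId = ? AND ai.day = ? AND ai.goal = ? AND ti.productId = ?
    GROUP BY ti.productTierId
    ORDER BY ti.productTierId
    """,
    (1, 29, 1, 1),
).fetchall()
```

With these sample rows, the day-28 row is filtered out of both results, which is the behaviour the `day = 29` predicate is meant to give.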
16.
Metadata to Neo4j
Audience Intel
17.
Metadata to Neo4j
18.
Metadata to Neo4j
19.
Metadata to Neo4j
20.
Metadata to Neo4j
21.
Metadata to Neo4j
22.
Metadata to Neo4j
23.
Metadata to Neo4j: Users, Firms & Data Sources
24.
Metadata to Neo4j
25.
Storing & Retrieving Data in Redis
26.
What is Redis?
Redis is an in-memory, advanced key-value store.
- “in-memory”: like a cache (e.g., memcached).
- “advanced”: stores complex data structures.
It has a “hash table” data structure (array storage).
    Key: currentInventoryOfFruit
    Value: ["apples" : 2, "oranges" : 12, "tomatoes" : 6]
    Increment: $redis->hIncrBy('currentInventoryOfFruit', 'kiwi', 2);
    Value: ["apples" : 2, "oranges" : 12, "kiwi" : 2, "tomatoes" : 6]
    Increment: $redis->hIncrBy('currentInventoryOfFruit', 'kiwi', 2);
    Value: ["apples" : 2, "oranges" : 12, "kiwi" : 4, "tomatoes" : 6]
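The increment behaviour shown on this slide (a field that does not exist yet starts from zero) can be mimicked with a plain dictionary. The sketch below is an in-memory stand-in written in Python, not a Redis client call, and the helper name `hincrby` is only borrowed from the command for readability.

```python
def hincrby(store, key, field, amount):
    """Mimic Redis HINCRBY on a dict of dicts: a missing key or field starts at 0."""
    fields = store.setdefault(key, {})
    fields[field] = fields.get(field, 0) + amount
    return fields[field]

store = {"currentInventoryOfFruit": {"apples": 2, "oranges": 12, "tomatoes": 6}}
first = hincrby(store, "currentInventoryOfFruit", "kiwi", 2)   # field created
second = hincrby(store, "currentInventoryOfFruit", "kiwi", 2)  # field incremented
```

Because the write creates the field on first use, a counter pipeline does not need a separate "does this key exist" check before incrementing.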
27.
What is Redis?
One of the unique aspects of Redis in the world of key-value caches is that Redis adds a hash-table data structure.
- strings -- binary safe data
- hashes -- maps between string fields and string values
- lists -- lists of strings, sorted by insertion order
- sets -- unordered collection of unique strings
- sorted sets -- collection of unique strings, ordered by a numeric score
Functionality
- For a list, one can push and pop strings off a list.
- For a hash, one can increment a field, get all fields or get one field.
- For a set, one can sort and perform operations such as intersections and unions.
- For sorted sets, in addition to the things mentioned above, one can use the numeric score to retrieve a subset.
Redis also supports transactions (commands queued after MULTI are executed as one atomic batch by EXEC), and a PUBLISH and SUBSCRIBE feature set for posting to channels and listening for messages posted to channels.
28.
How we store our Data in Redis
Key: STRING
  Verbose key: /ai/partner:1/client:1/campaign:1/time:2012.06.01/offer:1/goal:1/result:1/source:1/creative:1/
  Compact key: /ai/1/1/1/time:2012.06.01/1/1/1/1/1/
  Value: 326
Need a legend: SSET
  Key: /ai/legend:key/partner:1/client:1/campaign:1
  Value: [{"0":"partner"}, {"1":"client"}, {"2":"campaign"}, {"3":"time"}, {"4":"offer"}, {"5":"goal"}, {"6":"resultId"}, {"7":"sourceId"}, {"8":"creativeId"}]
Key: STRING (dog visits)
  Verbose key: /ai/partner:1/client:1/campaign:1/time:2012.06.01/dog:Rusty/breed:23/color:brn/weight:36/height:24/
  Compact key: /ai/1/1/1/2012.06.01/Rusty/23/brn/36/24/
  Value: 3
29.
Legend for /0246/-/c1/ (partner/client/campaign)
[{ 0 : "partner" },
 { 1 : "client" },
 { 2 : "campaign" },
 { 3 : "time" },
 { 4 : "offer" },
 { 5 : "goal" },
 { 6 : "result" },
 { 7 : "source" },
 { 8 : "creative" }]
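The relationship between a compact key and its legend can be sketched as a pair of functions: one joins the values in legend order, the other uses the legend to recover the names. This is a hypothetical Python illustration (`compact_key` and `expand_key` are not names from the deck), and it assumes the time value is stored without a `time:` prefix, as in the dog-visits example.

```python
# Position -> dimension name, in the order shown in the legend.
LEGEND = ["partner", "client", "campaign", "time", "offer",
          "goal", "result", "source", "creative"]

def compact_key(record, legend=LEGEND, prefix="/ai/"):
    """Join the values in legend order; the names live only in the legend."""
    return prefix + "/".join(str(record[name]) for name in legend) + "/"

def expand_key(key, legend=LEGEND, prefix="/ai/"):
    """Use the legend to turn a compact key back into a name -> value mapping."""
    values = key[len(prefix):].strip("/").split("/")
    if len(values) != len(legend):
        raise ValueError("key does not match legend")
    return dict(zip(legend, values))

record = {"partner": 1, "client": 1, "campaign": 1, "time": "2012.06.01",
          "offer": 1, "goal": 1, "result": 1, "source": 1, "creative": 1}
key = compact_key(record)
fields = expand_key(key)
```

Storing the names once per campaign rather than in every key is what keeps the compact form short; the cost is that a key is only interpretable together with its legend.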
30.
Querying the Data in Redis
Created indexes for efficient lookups.
- We are storing hour keys. We created day indexes (e.g., 2012.07.11) for the hour keys.
- We are storing day keys. We created another index (e.g., ‘days’) for the day keys.
Hour keys
  Key: /ai/key/1/1/1/2012.07.11.08/1/1/1/1/1/
  Value: 26
  Index: /ai/idx/1/1/1/2012.07.11/1/1/1/1/1/
  Value (SET):
    ['/ai/key/1/1/1/2012.07.11.08/1/1/1/1/1/',
     '/ai/key/1/1/1/2012.07.11.09/1/1/1/1/1/',
     '/ai/key/1/1/1/2012.07.11.10/1/1/1/1/1/']
31.
Querying the Data in Redis
Day keys
  Key: /ai/key/1/1/1/2012.07.11/1/1/1/1/1/
  Value: 322
  Index: /ai/idx/1/1/1/day/1/1/1/1/1/
  Value (SET):
    ['/ai/key/1/1/1/2012.07.09/1/1/1/1/1/',
     '/ai/key/1/1/1/2012.07.10/1/1/1/1/1/',
     '/ai/key/1/1/1/2012.07.11/1/1/1/1/1/']
Queries are typically for many days and for many dimensions.
- We do an sUnion on the indexes. It performs the union between N sets (e.g., 1,000 sets) and returns an array of keys.
Step one: Does the item exist in an index? Get all of the keys in the indexes.
- We build a list of the indexes that we want to query, and we query the indexes.
  - /ai/idx/1/1/1/day/1/1/1/1/1/
  - /ai/idx/1/1/1/day/2/1/1/1/1/
  - /ai/idx/1/1/1/day/3/1/1/1/1/
- We do an sUnion to get all of the keys in the index sets.
- We can filter the days before we get the keys.
- We do an mGet to get all of the key values, and we sum up all of the values for the total for a given day (or a given hour).
32.
Querying the Data in Redis
Secret sauce for queries
- sUnion on the indexes
- mGet (or getMultiple) on the keys
mGet
- An mGet gets the values of all the specified keys, and returns an array of values. If one or more keys do not exist, the array will contain FALSE at the position of each missing key.
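The union-then-fetch flow can be sketched without a Redis server by modelling index sets and counter strings as dictionaries. The Python below is an in-memory approximation under that assumption: `sunion` and `mget` imitate the semantics described on the slides (missing keys come back as `None` here rather than PHP's `FALSE`), and `key_day` and `total_for_days` are hypothetical helpers.

```python
def sunion(sets, index_names):
    """Union of the named index sets; a missing index counts as an empty set."""
    result = set()
    for name in index_names:
        result |= sets.get(name, set())
    return result

def mget(strings, keys):
    """Values for keys, in order; None marks a key that does not exist."""
    return [strings.get(k) for k in keys]

def key_day(key):
    """Time segment of a key shaped like /ai/key/<partner>/<client>/<campaign>/<time>/..."""
    return key.split("/")[6]

def total_for_days(sets, strings, index_names, days):
    # Filter on the day before fetching, so fewer values are requested.
    keys = sorted(k for k in sunion(sets, index_names) if key_day(k) in days)
    return sum(int(v) for v in mget(strings, keys) if v is not None)

strings = {
    "/ai/key/1/1/1/2012.07.09/1/1/1/1/1/": "100",
    "/ai/key/1/1/1/2012.07.10/1/1/1/1/1/": "200",
    "/ai/key/1/1/1/2012.07.11/1/1/1/1/1/": "322",
}
sets = {
    "/ai/idx/1/1/1/day/1/1/1/1/1/": set(strings),
    # An index entry whose counter key is missing, to exercise the None case.
    "/ai/idx/1/1/1/day/2/1/1/1/1/": {"/ai/key/1/1/1/2012.07.11/2/1/1/1/1/"},
}
total = total_for_days(sets, strings, list(sets), {"2012.07.10", "2012.07.11"})
```

Skipping the missing entries before summing matters because a stale index member would otherwise break the arithmetic.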
33.
Query Params
[{"name":"partner", "value":"0246"},
 {"name":"client", "value":"-"},
 {"name":"campaign", "value":"c1"},
 {"name":"source", "options":[{"value":"1"}, {"value":"2"}]},
 {"name":"time", "options":[{"value":""}, {"value":"2012"},
   {"value":"201202"}, {"value":"20120201"}, {"value":"20120202"}, {"value":"20120203"},
   {"value":"20120204"}, {"value":"20120205"}, {"value":"20120206"}, {"value":"20120207"},
   {"value":"20120208"}, {"value":"20120209"}, {"value":"20120210"}, {"value":"20120211"},
   {"value":"20120212"}, {"value":"20120213"}, {"value":"20120214"}, {"value":"20120215"},
   {"value":"20120216"}, {"value":"20120217"}, {"value":"20120218"}, {"value":"20120219"},
   {"value":"20120220"}, {"value":"20120221"}, {"value":"20120222"}, {"value":"20120223"},
   {"value":"20120224"}, {"value":"20120225"},
   {"value":"201204"}, {"value":"20120402"}, {"value":"20120404"}, {"value":"20120405"},
   {"value":"201206"}, {"value":"20120602"}, {"value":"20120605"}, {"value":"20120621"},
   {"value":"201207"}, {"value":"20120702"}]},
 {"name":"result", "options":[{"value":"0"}, {"value":"1"}]},
 {"name":"goal", "value":"1"},
 {"name":"creative", "value":"1"},
 {"name":"offer", "options":[{"value":"4"}, {"value":"5"}, {"value":"6"}, {"value":"7"}]},
 {"name":"level", "value":"day"}]
34.
Build a list of indexes
  partner = 0246
  campaign = c1
  index = days
  offer = 4, 5, 6, 7
  goal = 1
  result = 0, 1
  source = 1, 2
  creative = 1

  /ai/idx/0246/-/c1/days/4/1/0/1/1/
  /ai/idx/0246/-/c1/days/5/1/0/1/1/
  /ai/idx/0246/-/c1/days/6/1/0/1/1/
  /ai/idx/0246/-/c1/days/7/1/0/1/1/
  /ai/idx/0246/-/c1/days/4/1/1/1/1/
  /ai/idx/0246/-/c1/days/5/1/1/1/1/
  /ai/idx/0246/-/c1/days/6/1/1/1/1/
  /ai/idx/0246/-/c1/days/7/1/1/1/1/
  /ai/idx/0246/-/c1/days/4/1/0/2/1/
  /ai/idx/0246/-/c1/days/5/1/0/2/1/
  /ai/idx/0246/-/c1/days/6/1/0/2/1/
  /ai/idx/0246/-/c1/days/7/1/0/2/1/
  /ai/idx/0246/-/c1/days/4/1/1/2/1/
  /ai/idx/0246/-/c1/days/5/1/1/2/1/
  /ai/idx/0246/-/c1/days/6/1/1/2/1/
  /ai/idx/0246/-/c1/days/7/1/1/2/1/
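The index list is the Cartesian product of the selected values for each dimension. A minimal Python sketch of that expansion follows; `build_index_keys` is a hypothetical name, and the iteration order is chosen only so that the output matches the order shown on the slide (offer varies fastest, then result, then source).

```python
from itertools import product

def build_index_keys(partner, client, campaign, index,
                     offers, goals, results, sources, creatives):
    """One index key per combination of the selected dimension values."""
    keys = []
    # product() varies its last argument fastest, so offers change first.
    for creative, source, result, goal, offer in product(
            creatives, sources, results, goals, offers):
        keys.append(
            f"/ai/idx/{partner}/{client}/{campaign}/{index}/"
            f"{offer}/{goal}/{result}/{source}/{creative}/"
        )
    return keys

index_keys = build_index_keys(
    "0246", "-", "c1", "days",
    offers=[4, 5, 6, 7], goals=[1], results=[0, 1], sources=[1, 2], creatives=[1],
)
```

The number of keys is the product of the list lengths (4 x 1 x 2 x 2 x 1 = 16 here), which is why queries over many values per dimension grow quickly.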
35.
Querying the Data in Redis: One server (2 GB of RAM)
The results with one server
- Less than 100,000 rows: <1 second
- 800,000 rows: about 4 seconds
- 9,000,000 rows: about 45 seconds
- 9 million max rows with one server
Removed the 9-million-row limitation by using a compute cluster
- 10 servers
- Divide and conquer strategy
Three primary lookups:
- Basic counts
- Product counts (percent distribution)
- Distinct values for our form multi-select list.
36.
Quick distinct values for our multi-select lists
Generate form elements:
- We have HTML select lists.
- We need to do a “select distinct(offer)”.
- We write to a ‘distincts’ key so that we do not have to look up the values and calculate the distinct values in code.
Distinct values (for the form): SET
  Key: /ai/distincts:filter/partner:1/client:1/campaign:1/time
  Value: ["2012.06.01", "2012.06.02", "2012.06.03"]
  Key: /ai/distincts:filter/partner:1/client:1/campaign:1/offer
  Value: ["offer1", "offer2", "offer3", "offer4"]
SETs hold unique values, so duplicates are dropped.
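The write-time bookkeeping for distinct values can be sketched with Python sets standing in for Redis SETs. The function name `record_distincts` and the sample records are hypothetical; the key layout follows the slide.

```python
def record_distincts(distincts, scope, record):
    """Add each dimension value of one record to that dimension's set of distinct values."""
    for name, value in record.items():
        distincts.setdefault(f"/ai/distincts:filter/{scope}/{name}", set()).add(value)

distincts = {}
scope = "partner:1/client:1/campaign:1"
for offer in ["offer1", "offer2", "offer1", "offer3"]:
    record_distincts(distincts, scope, {"time": "2012.06.01", "offer": offer})

# Values for the HTML select list; sorted only to give a stable display order.
offer_options = sorted(distincts[f"/ai/distincts:filter/{scope}/offer"])
```

Adding the same value twice is a no-op, so the form can read the set directly instead of scanning the counters.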
37.
Redis in production
- Redis persists the data to disk so that it can be recovered after a system failure (or a reboot).
- Two backup modes: append-only file and snapshots.
- The default is to snapshot your data every N seconds if there are at least M changes.
  - after 900 sec (15 min) if at least 1 key changed
  - after 300 sec (5 min) if at least 10 keys changed
  - after 60 sec if at least 10000 keys changed
- When the append-only file is enabled, the default fsync policy is every second. With fsync set to every second, performance is still very good.
- Supports master-slave replication. We write to the master and read from the slave.
- Redis works best when you re-use open connections.
38.
Building a compute cluster
39.
Building a compute cluster: Outline
- Problem: large memory-hungry queries
- Solution: Shard the query on the biggest dimension
- Employing a message-passing paradigm
- Overview of architecture
- Dealing with long-running processes in PHP
  - pcntl_fork(): creates a child process with a new PID.
  - PHP forker: uses C to encapsulate PHP
- Using ZeroMQ sockets: PUSH-PULL
- Challenges
40.
Problem: large memory-hungry queries
41.
Problem: large memory-hungry queries
  partner = 0246
  campaign = c1
  index = days
  offer = 4, 5, 6, 7
  goal = 1
  result = 0, 1
  source = 1, 2
  creative = 1

  1. /ai/idx/0246/-/c1/days/4/1/0/1/1/
  2. /ai/idx/0246/-/c1/days/5/1/0/1/1/
  3. /ai/idx/0246/-/c1/days/6/1/0/1/1/
  4. /ai/idx/0246/-/c1/days/7/1/0/1/1/
  5. /ai/idx/0246/-/c1/days/4/1/1/1/1/
  6. /ai/idx/0246/-/c1/days/5/1/1/1/1/
  7. /ai/idx/0246/-/c1/days/6/1/1/1/1/
  8. /ai/idx/0246/-/c1/days/7/1/1/1/1/
  9. /ai/idx/0246/-/c1/days/4/1/0/2/1/
  10. /ai/idx/0246/-/c1/days/5/1/0/2/1/
  11. /ai/idx/0246/-/c1/days/6/1/0/2/1/
  12. /ai/idx/0246/-/c1/days/7/1/0/2/1/
  13. /ai/idx/0246/-/c1/days/4/1/1/2/1/
  14. /ai/idx/0246/-/c1/days/5/1/1/2/1/
  15. /ai/idx/0246/-/c1/days/6/1/1/2/1/
  16. /ai/idx/0246/-/c1/days/7/1/1/2/1/
42.
Solution: Shard the query
43.
Solution: Shard the query on the biggest dimension

    /**
     * Finds the position (dimension) that holds the most values.
     * @param array $positionArray Map of position names to arrays of values.
     * @return string|null Key of the largest position, or null if none is an array.
     */
    function getMaxPosition(array $positionArray)
    {
        $max = 0;
        $maxPosition = null; // stays null when no position holds an array
        foreach (array('position4', 'position5', 'position6') as $key) {
            if (isset($positionArray[$key]) && is_array($positionArray[$key])) {
                $count = count($positionArray[$key]);
                // Strict comparison: on a tie the earlier position wins.
                if ($count > $max) {
                    $max = $count;
                    $maxPosition = $key;
                }
            }
        }
        return $maxPosition;
    }

    /**
     * Shards an array into smaller pieces.
     * @param array $positionArray The position that needs to be sharded.
     * @return array
     */
    function shardPosition($positionArray)
    {
        // One value per shard.
        $shardArray = array_chunk($positionArray, 1);
        return $shardArray;
    }
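To show what sharding on the biggest dimension produces for the example query, here is a small Python sketch of the same idea. It is an illustration rather than a port of the deck's PHP: it works on named dimensions instead of `position4` to `position6`, and `biggest_dimension` and `shard_query` are hypothetical names.

```python
def biggest_dimension(params):
    """Name of the multi-valued dimension with the most values, or None."""
    multi = {name: vals for name, vals in params.items() if isinstance(vals, list)}
    if not multi:
        return None
    # max() keeps the first maximal entry, so ties go to the earlier dimension.
    return max(multi, key=lambda name: len(multi[name]))

def shard_query(params):
    """Split params into one sub-query per value of the biggest dimension."""
    name = biggest_dimension(params)
    if name is None:
        return [dict(params)]
    return [{**params, name: [value]} for value in params[name]]

params = {"partner": "0246", "campaign": "c1", "offer": [4, 5, 6, 7],
          "goal": 1, "result": [0, 1], "source": [1, 2], "creative": 1}
shards = shard_query(params)
```

For the 16-index query above this yields four sub-queries of four indexes each, so each worker holds roughly a quarter of the keys in memory.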
44.
Message passing paradigm
45.
Message passing
- Erlang
- Akka for Scala & Java
- Threading and locking
  - Is the code thread safe when updating data?
  - Locking: creates contention for the lock and waiting.
- Message passing to autonomous processes
  - The state of the process can be blocking (synchronous) or non-blocking (asynchronous).
  - In the case of a corrupt state, kill the process and start it again.
  - Fail early; fail often.
  - Auto-restart on failure (or timeout…).
46.
Solution: Build a compute cluster to process the queries using ZeroMQ
47.
What is ZeroMQ
http://zguide.zeromq.org/php:all
- It’s a networking library for message passing.
- It's fast enough to be the fabric for clustered products.
- Its asynchronous I/O model gives you scalable multicore applications, built as asynchronous message-processing tasks.
My ZeroMQ sockets:
- REQ - REP (synchronous/blocking with a timeout)
- REQ - ROUTER (asynchronous)
- PUSH - PULL (fan out, fan in)
- PUB - SUB
48.
Overview of architecture
49.
Overview of architecture
    zeromq.c0.uber = "ash-uhapsyslog01.meshdomain.ixicorp.com"
    zeromq.c0.cbroker = "ash-uhapsyslog01.meshdomain.ixicorp.com"
    zeromq.c0.ping = 5500
    zeromq.c0.frontend = 5550
    zeromq.c0.map = 5600
    zeromq.c0.reduce = 5650
    zeromq.c0.kill = 7800
    zeromq.c0.state = 7500

    zeromq.c1.uber = "ash-uhapsyslog01.meshdomain.ixicorp.com"
    zeromq.c1.cbroker = "ash-uhapsyslog01.meshdomain.ixicorp.com"
    zeromq.c1.ping = 5501
    zeromq.c1.frontend = 5551
    zeromq.c1.map = 5601
    zeromq.c1.reduce = 5651
    zeromq.c1.kill = 7801
    zeromq.c1.state = 7500

Socket pattern per channel:
- ping: REQ - REP
- frontend: REQ - REP
- map: PUSH - PULL
- reduce: PUSH - PULL
- kill: PUB - SUB
- state: REQ - ROUTER
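The map and reduce channels both use a PUSH - PULL pairing: work fans out to workers and partial results fan back in. The sketch below imitates that flow with Python's standard `queue` and `threading` modules as a stand-in. It does not use ZeroMQ sockets, and `worker` and `run_cluster` are hypothetical names, so treat it as a picture of the data flow rather than of the deck's brokers.

```python
import queue
import threading

def worker(tasks, results, counts):
    """Pull shards until a None sentinel arrives; push one partial sum per shard."""
    while True:
        shard = tasks.get()
        if shard is None:
            break
        results.put(sum(counts.get(key, 0) for key in shard))

def run_cluster(shards, counts, n_workers=3):
    tasks, results = queue.Queue(), queue.Queue()
    threads = [threading.Thread(target=worker, args=(tasks, results, counts))
               for _ in range(n_workers)]
    for t in threads:
        t.start()
    for shard in shards:      # map: fan the shards out to the workers
        tasks.put(shard)
    for _ in threads:         # one stop sentinel per worker
        tasks.put(None)
    # reduce: fan the partial sums back in, one per shard
    total = sum(results.get() for _ in shards)
    for t in threads:
        t.join()
    return total

counts = {"a": 1, "b": 2, "c": 3, "d": 4}
total = run_cluster([["a", "b"], ["c"], ["d", "missing"]], counts)
```

The collector only needs to know how many shards were sent, not which worker handled which one, which is the property that lets workers be added or restarted independently.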
50.
Overview of architecture
51.
Overview of architecture
Supervisor (uber) and Cluster brokers (cbroker)
root 58690 0.0 0.5 257900 11100 ? Ssl 06:05 0:11 php /opt/aiclusters/uber.php u0
root 58703 0.1 1.1 336064 22604 ? Ssl 06:05 0:39 php /opt/aiclusters/cbroker.php p0 c0
root 58714 0.0 1.0 335296 21916 ? Ssl 06:05 0:07 php /opt/aiclusters/cbroker.php p0 c1
root 58725 0.0 0.5 325544 11384 ? Ssl 06:05 0:04 php /opt/aiclusters/cbroker.php p0 c2
root 58736 0.0 0.5 325544 11348 ? Ssl 06:05 0:02 php /opt/aiclusters/cbroker.php p0 c3
root 58750 0.0 0.5 325532 11232 ? Ssl 06:05 0:01 php /opt/aiclusters/cbroker.php p0 c4
root 58761 0.0 0.5 325544 11288 ? Ssl 06:05 0:07 php /opt/aiclusters/cbroker.php p0 c5
root 58769 0.0 0.5 325544 11224 ? Ssl 06:05 0:02 php /opt/aiclusters/cbroker.php p0 c6
root 58777 0.0 0.5 325532 11312 ? Ssl 06:05 0:04 php /opt/aiclusters/cbroker.php p0 c7

Worker server 1: Server broker and server workers for each virtual cluster
root 24532 0.0 0.4 284460 9832 ? Ssl 06:07 0:01 php /opt/aiclusters/sbroker.php p0 1
root 24540 0.8 11.9 514392 244012 ? Ssl 06:07 5:42 php /opt/aiclusters/sworker.php c0
root 24546 0.0 5.2 381264 106616 ? Ssl 06:07 0:36 php /opt/aiclusters/sworker.php c1
root 24555 0.0 0.4 284716 9976 ? Ssl 06:07 0:00 php /opt/aiclusters/sworker.php c2
root 24564 0.0 0.4 284716 9816 ? Ssl 06:07 0:00 php /opt/aiclusters/sworker.php c3
root 24573 0.0 0.4 284716 9812 ? Ssl 06:07 0:00 php /opt/aiclusters/sworker.php c4
root 24582 0.0 0.4 284716 9816 ? Ssl 06:07 0:00 php /opt/aiclusters/sworker.php c5
root 24592 0.0 0.4 284716 9808 ? Ssl 06:07 0:00 php /opt/aiclusters/sworker.php c6
root 24601 0.0 0.4 284716 9820 ? Ssl 06:07 0:00 php /opt/aiclusters/sworker.php c7

Worker server 2: Server broker and server workers for each virtual cluster
root 46997 0.0 0.4 284460 9824 ? Ssl 06:07 0:00 php /opt/aiclusters/sbroker.php p0 2
root 47005 0.7 25.5 797800 522688 ? Ssl 06:07 5:04 php /opt/aiclusters/sworker.php c0
root 47011 0.0 5.0 378192 103616 ? Ssl 06:07 0:28 php /opt/aiclusters/sworker.php c1
root 47021 0.0 0.4 284716 10012 ? Ssl 06:07 0:00 php /opt/aiclusters/sworker.php c2
root 47029 0.0 0.4 284716 9812 ? Ssl 06:07 0:00 php /opt/aiclusters/sworker.php c3
root 47041 0.0 0.4 284716 9816 ? Ssl 06:07 0:00 php /opt/aiclusters/sworker.php c4
root 47048 0.0 0.4 284716 9828 ? Ssl 06:07 0:00 php /opt/aiclusters/sworker.php c5
root 47059 0.0 0.4 284716 9820 ? Ssl 06:07 0:00 php /opt/aiclusters/sworker.php c6
root 47068 0.0 0.4 284716 9816 ? Ssl 06:07 0:00 php /opt/aiclusters/sworker.php c7
52.
Creating long running processes with PHP
53.
Long running processes in PHP
pcntl_fork() – creates a child process with a new PID.
system() or passthru()
When you try to run a script in the background, it creates a zombie process.
PHP forker – uses C to encapsulate php-cli
https://code.google.com/p/php-forker/
php_forker daemonizes a php-cli process that runs a console PHP script.
Processes run for weeks and months.
54.
Long running processes
$theScript = '/usr/local/sbin/php-forker ' . $thisDirectory . $script;
if (isset($param1)) {
    $theScript .= ' ' . $param1;
}
if (isset($param2)) {
    $theScript .= ' ' . $param2;
}
$escapedScript = escapeshellcmd($theScript);
$logger->info('Executing: ' . $escapedScript);
// exec() returns the last line of the command's output; php-forker is expected to print 'Ok'.
$result = exec($escapedScript, $output);
if ($result != 'Ok') {
    $theOutput = var_export($output, true);
    $logger->err('php-forker error launching: ' . $theScript . ' Output: ' . $theOutput);
    echo $theOutput . "\n";
}
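The launcher above escapes the whole command string with `escapeshellcmd()`. A common alternative, sketched below, is to quote each argument separately with `escapeshellarg()` so a parameter containing spaces stays a single argument. The `buildForkerCommand()` helper is hypothetical and not part of the original code; the paths and parameters are taken from the process listing earlier in the deck.

```php
<?php
/**
 * Build a php-forker command line, quoting each part separately.
 * Hypothetical helper for illustration only.
 *
 * @param string $forkerPath Path to the php-forker binary.
 * @param string $scriptPath Path to the PHP script to daemonize.
 * @param array  $params     Extra arguments passed to the script.
 * @return string
 */
function buildForkerCommand($forkerPath, $scriptPath, array $params = array())
{
    $parts = array(escapeshellarg($forkerPath), escapeshellarg($scriptPath));
    foreach ($params as $param) {
        $parts[] = escapeshellarg($param);
    }
    return implode(' ', $parts);
}

$command = buildForkerCommand(
    '/usr/local/sbin/php-forker',
    '/opt/aiclusters/cbroker.php',
    array('p0', 'c0')
);
```

The resulting string can be passed to `exec()` in the same way as `$escapedScript` above.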
55.
Supervisor
56.
Supervisor
$connectionsEndpoint = 'tcp://*:' . $connectionsPort;
$monitorEndpoint = 'tcp://*:' . $monitorPort;
$statusEndpoint = 'tcp://*:' . $statePort;
$podControlEndpoint = 'tcp://*:' . $podControlPort;
$podRestartEndpoint = 'tcp://*:' . $podRestartPort;

$statusArray = array();
foreach ($connections as $key => $value) {
    $statusArray[$key] = array('availability' => 'active', 'status' => 'started');
}

$context = new ZMQContext();

// Socket for connections output
$mqConnections = new ZMQSocket($context, ZMQ::SOCKET_REP);
$mqConnections->bind($connectionsEndpoint);

// This socket receives 'start' and 'kill' messages from pod control.
$restart = new ZMQSocket($context, ZMQ::SOCKET_ROUTER);
$restart->bind($podRestartEndpoint);

// Socket for status messages.
$status = new ZMQSocket($context, ZMQ::SOCKET_ROUTER);
$status->bind($statusEndpoint);

// This socket publishes 'start' and 'kill' messages to node brokers.
$podControl = new ZMQSocket($context, ZMQ::SOCKET_PUB);
$podControl->bind($podControlEndpoint);
57.
Supervisor
$poll = new ZMQPoll();
$poll->add($mqConnections, ZMQ::POLL_IN);
$poll->add($status, ZMQ::POLL_IN);
$poll->add($restart, ZMQ::POLL_IN);
$read = $write = array();
while (true) {
    // One second timeout
    $events = $poll->poll($read, $write, 1000);
    if ($events) {
        foreach ($read as $socket) {
            if ($socket === $status) {
                $zmsg = new Zmsg($status);
                $zmsg->recv();
                $message = $zmsg->body();
                // array('cluster' => $clusterId, 'status' => $message)
                $statusMessage = Zend_Json::decode($message, true);
                $statusArray = updateStatus($statusArray, $statusMessage);
                $logger->info("Received status message: $message");
                // Publishing status message ($monitor is created outside this excerpt)
                $monitor->send($message);
                //printf ("Received status message: %s %s", $message, PHP_EOL);
            } elseif ($socket === $mqConnections) {
                $message = $socket->recv();
                $connectionsArray = getConnections($connections, $statusArray);
                $jsonConnections = Zend_Json::encode($connectionsArray);
                $mqConnections->send($jsonConnections);
                $connectionsCount = count($connectionsArray);
                $logger->info("Received $message, Sent $connectionsCount zeroMQ connections");
            } elseif ($socket === $restart) {
58.
Supervisor: Publish to Server Brokers
/**
 * Publish a 'start' or 'kill' message to the node brokers.
 * @param ZMQSocket $podControl
 * @param string $clusterId
 * @param string $action 'kill' or 'start'
 * @param Zend_Log $logger
 */
function pubMessageToNodeBrokers($podControl, $clusterId, $action, $logger)
{
    // Publish a message to sbroker on the pod control port.
    $controlArray = array('cluster' => $clusterId, 'action' => $action);
    $json = Zend_Json::encode($controlArray);
    //$thePayload = '{"cluster":"' . $clusterId . '","action":"' . $action . '"}';
    $podControl->send($json);
    $message = "On the pod control port, the supervisor published to $clusterId the following message: $action";
    $logger->info($message);
}
59.
Cluster Broker
60.
Cluster broker
// Ping
$context = new ZMQContext();
$ping = $context->getSocket(ZMQ::SOCKET_REP);
$pingEndpoint = 'tcp://*:' . $connections['ping'];
$ping->bind($pingEndpoint);

// Receives the query from the frontend API.
$frontend = $context->getSocket(ZMQ::SOCKET_REP);
$frontendEndpoint = 'tcp://*:' . $connections['frontend'];
$frontend->bind($frontendEndpoint);

// Kill this broker
$controller = $context->getSocket(ZMQ::SOCKET_SUB);
$controlEndpoint = 'tcp://' . $connections['cbroker'] . ':' . $connections['kill'];
$controller->connect($controlEndpoint);
$controller->setSockOpt(ZMQ::SOCKOPT_SUBSCRIBE, "");

// Status messages
$statusEndpoint = 'tcp://' . $connections['uber'] . ':' . $connections['state'];
sendStatus($context, $statusEndpoint, $clusterId, 'waiting');

// Socket for map
$map = new ZMQSocket($context, ZMQ::SOCKET_PUSH);
// Use the $nodeCount
$map->setSockOpt(ZMQ::SOCKOPT_HWM, $nodeCount);
$mapEndpoint = 'tcp://*:' . $connections['map'];
$map->bind($mapEndpoint);

// Socket for reduce
$reduce = new ZMQSocket($context, ZMQ::SOCKET_PULL);
$reduceEndpoint = 'tcp://*:' . $connections['reduce'];
$reduce->bind($reduceEndpoint);
61.
Cluster broker
$read = $write = array();
while (true) {
    $poll = new ZMQPoll();
    $poll->add($ping, ZMQ::POLL_IN);
    $poll->add($frontend, ZMQ::POLL_IN);
    $poll->add($controller, ZMQ::POLL_IN);
    $poll->add($reduce, ZMQ::POLL_IN);
    $events = $poll->poll($read, $write, 1000); // 1 second interval
    if ($events > 0) {
        foreach ($read as $socket) {
            if ($socket === $ping) {
                $msg = $ping->recv();
                $logger->info($clusterId . ': Sending pong');
                $ping->send('pong');
            } elseif ($socket === $frontend) {
62.
Cluster broker
            } elseif ($socket === $frontend) {
                $shardCount = 0;
                $reduceCount = 0;
                $reduceArray = array();
                $timeArray = array();
                $logger->info($cBrokerId . ': frontend receiving message');
                $payload = $frontend->recv();
                sendStatus($context, $statusEndpoint, $clusterId, 'processing');
                $logger->info($cBrokerId . ': frontend set cluster to processing');
                $paramsArray = Zend_Json::decode($payload, true);

                // get redis from the payload and unset
                $redisConnection = array('redis' => $paramsArray['redis']);
                unset($paramsArray['redis']);

                // Time array. Get rid of time values that are not day values.
                $theTimeArray = array();
                $timeArray = $paramsArray['time'];
                foreach ($timeArray as $timeSlot) {
                    $stringLength = strlen($timeSlot);
                    if ($stringLength == 8) {
                        $theTimeArray[] = $timeSlot;
                    }
                }
                $paramsArray['time'] = $theTimeArray;

                // Shard on the largest array.
                $maxPosition = getMaxPosition($paramsArray);
                if ($maxPosition == null) {
                    // None of the elements are arrays
                    $logger->info($cBrokerId . ': frontend, none of the search criteria are arrays. Cannot be sharded!');
                    $finalArray = array_merge($paramsArray, $redisConnection);
                    $workJson = Zend_Json::encode($finalArray);
                    $shardCount++;
                    $map->send($workJson);
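The time filter in this branch keeps only values that are exactly eight characters long. The sketch below isolates that step as a function. The helper name and the sample values are hypothetical, and reading the eight-character values as `YYYYMMDD` days is an assumption based on the comment "not day values".

```php
<?php
/**
 * Keep only the time slots that are eight characters long.
 * Hypothetical helper mirroring the strlen() check in the broker loop.
 */
function keepDayValues(array $timeSlots)
{
    $days = array();
    foreach ($timeSlots as $timeSlot) {
        if (strlen($timeSlot) == 8) {
            $days[] = $timeSlot;
        }
    }
    return $days;
}

// Invented sample: a year, a month, and two days.
$filtered = keepDayValues(array('2014', '201401', '20140115', '20140116'));
```

The check is purely on length, so it does not validate that a value is a real calendar date.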
63.
Cluster broker
                } else {
                    $maxPositionArray = $paramsArray[$maxPosition];
                    $shardedArray = shardPosition($maxPositionArray);
                    foreach ($shardedArray as $shard) {
                        $workerArray = array();
                        foreach ($paramsArray as $key => $value) {
                            if ($key != $maxPosition) {
                                $workerArray[$key] = $value;
                            } else {
                                $workerArray[$key] = $shard;
                            }
                        }
                        $finalArray = array_merge($workerArray, $redisConnection);
                        $workJson = Zend_Json::encode($finalArray);
                        $shardCount++;
                        $logger->info($cBrokerId . ': Shard count is ' . $shardCount);
                        $map->send($workJson);
                    }
                }
                // unset variables
                unset($paramsArray);
                unset($theTimeArray);
                unset($shardedArray);
                unset($finalArray);
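To make the fan-out concrete, the sketch below builds the JSON payloads that would be pushed to the workers for a small query. It is a simplified stand-in for the loop above: `buildShardPayloads()` is a hypothetical name, PHP's built-in `json_encode()` replaces `Zend_Json::encode()`, and the query values and redis setting are invented.

```php
<?php
/**
 * Build one JSON work item per element of the sharded dimension.
 * Every other search criterion is copied unchanged into each work item.
 */
function buildShardPayloads(array $params, $shardKey, array $extra)
{
    $payloads = array();
    foreach (array_chunk($params[$shardKey], 1) as $shard) {
        $work = $params;
        $work[$shardKey] = $shard;          // replace the big dimension with one shard
        $payloads[] = json_encode(array_merge($work, $extra));
    }
    return $payloads;
}

// Invented query with two values in the largest dimension.
$params = array(
    'position4' => array('A', 'B'),
    'time'      => array('20140115'),
);
$payloads = buildShardPayloads($params, 'position4', array('redis' => 'r1'));
```

The number of payloads is the shard count that the reduce branch later waits for before replying to the front end.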
64.
Cluster broker
            } elseif ($socket === $reduce) {
                $jsonResult = $reduce->recv();
                $result = Zend_Json::decode($jsonResult, true);
                $reduceCount++;
                $message = $cBrokerId . ': Received reduce. Reduce count ' . $reduceCount . '; shard count ' . $shardCount;
                $logger->info($message);
                if ($reduceCount < $shardCount) {
                    $reduceArray[] = $result;
                } else {
                    $reduceArray[] = $result;
                    // Reduce it
                    $message = $cBrokerId . ': Sending reduce array to reduce action';
                    $logger->info($message);
                    $sendingResults = array();
                    if ($type == 'normal') {
                        $sendingResults = reduceAction($reduceArray);
                    } elseif ($type == 'distincts') {
                        $sendingResults = reduceDistincts($reduceArray);
                    } else {
                        $sendingResults = reduceProduct($reduceArray);
                    }
                    $jsonofied = Zend_Json::encode($sendingResults);
                    unset($sendingResults);
                    $frontend->send($jsonofied);
                    sendStatus($context, $statusEndpoint, $clusterId, 'waiting');
                    gc_collect_cycles();
                }
            } elseif ($socket === $controller) {
65.
Cluster broker
function reduceAction(array $reduceArray)
{
    $timeArray = array();
    foreach ($reduceArray as $valuesArray) {
        if (!empty($valuesArray)) {
            foreach ($valuesArray as $key => $value) {
                if (!empty($value)) {
                    if (isset($timeArray[$key])) {
                        $timeArray[$key] = $timeArray[$key] + $value;
                    } else {
                        $timeArray[$key] = $value;
                    }
                }
            }
        }
    }
    if (empty($timeArray)) {
        return 'No data found!';
    }
    unset($valuesArray);
    return $timeArray;
}
66.
Front-end API
67.
Cluster broker
function _sendPayload($ctx, $endpoint, array $request)
{
    $client = $ctx->getSocket(ZMQ::SOCKET_REQ);
    //$logger->info('sendPayload called: ' . $endpoint);
    $client->connect($endpoint);
    $json = Zend_Json::encode($request);
    $client->send($json);

    $poll = new ZMQPoll();
    $poll->add($client, ZMQ::POLL_IN);
    $readable = $writable = array();
    $timeout = 180000; // Three minutes in milliseconds
    $events = $poll->poll($readable, $writable, $timeout);
    //$logger->info('events poll is finished.');
    $response = null;
    if ($events) {
        //$logger->info('There is an event.');
        foreach ($readable as $sock) {
            if ($sock == $client) {
                $response = $client->recv();
            } else {
                $response = null;
            }
        }
    }
68.
Challenges
69.
Challenges
Selling message passing as an alternative to threading.
We were having a memory problem. It was not CPU bound.
Learning zeroMQ
Worrying about making mistakes. Do I have the right model for the task?
Getting the application into production
With two pods
Getting dev and test clusters.
Selling the AI application.
70.
Questions? Comments? Observations?
J. David Mitchell
LinkedIn: david@dmitchell.biz
Twitter: pingdavid