October 13-16, 2016 • Austin, TX
Event Processing and Data Analytics with Lucidworks Fusion
Kiran Chitturi
Software Engineer, Lucidworks
• How to capture/record user events?
• How to use events/signals for recommendations?
• How to produce reports/analytics from user events?
• What types of recommendations can be generated for different user types?
Problem Statement
• Library to collect user events from client-side tier of websites and apps (https://github.com/snowplow/snowplow-javascript-tracker)
• Open source equivalent for enterprise analytics
• Sends events using tracking pixel
• Signals API acts as a collector for Snowplow events
• Tracks page views, page pings, links and any custom configured events
• https://github.com/snowplow/snowplow/wiki/javascript-tracker
Event collection - Snowplow JS tracker
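A tracking pixel is an HTTP GET whose query string carries the event. As a rough sketch of what such a request could look like, the snippet below builds a pixel URL for a page view. The collector host is hypothetical, and the `/i` path and the `e`, `url`, `page` and `p` parameter names are assumed here to follow the Snowplow tracker protocol, so check them against the tracker documentation before relying on them.

```python
from urllib.parse import urlencode, urlparse, parse_qs

# Hypothetical collector endpoint; in Fusion the Signals API plays this role.
COLLECTOR = "http://collector.example.com/i"


def pixel_url(event, page_url, page_title, platform="web"):
    """Build the GET URL a tracking pixel would request for one event."""
    params = {
        "e": event,          # event type, e.g. "pv" for a page view (assumed name)
        "url": page_url,     # page the event happened on
        "page": page_title,  # human-readable page title
        "p": platform,       # platform the event came from
    }
    return COLLECTOR + "?" + urlencode(params)


url = pixel_url(
    "pv",
    "http://www.ecommerce.com/abws-mcl008-080201",
    "Dark Gray Wool Suit",
)
print(url)
# Decoding the query string gives back the event fields.
print(parse_qs(urlparse(url).query))
```

Because everything travels in the query string, the tracker needs no response body from the collector, which is why a 1x1 image request is enough.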
• Examples:
• page-view, query, search-click, add-to-cart, rating
• Signals Schema:
• required fields: type
• additional properties can be specified in ‘params’ map
• Special treatment for fields ‘docId’, ‘userId’, ‘query’, ‘filterQueries’, ‘collection’, ‘weight’, ‘count’
• Processing logic in ‘_signals_ingest’ pipeline
Event collection - JSON payloads
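The schema above can be sketched as a small helper that builds a signal payload. This is only an illustration: `make_signal` is a hypothetical name, and wrapping the signals in a JSON array for the request body is an assumption, not something these slides state.

```python
import json
from datetime import datetime, timezone


def make_signal(signal_type, params=None, timestamp=None):
    """Return a signal dict; 'type' is the only required field."""
    if not signal_type:
        raise ValueError("a signal needs a non-empty 'type'")
    signal = {"type": signal_type, "params": dict(params or {})}
    if timestamp is not None:
        # ISO 8601 in UTC with millisecond precision, like the slide examples.
        ts = timestamp.astimezone(timezone.utc)
        millis = f"{ts.microsecond // 1000:03d}"
        signal["timestamp"] = ts.strftime("%Y-%m-%dT%H:%M:%S.") + millis + "Z"
    return signal


click = make_signal(
    "click",
    {"query": "Madden 12", "docId": "2375201", "userId": "abc121"},
    datetime(2015, 10, 13, 18, 33, 4, 12000, tzinfo=timezone.utc),
)

# Assumption: signals are batched into a JSON array before being sent.
body = json.dumps([click], indent=2)
print(body)
```

Fields such as `docId`, `userId` and `query` go inside `params`; the special treatment they receive happens later, on the server side.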
[Diagram: JSON payloads and Snowplow payloads are sent to the Signals Service, which writes to Solr. Collections shown: primary collection (test), raw signals collection (test_signals), aggregated signals collection (test_signals_aggr).]
Signals - data flow
Example: page-view signal
{
"timestamp": "2015-09-14T10:12:13.456Z",
"type": "pv",
"params": {
"url": "http://www.ecommerce.com/abws-mcl008-080201"
}
}
{
"type_s": "pv",
"flag_s": "event",
"params.url_s": "http://www.ecommerce.com/abws-mcl008-080201",
"id": "62a26152-7971-406e-bf06-3df44974c220",
"timestamp_tdt": "2015-09-14T10:12:13.45Z",
"count_i": 1,
"_version_": 1515057367743463400
}
Input signal (first listing) and indexed signal document (second listing)
Example: page-view signal
{
"timestamp": "2015-09-14T10:12:13.456Z",
"type": "pv",
"params": {
"page": "Dark Gray Wool Suit",
"url": "http://www.ecommerce.com/abws-mcl008-080201",
"userId": "12891291",
"useragent_type_name_s": "Browser",
"ipAddr": "64.134.151.1",
"tz": "America/NewYork"
}
}
{
"type_s": "pv",
"params.tz_s": "America/NewYork",
"user_id_s": "12891291",
"params.page_s": "Dark Gray Wool Suit",
"tz_timestamp_txt": [
"Mon 2015-09-14 10:12:13.456 UTC"
],
"flag_s": "event",
"params.ipAddr_s": "64.134.151.1",
"params.url_s": "http://www.ecommerce.com/abws-mcl008-080201",
"id": "4b993f85-67d3-4523-b2b3-cf4e3ff2f202",
"timestamp_tdt": "2015-09-14T10:12:13.45Z",
"count_i": 1,
"_version_": 1515057643959353300
}
Input signal (first listing) and indexed signal document (second listing)
Example: click signal
{
"type": "click",
"params": {
"query": "Madden 12",
"docId": "2375201",
"userId": "abc121",
"position" : "4",
"filterQueries": [
"cat00000",
"abcat0700000",
"abcat0703000",
"abcat0703002",
"abcat0703008"
]
}
}
{
"filters_orig_ss":[
"abcat0700000",
"abcat0703000",
"abcat0703002",
"abcat0703008",
"cat00000"
],
"user_id_s":"abc121",
"query_s":"madden 12",
"type_s":"click",
"params.position_s" : "4",
"query_t": "madden 12",
"doc_id_s":"2375201",
"tz_timestamp_txt":["Tue 2015-10-13 18:33:04.012 UTC"],
"filters_s":"abcat0700000 $ abcat0703000 $ abcat0703002 $ abcat0703008 $ cat00000",
"flag_s":"event",
"query_orig_s":"Madden 12",
"id":"69c609f6-a2c1-4f89-990e-88a63e68063d",
"timestamp_tdt":"2015-10-13T18:33:04.01Z",
"count_i":1,
"_version_":1514941903557099520
}
Input signal (first listing) and indexed signal document (second listing)
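The field mapping visible in the examples can be approximated in a few lines. This is a sketch inferred only from the input and indexed documents shown on the slides, not the actual logic of the ‘_signals_ingest’ pipeline; `to_indexed_doc` is a hypothetical name, and generated fields such as `id`, `timestamp_tdt` and `_version_` are left out.

```python
def to_indexed_doc(signal):
    """Approximate the input -> indexed field mapping seen in the examples."""
    params = dict(signal.get("params", {}))
    doc = {"type_s": signal["type"], "flag_s": "event", "count_i": 1}

    # Fields with special treatment get dedicated names.
    if "docId" in params:
        doc["doc_id_s"] = params.pop("docId")
    if "userId" in params:
        doc["user_id_s"] = params.pop("userId")
    if "query" in params:
        original = params.pop("query")
        doc["query_orig_s"] = original          # as typed by the user
        doc["query_s"] = original.lower()       # normalised for grouping
        doc["query_t"] = original.lower()
    if "filterQueries" in params:
        filters = sorted(params.pop("filterQueries"))
        doc["filters_orig_ss"] = filters
        doc["filters_s"] = " $ ".join(filters)

    # Everything else is kept as a prefixed string field.
    for key, value in params.items():
        doc[f"params.{key}_s"] = value
    return doc


click = {
    "type": "click",
    "params": {
        "query": "Madden 12",
        "docId": "2375201",
        "userId": "abc121",
        "position": "4",
        "filterQueries": ["cat00000", "abcat0700000", "abcat0703000",
                          "abcat0703002", "abcat0703008"],
    },
}
doc = to_indexed_doc(click)
print(doc)
```

Lowercasing the query and sorting the filters means that clicks which differ only in case or filter order end up with the same grouping key at aggregation time.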
• Batch processing using Apache Spark
• spark-solr library (https://github.com/LucidWorks/spark-solr)
• Types
• Simple
• Click
• EventMiner
Aggregations
Aggregations - data flow
[Diagram: an aggregation job is handed to the Aggregator and run on Spark (Spark Agent, Spark Driver, Cluster Mgr., Workers). Spark fetches raw signals for processing from the raw signals collection (test_signals) and stores aggregated results in the aggregated signals collection (test_signals_aggr). The primary collection (test) is also shown.]
• Simple aggregations
• Top queries
• Top clicked documents
• Most popular categories
• …
• Complex aggregations
• Click stream aggregations with decaying weights
• Generate a co-occurrence matrix for (user, docId, query) tuples
Aggregation examples
Example: simple aggregation
[
{
"type": "rating",
"params": {
"rating": "5.0",
"source": "web"
}
},
{
"type": "rating",
"params": {
"rating": "1.0",
"source": "web"
}
},
{
"type": "rating",
"params": {
"rating": "2.0",
"source": "web"
}
},
{
"type": "rating",
"params": {
"rating": "2.0",
"source": "web"
}
},
{
"type": "rating",
"params": {
"rating": "1.0",
"source": "web"
}
}
]
[Diagram: the rating signals are sent through the API to the Signals Service and indexed in Solr. Collections shown: primary collection (test), raw signals collection (test_signals), aggregated signals collection (test_signals_aggr).]
Example: simple aggregation (continued)
[Diagram: the aggregation job is submitted, manually or via the scheduler, to the Aggregation Service. Spark fetches raw signals for processing from the raw signals collection (test_signals) and stores aggregated results in the aggregated signals collection (test_signals_aggr) in Solr. The primary collection (test) is also shown.]
{
"id" : "test_simple_aggr",
"signalTypes" : [ "rating" ],
"selectQuery" : "*:*",
"aggregator" : "simple",
"groupingFields" : "params.source_s",
"aggregates" : [ {
"type" : "stddev",
"sourceFields" : [ "params.rating_s" ],
"targetField" : "stddev_rating_d"
},
{
"type": "topk",
"sourceFields": ["params.rating_s"],
"targetField": "topk_rating_ss"
},
{
"type": "mean",
"sourceFields": ["params.rating_s"],
"targetField": "mean_position_d"
}
]
}
Aggregation definition (job submission)
Example: simple aggregation (continued)
• Aggregated document:
{
"aggr_job_id_s": "b91ffdebc44d4e128a8431c2f8a3deb7",
"aggr_type_s": "simple@doc_id_s-query_s-filters_s",
"flag_s": "aggr",
"type_s": "rating",
"id": "24494dba-93a6-4fc5-bb4d-5b546c3c0c5e",
"aggr_id_s": "test_simple_aggr",
"timestamp_tdt": "2015-10-15T02:26:17.337Z",
"count_i": 5,
"grouping_key_s": "web",
"stddev_rating_d": 1.6431676725154982,
"mean_position_d": 2.2,
"values.topk_rating_ss": ["2.0", "1.0", "5.0"],
"counts.topk_rating_ss": ["2", "2", "1"],
"errors.topk_rating_ss": ["0", "0", "0"]
}
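To make the numbers in the aggregated document easier to follow, here is a minimal sketch that recomputes them from the five rating signals. It runs in plain Python rather than Spark, and `simple_aggregate` is a hypothetical helper, not part of Fusion. The sample standard deviation (dividing by n - 1) is used because it is the variant that matches the `stddev_rating_d` value on the slide; the order of tied entries in the top-k list may differ from the slide.

```python
from collections import Counter, defaultdict
from statistics import mean, stdev

signals = [
    {"type": "rating", "params": {"rating": "5.0", "source": "web"}},
    {"type": "rating", "params": {"rating": "1.0", "source": "web"}},
    {"type": "rating", "params": {"rating": "2.0", "source": "web"}},
    {"type": "rating", "params": {"rating": "2.0", "source": "web"}},
    {"type": "rating", "params": {"rating": "1.0", "source": "web"}},
]


def simple_aggregate(signals, group_field, value_field):
    """Group signals by one params field and aggregate another."""
    groups = defaultdict(list)
    for s in signals:
        groups[s["params"][group_field]].append(s["params"][value_field])

    results = {}
    for key, values in groups.items():
        numbers = [float(v) for v in values]
        results[key] = {
            "count_i": len(values),
            "mean": mean(numbers),
            # Sample standard deviation; needs at least two values per group.
            "stddev": stdev(numbers),
            # (value, count) pairs, most frequent first.
            "topk": Counter(values).most_common(),
        }
    return results


web = simple_aggregate(signals, "source", "rating")["web"]
print(web)
```

Grouping by `source` yields a single group, "web", which corresponds to `grouping_key_s` in the aggregated document.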
Example: Click aggregation
[
{
"timestamp": "2014-09-01T23:44:52.533Z",
"params": {
"query": "Sharp",
"docId": "2009324"
},
"type": "click"
},
{
"timestamp": "2014-09-05T12:25:37.420Z",
"params": {
"query": "Sharp",
"docId": "2009324"
},
"type": "click"
},
{
"timestamp": "2014-08-24T12:56:58.910Z",
"params": {
"query": "Sharp TV",
"docId": "1517163"
},
"type": "click"
},
{
"timestamp": "2015-10-25T07:18:14.722Z",
"params": {
"query": "rca",
"docId": "2877125"
},
"type": "click"
}
]
Signals indexed and aggregated
{
"doc_id_s": "1517163",
"query_s": "sharp tv",
"weight_d": 0.000006602878329431405,
"count_i": 1
},
{
"doc_id_s": "2009324",
"query_s": "sharp",
"weight_d": 0.000016734602468204685,
"count_i": 2
},
{
"doc_id_s": "2877125",
"query_s": "rca",
"weight_d": 0.06324164569377899,
"count_i": 1
}
Raw docs (first listing) and aggregated docs (second listing)
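The idea of decaying weights can be sketched as follows. The slides do not state which decay function or time constant is used, so the exponential half-life below, the 30-day default and the reference time `now` are all assumptions; the sketch only aims to reproduce the qualitative behaviour, where recent clicks outweigh old ones.

```python
from collections import defaultdict
from datetime import datetime, timezone


def parse_ts(text):
    """Parse the ISO 8601 UTC timestamps used in the signal examples."""
    parsed = datetime.strptime(text, "%Y-%m-%dT%H:%M:%S.%fZ")
    return parsed.replace(tzinfo=timezone.utc)


def aggregate_clicks(signals, now, half_life_days=30.0):
    """Group clicks by (docId, lowercased query); older clicks weigh less."""
    totals = defaultdict(lambda: {"count_i": 0, "weight_d": 0.0})
    for s in signals:
        key = (s["params"]["docId"], s["params"]["query"].lower())
        age_days = (now - parse_ts(s["timestamp"])).total_seconds() / 86400.0
        totals[key]["count_i"] += 1
        # Each click contributes 1.0 when fresh and halves every half-life.
        totals[key]["weight_d"] += 0.5 ** (age_days / half_life_days)
    return dict(totals)


clicks = [
    {"timestamp": "2014-09-01T23:44:52.533Z", "type": "click",
     "params": {"query": "Sharp", "docId": "2009324"}},
    {"timestamp": "2014-09-05T12:25:37.420Z", "type": "click",
     "params": {"query": "Sharp", "docId": "2009324"}},
    {"timestamp": "2014-08-24T12:56:58.910Z", "type": "click",
     "params": {"query": "Sharp TV", "docId": "1517163"}},
    {"timestamp": "2015-10-25T07:18:14.722Z", "type": "click",
     "params": {"query": "rca", "docId": "2877125"}},
]

# Hypothetical reference time for the aggregation run.
now = datetime(2015, 11, 1, tzinfo=timezone.utc)
aggregated = aggregate_clicks(clicks, now)
for key, value in aggregated.items():
    print(key, value)
```

As in the aggregated docs on the slide, the single recent "rca" click ends up with a far larger weight than the two year-old "sharp" clicks combined.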
• How to mix signals with search results?
• Recommendation API
• Generic query pipeline configuration using a three-stage approach
• Sub-query
• Rollup-results
• Advanced-boost
Driving search relevancy
19
Boosting search results using aggregated documents
[Diagram: the user's app sends a search query through the query-pipeline stages (Recommendation Stages, Set Params, Query Solr) and receives the search response. Collections shown: primary collection (test), raw signals collection (test_signals), aggregated signals collection (test_signals_aggr). The recommendation stages do the following:]
1. Query aggregated documents
2. Process results
3. Add parameters to the request
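The three recommendation steps can be sketched roughly as below. This is not how the Fusion stages are implemented: `boost_params` is a hypothetical helper, the aggregated documents are an in-memory list instead of a Solr query, the primary collection's key field is assumed to be `id`, and Solr's `bq` (boost query) parameter is used as one possible way to apply the weights.

```python
def boost_params(aggregated_docs, user_query, id_field="id", top_n=10):
    """Turn aggregated click docs for a query into Solr boost clauses."""
    # 1. Query aggregated documents (here: filter an in-memory list).
    matches = [d for d in aggregated_docs
               if d["query_s"] == user_query.lower()]

    # 2. Process results: keep the highest-weighted documents.
    matches.sort(key=lambda d: d["weight_d"], reverse=True)

    # 3. Add parameters to the request.
    clauses = []
    for d in matches[:top_n]:
        doc_id = d["doc_id_s"]
        weight = d["weight_d"]
        clauses.append(f'{id_field}:"{doc_id}"^{weight:.6f}')
    return {"q": user_query, "bq": clauses}


aggregated_docs = [
    {"doc_id_s": "1517163", "query_s": "sharp tv",
     "weight_d": 0.000006602878329431405, "count_i": 1},
    {"doc_id_s": "2009324", "query_s": "sharp",
     "weight_d": 0.000016734602468204685, "count_i": 2},
    {"doc_id_s": "2877125", "query_s": "rca",
     "weight_d": 0.06324164569377899, "count_i": 1},
]

params = boost_params(aggregated_docs, "RCA")
print(params)
```

The main search still runs against the primary collection; the boost clauses only nudge previously clicked documents upwards for the same query.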
• Calculate co-occurrence matrix for tuples based on sessions
• Example: (userId, query, docId)
• Construct DAG from matrix data
• Recommendations are powered from the graph at query time
• Increases diversity in recommendations
• See https://lucidworks.com/blog/2015/08/31/mining-events-recommendations/
Event Miner aggregation
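A session-based co-occurrence count can be sketched in a few lines. This is a simplified illustration, not the EventMiner algorithm: sessions are assumed to be already built (the session ids and items below are made up), items are tagged strings such as "query:..." or "doc:...", and the result is a plain undirected pair count rather than the DAG mentioned above.

```python
from collections import Counter
from itertools import combinations


def cooccurrence(sessions):
    """Count how often two items appear in the same session."""
    counts = Counter()
    for events in sessions.values():
        items = sorted(set(events))          # de-duplicate within a session
        for a, b in combinations(items, 2):  # every unordered pair once
            counts[(a, b)] += 1
    return counts


def neighbours(counts, item):
    """Items that co-occur with `item`, most frequent first."""
    linked = Counter()
    for (a, b), n in counts.items():
        if a == item:
            linked[b] += n
        elif b == item:
            linked[a] += n
    return linked.most_common()


# Hypothetical sessions: session id -> items seen in that session.
sessions = {
    "s1": ["user:u1", "query:sharp tv", "doc:1517163"],
    "s2": ["user:u2", "query:sharp tv", "doc:1517163", "doc:2009324"],
}

counts = cooccurrence(sessions)
print(neighbours(counts, "query:sharp tv"))
```

Walking from a query to its neighbouring documents, and from those documents to other queries or users, is one way such a structure can surface items that plain click counts for the query would miss.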
Graph Navigation - Example Query
Demo
Using Signals = Modifying Your Behavior in Response to Your Environment
Events & Signals
Events Processing and Data Analysis with Lucidworks Fusion: Presented by Kiran Chitturi, Lucidworks

Sample pptx for embedding into website for demoHarshalMandlekar2
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebUiPathCommunity
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii SoldatenkoFwdays
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfAlex Barbosa Coqueiro
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxLoriGlavin3
 
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...AliaaTarek5
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 3652toLead Limited
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Commit University
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxLoriGlavin3
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brandgvaughan
 
Time Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsNathaniel Shimoni
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxLoriGlavin3
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfMounikaPolabathina
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr BaganFwdays
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024Stephanie Beckett
 

Recently uploaded (20)

Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache Maven
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine Tuning
 
Sample pptx for embedding into website for demo
Sample pptx for embedding into website for demoSample pptx for embedding into website for demo
Sample pptx for embedding into website for demo
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
 
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptx
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 
Time Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directions
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdf
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024
 

Events Processing and Data Analysis with Lucidworks Fusion: Presented by Kiran Chitturi, Lucidworks

  • 1. October 13-16, 2016 • Austin, TX
  • 2. Event Processing and Data Analytics with Lucidworks Fusion Kiran Chitturi Software Engineer, Lucidworks
  • 3. Problem Statement
      • How to capture/record user events?
      • How to use events/signals for recommendations?
      • How to produce reports/analytics from user events?
      • What type of recommendations can be generated for different user types?
  • 4. Event collection - Snowplow JS tracker
      • Library to collect user events from the client-side tier of websites and apps (https://github.com/snowplow/snowplow-javascript-tracker)
      • Open source equivalent for enterprise analytics
      • Sends events using a tracking pixel
      • Signals API acts as a collector for Snowplow events
      • Tracks page views, page pings, links and any custom configured events
      • https://github.com/snowplow/snowplow/wiki/javascript-tracker
  • 6. Event collection - JSON payloads
      • Examples: page-view, query, search-click, add-to-cart, rating
      • Signals schema:
          • Required field: type
          • Additional properties can be specified in the ‘params’ map
          • Special treatment for the fields ‘docId’, ‘userId’, ‘query’, ‘filterQueries’, ‘collection’, ‘weight’, ‘count’
      • Processing logic lives in the ‘_signals_ingest’ pipeline
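The schema on this slide can be illustrated with a small payload builder. This is only a sketch: `build_signal` is a hypothetical helper, not part of Fusion, and the only rules it encodes are the ones stated on the slide (a required `type`, everything else under `params`). The timestamp format is copied from the example slides.

```python
import json
from datetime import datetime, timezone


def build_signal(signal_type, params=None, timestamp=None):
    """Build one signal payload: 'type' is required, extras go under 'params'."""
    if not signal_type:
        raise ValueError("a signal must have a non-empty 'type'")
    signal = {"type": signal_type, "params": dict(params or {})}
    if timestamp is not None:
        # ISO-8601 in UTC with millisecond precision, as in the slide examples.
        utc = timestamp.astimezone(timezone.utc)
        signal["timestamp"] = utc.strftime("%Y-%m-%dT%H:%M:%S.%f")[:-3] + "Z"
    return signal


click = build_signal(
    "click",
    {"query": "Madden 12", "docId": "2375201", "userId": "abc121"},
    datetime(2015, 10, 13, 18, 33, 4, 12000, tzinfo=timezone.utc),
)
# A batch of signals serialised as a JSON array.
payload = json.dumps([click])
```

How the payload is sent to the Signals API (endpoint, authentication) depends on the Fusion installation and is deliberately left out.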
  • 8. Example: page-view signal
      Input signal: { "timestamp": "2015-09-14T10:12:13.456Z", "type": "pv", "params": { "url": "http://www.ecommerce.com/abws-mcl008-080201" } }
      Indexed signal document: { "type_s": "pv", "flag_s": "event", "params.url_s": "http://www.ecommerce.com/abws-mcl008-080201", "id": "62a26152-7971-406e-bf06-3df44974c220", "timestamp_tdt": "2015-09-14T10:12:13.45Z", "count_i": 1, "_version_": 1515057367743463400 }
  • 9. Example: page-view signal
      Input signal: { "timestamp": "2015-09-14T10:12:13.456Z", "type": "pv", "params": { "page": "Dark Gray Wool Suit", "url": "http://www.ecommerce.com/abws-mcl008-080201", "userId": "12891291", "useragent_type_name_s": "Browser", "ipAddr": "64.134.151.1", "tz": "America/NewYork" } }
      Indexed signal document: { "type_s": "pv", "params.tz_s": "America/NewYork", "user_id_s": "12891291", "params.page_s": "Dark Gray Wool Suit", "tz_timestamp_txt": [ "Mon 2015-09-14 10:12:13.456 UTC" ], "flag_s": "event", "params.ipAddr_s": "64.134.151.1", "params.url_s": "http://www.ecommerce.com/abws-mcl008-080201", "id": "4b993f85-67d3-4523-b2b3-cf4e3ff2f202", "timestamp_tdt": "2015-09-14T10:12:13.45Z", "count_i": 1, "_version_": 1515057643959353300 }
  • 10. Example: click signal
      Input signal: { "type": "click", "params": { "query": "Madden 12", "docId": "2375201", "userId": "abc121", "position": "4", "filterQueries": [ "cat00000", "abcat0700000", "abcat0703000", "abcat0703002", "abcat0703008" ] } }
      Indexed signal document: { "filters_orig_ss": [ "abcat0700000", "abcat0703000", "abcat0703002", "abcat0703008", "cat00000" ], "user_id_s": "abc121", "query_s": "madden 12", "type_s": "click", "params.position_s": "4", "query_t": "madden 12", "doc_id_s": "2375201", "tz_timestamp_txt": [ "Tue 2015-10-13 18:33:04.012 UTC" ], "filters_s": "abcat0700000 $ abcat0703000 $ abcat0703002 $ abcat0703008 $ cat00000", "flag_s": "event", "query_orig_s": "Madden 12", "id": "69c609f6-a2c1-4f89-990e-88a63e68063d", "timestamp_tdt": "2015-10-13T18:33:04.01Z", "count_i": 1, "_version_": 1514941903557099520 }
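The click example suggests how the special fields are mapped at index time: the query is lowercased (with the original kept alongside), and the filter queries are sorted and joined with " $ ". The sketch below reproduces only what is visible in that example. `to_indexed_fields` is a hypothetical name, and the real logic lives in the ‘_signals_ingest’ pipeline, which may do considerably more.

```python
def to_indexed_fields(signal):
    """Mimic the field mapping visible in the click example (an inference)."""
    params = signal.get("params", {})
    doc = {"type_s": signal["type"], "flag_s": "event", "count_i": 1}
    if "query" in params:
        doc["query_orig_s"] = params["query"]     # original casing preserved
        doc["query_s"] = params["query"].lower()  # normalised for grouping
    if "docId" in params:
        doc["doc_id_s"] = params["docId"]
    if "userId" in params:
        doc["user_id_s"] = params["userId"]
    if "filterQueries" in params:
        filters = sorted(params["filterQueries"])
        doc["filters_orig_ss"] = filters
        doc["filters_s"] = " $ ".join(filters)
    return doc


indexed = to_indexed_fields({
    "type": "click",
    "params": {
        "query": "Madden 12",
        "docId": "2375201",
        "userId": "abc121",
        "filterQueries": ["cat00000", "abcat0700000", "abcat0703000",
                          "abcat0703002", "abcat0703008"],
    },
})
```

Normalising the query and the filter order means that clicks which differ only in casing or filter ordering fall into the same group when they are aggregated later.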
  • 11. Aggregations
      • Batch processing using Apache Spark
      • spark-solr library (https://github.com/LucidWorks/spark-solr)
      • Types: Simple, Click, EventMiner
  • 12. Aggregations - data flow (diagram): an aggregation job is handed to the aggregator and runs on Spark (Spark agent, Spark driver, cluster manager, workers). Spark fetches raw signals for processing from the raw signals collection (test_signals) and stores aggregated results in the aggregated signals collection (test_signals_aggr); the primary collection is test.
  • 13. Aggregation examples
      • Simple aggregations: top queries, top clicked documents, most popular categories, …
      • Complex aggregations:
          • Click stream aggregations with decaying weights
          • Generate a co-occurrence matrix for (user, docId, query) tuples
  • 14. Example: simple aggregation
      Input signals: { "type": "rating", "params": { "rating": "5.0", "source": "web" } }, { "type": "rating", "params": { "rating": "1.0", "source": "web" } }, { "type": "rating", "params": { "rating": "2.0", "source": "web" } }, { "type": "rating", "params": { "rating": "2.0", "source": "web" } }, { "type": "rating", "params": { "rating": "1.0", "source": "web" } }
      (Diagram: API -> Signals Service -> Solr, with the collections test (primary), test_signals (raw signals) and test_signals_aggr (aggregated signals).)
  • 15. Example: simple aggregation (continued)
      Aggregation definition (job submission): { "id": "test_simple_aggr", "signalTypes": [ "rating" ], "selectQuery": "*:*", "aggregator": "simple", "groupingFields": "params.source_s", "aggregates": [ { "type": "stddev", "sourceFields": [ "params.rating_s" ], "targetField": "stddev_rating_d" }, { "type": "topk", "sourceFields": [ "params.rating_s" ], "targetField": "topk_rating_ss" }, { "type": "mean", "sourceFields": [ "params.rating_s" ], "targetField": "mean_position_d" } ] }
      (Diagram: the job is submitted to the Aggregation Service manually or via the scheduler; Spark fetches raw signals for processing from test_signals and stores aggregated results in test_signals_aggr.)
  • 16. Example: simple aggregation (continued)
      Aggregated document: { "aggr_job_id_s": "b91ffdebc44d4e128a8431c2f8a3deb7", "aggr_type_s": "simple@doc_id_s-query_s-filters_s", "flag_s": "aggr", "type_s": "rating", "id": "24494dba-93a6-4fc5-bb4d-5b546c3c0c5e", "aggr_id_s": "test_simple_aggr", "timestamp_tdt": "2015-10-15T02:26:17.337Z", "count_i": 5, "grouping_key_s": "web", "stddev_rating_d": 1.6431676725154982, "mean_position_d": 2.2, "values.topk_rating_ss": [ "2.0", "1.0", "5.0" ], "counts.topk_rating_ss": [ "2", "2", "1" ], "errors.topk_rating_ss": [ "0", "0", "0" ] }
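The numbers in the aggregated document can be checked by hand from the five rating signals. The sketch below uses only the Python standard library, not Spark or Fusion. It assumes the `stddev` aggregate is the sample standard deviation (n - 1 in the denominator), which is the variant that matches the value on the slide.

```python
import statistics
from collections import Counter

# The five "rating" values from the input signals, all with source "web".
ratings = ["5.0", "1.0", "2.0", "2.0", "1.0"]
values = [float(r) for r in ratings]

mean_rating = statistics.mean(values)     # stored as mean_position_d on the slide
stddev_rating = statistics.stdev(values)  # sample standard deviation
topk = Counter(ratings).most_common(3)    # (value, count) pairs by frequency

print(mean_rating, stddev_rating, topk)
```

Ties in the top-k list ("2.0" and "1.0" both occur twice) may come out in a different order than on the slide; the counts are what matter.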
  • 17. Example: Click aggregation (signals indexed and aggregated)
      Raw docs: [ { "timestamp": "2014-09-01T23:44:52.533Z", "params": { "query": "Sharp", "docId": "2009324" }, "type": "click" }, { "timestamp": "2014-09-05T12:25:37.420Z", "params": { "query": "Sharp", "docId": "2009324" }, "type": "click" }, { "timestamp": "2014-08-24T12:56:58.910Z", "params": { "query": "Sharp TV", "docId": "1517163" }, "type": "click" }, { "timestamp": "2015-10-25T07:18:14.722Z", "params": { "query": "rca", "docId": "2877125" }, "type": "click" } ]
      Aggregated docs: { "doc_id_s": "1517163", "query_s": "sharp tv", "weight_d": 0.000006602878329431405, "count_i": 1 }, { "doc_id_s": "2009324", "query_s": "sharp", "weight_d": 0.000016734602468204685, "count_i": 2 }, { "doc_id_s": "2877125", "query_s": "rca", "weight_d": 0.06324164569377899, "count_i": 1 }
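The aggregated weights fall off sharply with the age of the click, which is the idea behind "decaying weights". The slides do not give the decay function Fusion uses, so the sketch below substitutes a plain exponential decay with a hypothetical 30-day half-life. It reproduces the grouping and the ordering of the weights, not the exact values on the slide.

```python
from collections import defaultdict
from datetime import datetime, timezone

HALF_LIFE_DAYS = 30.0  # hypothetical; not taken from the slides


def aggregate_clicks(signals, now, half_life_days=HALF_LIFE_DAYS):
    """Group clicks by (docId, lowercased query) and sum time-decayed weights."""
    totals = defaultdict(lambda: {"weight_d": 0.0, "count_i": 0})
    for s in signals:
        ts = datetime.strptime(s["timestamp"], "%Y-%m-%dT%H:%M:%S.%fZ")
        age_days = (now - ts.replace(tzinfo=timezone.utc)).total_seconds() / 86400.0
        key = (s["params"]["docId"], s["params"]["query"].lower())
        # Each click contributes less the older it is.
        totals[key]["weight_d"] += 0.5 ** (age_days / half_life_days)
        totals[key]["count_i"] += 1
    return dict(totals)


raw = [
    {"timestamp": "2014-09-01T23:44:52.533Z", "type": "click",
     "params": {"query": "Sharp", "docId": "2009324"}},
    {"timestamp": "2014-09-05T12:25:37.420Z", "type": "click",
     "params": {"query": "Sharp", "docId": "2009324"}},
    {"timestamp": "2014-08-24T12:56:58.910Z", "type": "click",
     "params": {"query": "Sharp TV", "docId": "1517163"}},
    {"timestamp": "2015-10-25T07:18:14.722Z", "type": "click",
     "params": {"query": "rca", "docId": "2877125"}},
]
aggregated = aggregate_clicks(raw, now=datetime(2015, 11, 1, tzinfo=timezone.utc))
```

With any decay of this shape, the recent "rca" click outweighs the two year-old "sharp" clicks combined, as it does in the slide's output.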
  • 18. Driving search relevancy
      • How to mix signals with search results?
      • Recommendation API
      • Generic query pipeline configuration using a 3-stage approach:
          • Sub-query
          • Rollup-results
          • Advanced-boost
  • 19. Boosting search results using aggregated documents (diagram): the user app sends a search query through the query-pipeline stages (including Set Params, the recommendation stages and Query Solr) and receives the search response. The recommendation stages (1) query aggregated documents in the aggregated signals collection (test_signals_aggr), (2) process the results, and (3) add parameters to the request. The other collections shown are test (primary) and test_signals (raw signals).
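The three recommendation steps can be approximated in memory as follows. This is a sketch, not Fusion's implementation. `recommend_boosts` is a hypothetical function, the list `aggregated_docs` stands in for the aggregated signals collection, and the `id` field name is an assumption. The boosts are emitted as a Solr `bq` parameter for the edismax query parser.

```python
def recommend_boosts(user_query, aggregated_docs, id_field="id", top_n=10):
    """Sub-query, roll up, and boost: return Solr request parameters."""
    # 1. Sub-query: aggregated docs recorded for the same (normalised) query.
    matches = [d for d in aggregated_docs if d["query_s"] == user_query.lower()]
    # 2. Roll up: sum the weights per document id.
    rolled = {}
    for d in matches:
        rolled[d["doc_id_s"]] = rolled.get(d["doc_id_s"], 0.0) + d["weight_d"]
    # 3. Boost: add the heaviest documents as boost-query clauses.
    ranked = sorted(rolled.items(), key=lambda kv: kv[1], reverse=True)[:top_n]
    params = {"q": user_query, "defType": "edismax"}
    if ranked:
        params["bq"] = " ".join(
            '{}:"{}"^{:.6f}'.format(id_field, doc_id, weight)
            for doc_id, weight in ranked
        )
    return params


aggregated_docs = [
    {"doc_id_s": "1517163", "query_s": "sharp tv", "weight_d": 0.000006602878329431405},
    {"doc_id_s": "2009324", "query_s": "sharp", "weight_d": 0.000016734602468204685},
    {"doc_id_s": "2877125", "query_s": "rca", "weight_d": 0.06324164569377899},
]
params = recommend_boosts("RCA", aggregated_docs)
```

A query with no matching aggregated documents simply passes through without a `bq` parameter, so the search still works when no signals exist for it.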
  • 21. Event Miner aggregation
      • Calculate a co-occurrence matrix for tuples based on sessions
          • Example: (userId, query, docId)
      • Construct a DAG from the matrix data
      • Recommendations are powered from the graph at query time
      • Increases diversity in recommendations
      • See https://lucidworks.com/blog/2015/08/31/mining-events-recommendations/
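A session-based co-occurrence count can be sketched in a few lines. The session format below (a dict of session id to a list of events carrying `query` and `docId`) is an assumption made for illustration; the slides do not show how sessions are delimited. Building the DAG from the counts is not shown.

```python
from collections import Counter
from itertools import combinations


def cooccurrence(sessions):
    """Count how often two items (queries or documents) share a session."""
    counts = Counter()
    for events in sessions.values():
        # Tag each item with its kind so a query and a doc id never collide.
        items = {("query", e["query"]) for e in events}
        items |= {("doc", e["docId"]) for e in events}
        # Sorting gives every unordered pair one canonical key.
        for a, b in combinations(sorted(items), 2):
            counts[(a, b)] += 1
    return counts


sessions = {
    "s1": [{"query": "sharp", "docId": "2009324"},
           {"query": "sharp tv", "docId": "1517163"}],
    "s2": [{"query": "sharp", "docId": "2009324"}],
}
matrix = cooccurrence(sessions)
```

Because every pair within a session is counted, documents reached through different queries in the same session become linked, which is one way a graph built from these counts can surface more varied recommendations than a per-query click count.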
  • 22-26. Graph Navigation - Example Query
  • 28. Events & Signals: Using Signals = Modifying Your Behavior in Response to Your Environment