SlideShare a Scribd company logo
1 of 25
Download to read offline
Vespa, a tour
Me
Ma# Overstreet
OpenSource Connec2ons
Stuff I do:
* Solr/Elas1cSearch/Searchy-stuff
* DataStax Cassandra
* So;ware Development
What is it?
"Big data. Real -me.
The open big data serving engine:
Store, search, rank and organize big data
at user serving 8me."1
1
h$p://docs.vespa.ai/documenta5on/reference/na5verank.html
What does it do?
Use Vespa to build:
• Search applica,ons
• Personalized recommenda,on
• Naviga,on pages computed on demand
• Real,me data displays - tag clouds, maps, graphs
Configuring
Applica'on Packages
A Vespa applica+on package is the set of
configura+on files and Java plugins that
together define the behavior of a Vespa
system
Services.xml
Primary config file for an applica1on
package.
• <search> sets up the search endpoint
for Vespa queries. The default port is
8080.
• <nodes> defines the nodes required per
service. (See the reference for more on
container cluster setup.)
• <content> defines how documents are
stored and searched
Search Defini,on
Field Defini*ons:
• index: Create a search index for this
field
• a4ribute: Store this field in memory as
an a4ribute — for sor;ng, searching and
grouping
• summary: Let this field be part of the
document summary in the result set
Stopwords, Synonyms and Query Rewri4ng
[stopword] -> ; # (Replace them by nothing)
[stopword] :- and, or, the, be;
lotr -> lord of the rings;
[brand] -> company:[brand];
[brand] :- sony, dell, ibm, hp;
[category] +> $category:[category];
[category] :- laptop, digital camera, camera;
[destination] (in, by, at, on) [place] +> $name:[destination]
Linguis'cs
Default Linguis.cs
• Tokeniza*on on whitespace
• Kstemmer for stemming
• Changing linguis*cs means wri*ng code
• Only English for stemming, wai*ng for community support to
extend
See h%p://docs.vespa.ai/documenta5on/linguis5cs.html for more
informa5on.
Custom Linguis,cs
Start here: h)ps://github.com/vespa-
engine/vespa/tree/master/linguis9cs/src/
main/java/com/yahoo/language/simple
Ranking
First, Querying and a Li1le YQL
http://localhost:8080/search/
?yql=select * from sources * where userQuery()
&query=trees
Other YQL Examples
Numerics
select * from sources * where 500 >= price;
Grouping and aggregates
select * from sources * where sddocname contains 'purchase' |
all(group(customer) each(output(sum(price))));
Na#veRank
"Out of the box" ranking for Vespa combines1
:
• Field/A)ribute Match
• Proximity
Good for text ranking, but should be combined with other features
for even be9er relevancy.
1
h$p://docs.vespa.ai/documenta5on/reference/na5verank.html
Ranking Expressions
Built with query features:
nativeRank + query(deservesFreshness) * freshness(timestamp)
More Features
Feature Descrip,on
term(n).significance normalized number (between 0.0 and 1.0 describing the significance of the
term
term(n).connectedness normalized strength with which this term is connected to the previous term
queryTermCount number of terms in this query
fieldLength(name) number of terms in this field
fieldMatch(name) normalized measure of degree to which query and field matched
fieldMatch(name).queryCompleteness normalized raCo of query tokens matched in the field
fieldMatch(name).fieldCompleteness normalized raCo of query tokens which was matched
distanceToPath(name).distance euclidian distance from a path through 2d space
Full list: h*p://docs.vespa.ai/documenta6on/reference/rank-
features.html
Two Phase Ranking
search myapp {
…
rank-profile default inherits default {
first-phase {
expression: nativeRank + query(deservesFreshness) * freshness(timestamp)
}
second-phase {
expression {
0.7 * ( 0.7*fieldMatch(title) + 0.2*fieldMatch(description) + 0.1*fieldMatch(body) ) +
0.3 * attributeMatch(keywords)
}
rerank-count: 200
}
}
}
Side Note: Literal boos0ng
Vespa stems by default, but allows access to the literal value.
field title type string {
indexing: index
rank: literal
}
You can write this ranking expression:
0.9*fieldMatch(title) + 0.1*fieldMatch(title_literal)
Tensors
Mul$-dimensional arrays of values.
{
"user_id": 270,
"user_item_cf": {
"user_item_cf:0": -1.750116e-05,
"user_item_cf:1": 9.730623e-05,
"user_item_cf:2": 8.515047e-05,
"user_item_cf:3": 6.9297894e-05,
"user_item_cf:4": 7.343942e-05,
"user_item_cf:5": -0.00017635927,
"user_item_cf:6": 5.7642872e-05,
"user_item_cf:7": -6.6685796e-05,
"user_item_cf:8": 8.5506894e-05,
"user_item_cf:9": -1.7209566e-05
}
}
Searching With Tensors
rank-profile tensor {
first-phase {
expression: sum(query(user_item_cf) * attribute(user_item_cf))
}
}
Ranking with TensorFlow models
search tf {
document tf {
field document_tensor type tensor(d0[1],d1[784]) {
indexing: attribute | summary
attribute: tensor(d0[1],d1[784])
}
}
rank-profile default inherits default {
macro input_tensor() {
expression: attribute(document_tensor)
}
first-phase {
expression: sum(tensorflow("my_model/saved", "serving_default", "output"))
}
}
}
Try it
With Docker
git clone https://github.com/vespa-engine/sample-apps.git
export VESPA_SAMPLE_APPS=`pwd`/sample-apps
docker run --detach --name vespa --hostname vespa-container 
--privileged --volume $VESPA_SAMPLE_APPS:/vespa-sample-apps 
--publish 8080:8080 vespaengine/vespa
h"p://docs.vespa.ai/documenta3on/vespa-quick-start.html

More Related Content

What's hot

Leveraging Lucene/Solr as a Knowledge Graph and Intent Engine
Leveraging Lucene/Solr as a Knowledge Graph and Intent EngineLeveraging Lucene/Solr as a Knowledge Graph and Intent Engine
Leveraging Lucene/Solr as a Knowledge Graph and Intent EngineTrey Grainger
 
Reflected intelligence evolving self-learning data systems
Reflected intelligence  evolving self-learning data systemsReflected intelligence  evolving self-learning data systems
Reflected intelligence evolving self-learning data systemsTrey Grainger
 
The Intent Algorithms of Search & Recommendation Engines
The Intent Algorithms of Search & Recommendation EnginesThe Intent Algorithms of Search & Recommendation Engines
The Intent Algorithms of Search & Recommendation EnginesTrey Grainger
 
Self-learned Relevancy with Apache Solr
Self-learned Relevancy with Apache SolrSelf-learned Relevancy with Apache Solr
Self-learned Relevancy with Apache SolrTrey Grainger
 
Haystack- Learning to rank in an hourly job market
Haystack- Learning to rank in an hourly job market Haystack- Learning to rank in an hourly job market
Haystack- Learning to rank in an hourly job market Xun Wang
 
Searching on Intent: Knowledge Graphs, Personalization, and Contextual Disamb...
Searching on Intent: Knowledge Graphs, Personalization, and Contextual Disamb...Searching on Intent: Knowledge Graphs, Personalization, and Contextual Disamb...
Searching on Intent: Knowledge Graphs, Personalization, and Contextual Disamb...Trey Grainger
 
Intent Algorithms: The Data Science of Smart Information Retrieval Systems
Intent Algorithms: The Data Science of Smart Information Retrieval SystemsIntent Algorithms: The Data Science of Smart Information Retrieval Systems
Intent Algorithms: The Data Science of Smart Information Retrieval SystemsTrey Grainger
 
Building Search & Recommendation Engines
Building Search & Recommendation EnginesBuilding Search & Recommendation Engines
Building Search & Recommendation EnginesTrey Grainger
 
The Apache Solr Smart Data Ecosystem
The Apache Solr Smart Data EcosystemThe Apache Solr Smart Data Ecosystem
The Apache Solr Smart Data EcosystemTrey Grainger
 
Reflected Intelligence: Lucene/Solr as a self-learning data system
Reflected Intelligence: Lucene/Solr as a self-learning data systemReflected Intelligence: Lucene/Solr as a self-learning data system
Reflected Intelligence: Lucene/Solr as a self-learning data systemTrey Grainger
 
Building a Real-time Solr-powered Recommendation Engine
Building a Real-time Solr-powered Recommendation EngineBuilding a Real-time Solr-powered Recommendation Engine
Building a Real-time Solr-powered Recommendation Enginelucenerevolution
 
Automatically Build Solr Synonyms List using Machine Learning - Chao Han, Luc...
Automatically Build Solr Synonyms List using Machine Learning - Chao Han, Luc...Automatically Build Solr Synonyms List using Machine Learning - Chao Han, Luc...
Automatically Build Solr Synonyms List using Machine Learning - Chao Han, Luc...Lucidworks
 
Searching and Querying Knowledge Graphs with Solr/SIREn - A Reference Archite...
Searching and Querying Knowledge Graphs with Solr/SIREn - A Reference Archite...Searching and Querying Knowledge Graphs with Solr/SIREn - A Reference Archite...
Searching and Querying Knowledge Graphs with Solr/SIREn - A Reference Archite...Lucidworks
 
Reflected Intelligence - Lucene/Solr as a self-learning data system: Presente...
Reflected Intelligence - Lucene/Solr as a self-learning data system: Presente...Reflected Intelligence - Lucene/Solr as a self-learning data system: Presente...
Reflected Intelligence - Lucene/Solr as a self-learning data system: Presente...Lucidworks
 
Search Accuracy Metrics and Predictive Analytics - A Big Data Use Case: Prese...
Search Accuracy Metrics and Predictive Analytics - A Big Data Use Case: Prese...Search Accuracy Metrics and Predictive Analytics - A Big Data Use Case: Prese...
Search Accuracy Metrics and Predictive Analytics - A Big Data Use Case: Prese...Lucidworks
 
Thought Vectors and Knowledge Graphs in AI-powered Search
Thought Vectors and Knowledge Graphs in AI-powered SearchThought Vectors and Knowledge Graphs in AI-powered Search
Thought Vectors and Knowledge Graphs in AI-powered SearchTrey Grainger
 
Semantic & Multilingual Strategies in Lucene/Solr
Semantic & Multilingual Strategies in Lucene/SolrSemantic & Multilingual Strategies in Lucene/Solr
Semantic & Multilingual Strategies in Lucene/SolrTrey Grainger
 
Anyone Can Build A Recommendation Engine With Solr: Presented by Doug Turnbul...
Anyone Can Build A Recommendation Engine With Solr: Presented by Doug Turnbul...Anyone Can Build A Recommendation Engine With Solr: Presented by Doug Turnbul...
Anyone Can Build A Recommendation Engine With Solr: Presented by Doug Turnbul...Lucidworks
 
The Semantic Knowledge Graph
The Semantic Knowledge GraphThe Semantic Knowledge Graph
The Semantic Knowledge GraphTrey Grainger
 

What's hot (20)

Leveraging Lucene/Solr as a Knowledge Graph and Intent Engine
Leveraging Lucene/Solr as a Knowledge Graph and Intent EngineLeveraging Lucene/Solr as a Knowledge Graph and Intent Engine
Leveraging Lucene/Solr as a Knowledge Graph and Intent Engine
 
Reflected intelligence evolving self-learning data systems
Reflected intelligence  evolving self-learning data systemsReflected intelligence  evolving self-learning data systems
Reflected intelligence evolving self-learning data systems
 
Extracting keywords from texts - Sanda Martincic Ipsic
Extracting keywords from texts - Sanda Martincic IpsicExtracting keywords from texts - Sanda Martincic Ipsic
Extracting keywords from texts - Sanda Martincic Ipsic
 
The Intent Algorithms of Search & Recommendation Engines
The Intent Algorithms of Search & Recommendation EnginesThe Intent Algorithms of Search & Recommendation Engines
The Intent Algorithms of Search & Recommendation Engines
 
Self-learned Relevancy with Apache Solr
Self-learned Relevancy with Apache SolrSelf-learned Relevancy with Apache Solr
Self-learned Relevancy with Apache Solr
 
Haystack- Learning to rank in an hourly job market
Haystack- Learning to rank in an hourly job market Haystack- Learning to rank in an hourly job market
Haystack- Learning to rank in an hourly job market
 
Searching on Intent: Knowledge Graphs, Personalization, and Contextual Disamb...
Searching on Intent: Knowledge Graphs, Personalization, and Contextual Disamb...Searching on Intent: Knowledge Graphs, Personalization, and Contextual Disamb...
Searching on Intent: Knowledge Graphs, Personalization, and Contextual Disamb...
 
Intent Algorithms: The Data Science of Smart Information Retrieval Systems
Intent Algorithms: The Data Science of Smart Information Retrieval SystemsIntent Algorithms: The Data Science of Smart Information Retrieval Systems
Intent Algorithms: The Data Science of Smart Information Retrieval Systems
 
Building Search & Recommendation Engines
Building Search & Recommendation EnginesBuilding Search & Recommendation Engines
Building Search & Recommendation Engines
 
The Apache Solr Smart Data Ecosystem
The Apache Solr Smart Data EcosystemThe Apache Solr Smart Data Ecosystem
The Apache Solr Smart Data Ecosystem
 
Reflected Intelligence: Lucene/Solr as a self-learning data system
Reflected Intelligence: Lucene/Solr as a self-learning data systemReflected Intelligence: Lucene/Solr as a self-learning data system
Reflected Intelligence: Lucene/Solr as a self-learning data system
 
Building a Real-time Solr-powered Recommendation Engine
Building a Real-time Solr-powered Recommendation EngineBuilding a Real-time Solr-powered Recommendation Engine
Building a Real-time Solr-powered Recommendation Engine
 
Automatically Build Solr Synonyms List using Machine Learning - Chao Han, Luc...
Automatically Build Solr Synonyms List using Machine Learning - Chao Han, Luc...Automatically Build Solr Synonyms List using Machine Learning - Chao Han, Luc...
Automatically Build Solr Synonyms List using Machine Learning - Chao Han, Luc...
 
Searching and Querying Knowledge Graphs with Solr/SIREn - A Reference Archite...
Searching and Querying Knowledge Graphs with Solr/SIREn - A Reference Archite...Searching and Querying Knowledge Graphs with Solr/SIREn - A Reference Archite...
Searching and Querying Knowledge Graphs with Solr/SIREn - A Reference Archite...
 
Reflected Intelligence - Lucene/Solr as a self-learning data system: Presente...
Reflected Intelligence - Lucene/Solr as a self-learning data system: Presente...Reflected Intelligence - Lucene/Solr as a self-learning data system: Presente...
Reflected Intelligence - Lucene/Solr as a self-learning data system: Presente...
 
Search Accuracy Metrics and Predictive Analytics - A Big Data Use Case: Prese...
Search Accuracy Metrics and Predictive Analytics - A Big Data Use Case: Prese...Search Accuracy Metrics and Predictive Analytics - A Big Data Use Case: Prese...
Search Accuracy Metrics and Predictive Analytics - A Big Data Use Case: Prese...
 
Thought Vectors and Knowledge Graphs in AI-powered Search
Thought Vectors and Knowledge Graphs in AI-powered SearchThought Vectors and Knowledge Graphs in AI-powered Search
Thought Vectors and Knowledge Graphs in AI-powered Search
 
Semantic & Multilingual Strategies in Lucene/Solr
Semantic & Multilingual Strategies in Lucene/SolrSemantic & Multilingual Strategies in Lucene/Solr
Semantic & Multilingual Strategies in Lucene/Solr
 
Anyone Can Build A Recommendation Engine With Solr: Presented by Doug Turnbul...
Anyone Can Build A Recommendation Engine With Solr: Presented by Doug Turnbul...Anyone Can Build A Recommendation Engine With Solr: Presented by Doug Turnbul...
Anyone Can Build A Recommendation Engine With Solr: Presented by Doug Turnbul...
 
The Semantic Knowledge Graph
The Semantic Knowledge GraphThe Semantic Knowledge Graph
The Semantic Knowledge Graph
 

Similar to Vespa tour - an introduction to building search and recommendation applications

Productionizing your Streaming Jobs
Productionizing your Streaming JobsProductionizing your Streaming Jobs
Productionizing your Streaming JobsDatabricks
 
Tez Data Processing over Yarn
Tez Data Processing over YarnTez Data Processing over Yarn
Tez Data Processing over YarnInMobi Technology
 
Aprovisionamiento multi-proveedor con Terraform - Plain Concepts DevOps day
Aprovisionamiento multi-proveedor con Terraform  - Plain Concepts DevOps dayAprovisionamiento multi-proveedor con Terraform  - Plain Concepts DevOps day
Aprovisionamiento multi-proveedor con Terraform - Plain Concepts DevOps dayPlain Concepts
 
Annotating search results from web databases-IEEE Transaction Paper 2013
Annotating search results from web databases-IEEE Transaction Paper 2013Annotating search results from web databases-IEEE Transaction Paper 2013
Annotating search results from web databases-IEEE Transaction Paper 2013Yadhu Kiran
 
Finding the right stuff, an intro to Elasticsearch (at Rug::B)
Finding the right stuff, an intro to Elasticsearch (at Rug::B) Finding the right stuff, an intro to Elasticsearch (at Rug::B)
Finding the right stuff, an intro to Elasticsearch (at Rug::B) Michael Reinsch
 
StackMate - CloudFormation for CloudStack
StackMate - CloudFormation for CloudStackStackMate - CloudFormation for CloudStack
StackMate - CloudFormation for CloudStackChiradeep Vittal
 
Get started with R lang
Get started with R langGet started with R lang
Get started with R langsenthil0809
 
Data Quality, Correctness and Dynamic Transformations using Spark and Scala
Data Quality, Correctness and Dynamic Transformations using Spark and ScalaData Quality, Correctness and Dynamic Transformations using Spark and Scala
Data Quality, Correctness and Dynamic Transformations using Spark and ScalaSubhasish Guha
 
About elasticsearch
About elasticsearchAbout elasticsearch
About elasticsearchMinsoo Jun
 
Spring 3 - Der dritte Frühling
Spring 3 - Der dritte FrühlingSpring 3 - Der dritte Frühling
Spring 3 - Der dritte FrühlingThorsten Kamann
 
Apache Calcite: A Foundational Framework for Optimized Query Processing Over ...
Apache Calcite: A Foundational Framework for Optimized Query Processing Over ...Apache Calcite: A Foundational Framework for Optimized Query Processing Over ...
Apache Calcite: A Foundational Framework for Optimized Query Processing Over ...Julian Hyde
 
Lambda Architecture with Spark, Spark Streaming, Kafka, Cassandra, Akka and S...
Lambda Architecture with Spark, Spark Streaming, Kafka, Cassandra, Akka and S...Lambda Architecture with Spark, Spark Streaming, Kafka, Cassandra, Akka and S...
Lambda Architecture with Spark, Spark Streaming, Kafka, Cassandra, Akka and S...Helena Edelson
 
6° Sessione - Ambiti applicativi nella ricerca di tecnologie statistiche avan...
6° Sessione - Ambiti applicativi nella ricerca di tecnologie statistiche avan...6° Sessione - Ambiti applicativi nella ricerca di tecnologie statistiche avan...
6° Sessione - Ambiti applicativi nella ricerca di tecnologie statistiche avan...Jürgen Ambrosi
 
AWS Webcast - Build high-scale applications with Amazon DynamoDB
AWS Webcast - Build high-scale applications with Amazon DynamoDBAWS Webcast - Build high-scale applications with Amazon DynamoDB
AWS Webcast - Build high-scale applications with Amazon DynamoDBAmazon Web Services
 
Embrace NoSQL and Eventual Consistency with Ripple
Embrace NoSQL and Eventual Consistency with RippleEmbrace NoSQL and Eventual Consistency with Ripple
Embrace NoSQL and Eventual Consistency with RippleSean Cribbs
 
ETL to ML: Use Apache Spark as an end to end tool for Advanced Analytics
ETL to ML: Use Apache Spark as an end to end tool for Advanced AnalyticsETL to ML: Use Apache Spark as an end to end tool for Advanced Analytics
ETL to ML: Use Apache Spark as an end to end tool for Advanced AnalyticsMiklos Christine
 

Similar to Vespa tour - an introduction to building search and recommendation applications (20)

Productionizing your Streaming Jobs
Productionizing your Streaming JobsProductionizing your Streaming Jobs
Productionizing your Streaming Jobs
 
Tez Data Processing over Yarn
Tez Data Processing over YarnTez Data Processing over Yarn
Tez Data Processing over Yarn
 
Aprovisionamiento multi-proveedor con Terraform - Plain Concepts DevOps day
Aprovisionamiento multi-proveedor con Terraform  - Plain Concepts DevOps dayAprovisionamiento multi-proveedor con Terraform  - Plain Concepts DevOps day
Aprovisionamiento multi-proveedor con Terraform - Plain Concepts DevOps day
 
Couchbas for dummies
Couchbas for dummiesCouchbas for dummies
Couchbas for dummies
 
Annotating search results from web databases-IEEE Transaction Paper 2013
Annotating search results from web databases-IEEE Transaction Paper 2013Annotating search results from web databases-IEEE Transaction Paper 2013
Annotating search results from web databases-IEEE Transaction Paper 2013
 
Finding the right stuff, an intro to Elasticsearch (at Rug::B)
Finding the right stuff, an intro to Elasticsearch (at Rug::B) Finding the right stuff, an intro to Elasticsearch (at Rug::B)
Finding the right stuff, an intro to Elasticsearch (at Rug::B)
 
StackMate - CloudFormation for CloudStack
StackMate - CloudFormation for CloudStackStackMate - CloudFormation for CloudStack
StackMate - CloudFormation for CloudStack
 
Get started with R lang
Get started with R langGet started with R lang
Get started with R lang
 
Data Quality, Correctness and Dynamic Transformations using Spark and Scala
Data Quality, Correctness and Dynamic Transformations using Spark and ScalaData Quality, Correctness and Dynamic Transformations using Spark and Scala
Data Quality, Correctness and Dynamic Transformations using Spark and Scala
 
About elasticsearch
About elasticsearchAbout elasticsearch
About elasticsearch
 
Spring 3 - Der dritte Frühling
Spring 3 - Der dritte FrühlingSpring 3 - Der dritte Frühling
Spring 3 - Der dritte Frühling
 
Apache Calcite: A Foundational Framework for Optimized Query Processing Over ...
Apache Calcite: A Foundational Framework for Optimized Query Processing Over ...Apache Calcite: A Foundational Framework for Optimized Query Processing Over ...
Apache Calcite: A Foundational Framework for Optimized Query Processing Over ...
 
Lambda Architecture with Spark, Spark Streaming, Kafka, Cassandra, Akka and S...
Lambda Architecture with Spark, Spark Streaming, Kafka, Cassandra, Akka and S...Lambda Architecture with Spark, Spark Streaming, Kafka, Cassandra, Akka and S...
Lambda Architecture with Spark, Spark Streaming, Kafka, Cassandra, Akka and S...
 
6° Sessione - Ambiti applicativi nella ricerca di tecnologie statistiche avan...
6° Sessione - Ambiti applicativi nella ricerca di tecnologie statistiche avan...6° Sessione - Ambiti applicativi nella ricerca di tecnologie statistiche avan...
6° Sessione - Ambiti applicativi nella ricerca di tecnologie statistiche avan...
 
AWS Webcast - Build high-scale applications with Amazon DynamoDB
AWS Webcast - Build high-scale applications with Amazon DynamoDBAWS Webcast - Build high-scale applications with Amazon DynamoDB
AWS Webcast - Build high-scale applications with Amazon DynamoDB
 
Embrace NoSQL and Eventual Consistency with Ripple
Embrace NoSQL and Eventual Consistency with RippleEmbrace NoSQL and Eventual Consistency with Ripple
Embrace NoSQL and Eventual Consistency with Ripple
 
Effective terraform
Effective terraformEffective terraform
Effective terraform
 
Caerusone
CaerusoneCaerusone
Caerusone
 
ETL to ML: Use Apache Spark as an end to end tool for Advanced Analytics
ETL to ML: Use Apache Spark as an end to end tool for Advanced AnalyticsETL to ML: Use Apache Spark as an end to end tool for Advanced Analytics
ETL to ML: Use Apache Spark as an end to end tool for Advanced Analytics
 
Apache phoenix
Apache phoenixApache phoenix
Apache phoenix
 

Recently uploaded

Google AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAGGoogle AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAGSujit Pal
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxMalak Abu Hammad
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Igalia
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure servicePooja Nehwal
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Allon Mureinik
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking MenDelhi Call girls
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitecturePixlogix Infotech
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 

Recently uploaded (20)

Google AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAGGoogle AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAG
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC Architecture
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 

Vespa tour - an introduction to building search and recommendation applications

  • 2. Me Ma# Overstreet OpenSource Connec2ons Stuff I do: * Solr/Elas1cSearch/Searchy-stuff * DataStax Cassandra * So;ware Development
  • 3. What is it? "Big data. Real -me. The open big data serving engine: Store, search, rank and organize big data at user serving 8me."1 1 h$p://docs.vespa.ai/documenta5on/reference/na5verank.html
  • 4. What does it do? Use Vespa to build: • Search applica,ons • Personalized recommenda,on • Naviga,on pages computed on demand • Real,me data displays - tag clouds, maps, graphs
  • 6. Applica'on Packages A Vespa applica+on package is the set of configura+on files and Java plugins that together define the behavior of a Vespa system
  • 7. Services.xml Primary config file for an applica1on package. • <search> sets up the search endpoint for Vespa queries. The default port is 8080. • <nodes> defines the nodes required per service. (See the reference for more on container cluster setup.) • <content> defines how documents are stored and searched
  • 8. Search Defini,on Field Defini*ons: • index: Create a search index for this field • a4ribute: Store this field in memory as an a4ribute — for sor;ng, searching and grouping • summary: Let this field be part of the document summary in the result set
  • 9. Stopwords, Synonyms and Query Rewri4ng [stopword] -> ; # (Replace them by nothing) [stopword] :- and, or, the, be; lotr -> lord of the rings; [brand] -> company:[brand]; [brand] :- sony, dell, ibm, hp; [category] +> $category:[category]; [category] :- laptop, digital camera, camera; [destination] (in, by, at, on) [place] +> $name:[destination]
  • 11. Default Linguis.cs • Tokeniza*on on whitespace • Kstemmer for stemming • Changing linguis*cs means wri*ng code • Only English for stemming, wai*ng for community support to extend See h%p://docs.vespa.ai/documenta5on/linguis5cs.html for more informa5on.
  • 12. Custom Linguis,cs Start here: h)ps://github.com/vespa- engine/vespa/tree/master/linguis9cs/src/ main/java/com/yahoo/language/simple
  • 14. First, Querying and a Li1le YQL http://localhost:8080/search/ ?yql=select * from sources * where userQuery() &query=trees
  • 15. Other YQL Examples Numerics select * from sources * where 500 >= price; Grouping and aggregates select * from sources * where sddocname contains 'purchase' | all(group(customer) each(output(sum(price))));
  • 16. Na#veRank "Out of the box" ranking for Vespa combines1 : • Field/A)ribute Match • Proximity Good for text ranking, but should be combined with other features for even be9er relevancy. 1 h$p://docs.vespa.ai/documenta5on/reference/na5verank.html
  • 17. Ranking Expressions Built with query features: nativeRank + query(deservesFreshness) * freshness(timestamp)
  • 18. More Features Feature Descrip,on term(n).significance normalized number (between 0.0 and 1.0 describing the significance of the term term(n).connectedness normalized strength with which this term is connected to the previous term queryTermCount number of terms in this query fieldLength(name) number of terms in this field fieldMatch(name) normalized measure of degree to which query and field matched fieldMatch(name).queryCompleteness normalized raCo of query tokens matched in the field fieldMatch(name).fieldCompleteness normalized raCo of query tokens which was matched distanceToPath(name).distance euclidian distance from a path through 2d space Full list: h*p://docs.vespa.ai/documenta6on/reference/rank- features.html
  • 19. Two Phase Ranking search myapp { … rank-profile default inherits default { first-phase { expression: nativeRank + query(deservesFreshness) * freshness(timestamp) } second-phase { expression { 0.7 * ( 0.7*fieldMatch(title) + 0.2*fieldMatch(description) + 0.1*fieldMatch(body) ) + 0.3 * attributeMatch(keywords) } rerank-count: 200 } } }
  • 20. Side Note: Literal boos0ng Vespa stems by default, but allows access to the literal value. field title type string { indexing: index rank: literal } You can write this ranking expression: 0.9*fieldMatch(title) + 0.1*fieldMatch(title_literal)
  • 21. Tensors Mul$-dimensional arrays of values. { "user_id": 270, "user_item_cf": { "user_item_cf:0": -1.750116e-05, "user_item_cf:1": 9.730623e-05, "user_item_cf:2": 8.515047e-05, "user_item_cf:3": 6.9297894e-05, "user_item_cf:4": 7.343942e-05, "user_item_cf:5": -0.00017635927, "user_item_cf:6": 5.7642872e-05, "user_item_cf:7": -6.6685796e-05, "user_item_cf:8": 8.5506894e-05, "user_item_cf:9": -1.7209566e-05 } }
  • 22. Searching With Tensors rank-profile tensor { first-phase { expression: sum(query(user_item_cf) * attribute(user_item_cf)) } }
  • 23. Ranking with TensorFlow models search tf { document tf { field document_tensor type tensor(d0[1],d1[784]) { indexing: attribute | summary attribute: tensor(d0[1],d1[784]) } } rank-profile default inherits default { macro input_tensor() { expression: attribute(document_tensor) } first-phase { expression: sum(tensorflow("my_model/saved", "serving_default", "output")) } } }
  • 25. With Docker git clone https://github.com/vespa-engine/sample-apps.git export VESPA_SAMPLE_APPS=`pwd`/sample-apps docker run --detach --name vespa --hostname vespa-container --privileged --volume $VESPA_SAMPLE_APPS:/vespa-sample-apps --publish 8080:8080 vespaengine/vespa h"p://docs.vespa.ai/documenta3on/vespa-quick-start.html