2. Customizing Ranking Models for Enterprise Search
Ammar Haris, Lead Software Engineer, Salesforce
Joe Zeimen, Senior Software Engineer, Salesforce
3. Forward-Looking Statements
Statement under the Private Securities Litigation Reform Act of 1995:
This presentation may contain forward-looking statements that involve risks, uncertainties, and assumptions. If any such uncertainties
materialize or if any of the assumptions proves incorrect, the results of salesforce.com, inc. could differ materially from the results expressed or
implied by the forward-looking statements we make. All statements other than statements of historical fact could be deemed forward-looking,
including any projections of product or service availability, subscriber growth, earnings, revenues, or other financial items and any statements
regarding strategies or plans of management for future operations, statements of belief, any statements concerning new, planned, or upgraded
services or technology developments and customer contracts or use of our services.
The risks and uncertainties referred to above include – but are not limited to – risks associated with developing and delivering new functionality
for our service, new products and services, our new business model, our past operating losses, possible fluctuations in our operating results and
rate of growth, interruptions or delays in our Web hosting, breach of our security measures, the outcome of any litigation, risks associated with
completed and any possible mergers and acquisitions, the immature market in which we operate, our relatively limited operating history, our
ability to expand, retain, and motivate our employees and manage our growth, new releases of our service and successful customer
deployment, our limited history reselling non-salesforce.com products, and utilization and selling to larger enterprise customers. Further
information on potential factors that could affect the financial results of salesforce.com, inc. is included in our annual report on Form 10-K for the
most recent fiscal year and in our quarterly report on Form 10-Q for the most recent fiscal quarter. These documents and others containing
important disclosures are available on the SEC Filings section of the Investor Information section of our Web site.
Any unreleased services or features referenced in this or other presentations, press releases or public statements are not currently available
and may not be delivered on time or at all. Customers who purchase our services should make the purchase decisions based upon features that
are currently available. Salesforce.com, inc. assumes no obligation and does not intend to update these forward-looking statements.
4. Outline
● Overview of Search @ Salesforce
● Relevance for Enterprise Search
● Executing custom machine-learned models in Solr
○ Using Function Queries
○ Leveraging SearchComponent
8. Search @ Salesforce
● Most used feature of Salesforce
● 450 Billion documents
● 90 Million queries per day
● Multiple entry points
○ Web/Mobile
○ Salesforce Object Search Language (SOSL) API
9. How We Index 450 Billion Documents @ Salesforce
[Diagram: indexing flow across the App Tier, Message Queue, Cron Job, Entity Table, Index Metadata, and Solr Tier (Solr Cores)]
1. Create/Update
2. SQL
3. Trigger
4. Polling
5. Enqueue last unindexed
6. Lookup core
7. Fetch data
8. Send for indexing
10. Querying @ Salesforce (90 Million per day)
[Diagram: Query Front End, Querying Service, Index Metadata, Database, Solr Tier (Solr Cores)]
12. Querying @ Salesforce (90 Million per day)
[Diagram: Query Front End, Querying Service, Database, Solr Tier (Solr Cores); labels: Records, Access Checks]
13. Relevance for Enterprise Search
Challenges:
● A single search engine needs to cater to multiple customers/organizations.
● Many different types of structured and unstructured documents may exist for a single organization.
● Ranking models: one size fits all may not work well across different organizations/document types.
14. Relevance @ Salesforce
3 Tier Search Relevance
● L0 - Preliminary ranking of documents matched against a given search query based on
similarity score.
● L1 - Additional document and user level attributes used to further refine the ranked
documents
● L2 - Final level of document aggregation and re-ranking/sorting
15. Relevance @ Salesforce
3 Tier Search Relevance
● L0 - Solr Level Relevance (primarily based on TF-IDF and some field-level boosts). Does not have access to query-independent document-level features.
● L1 - Application Level Relevance: Static Rank/Query-Independent document scoring and re-ranking on the top 250 documents only, due to performance constraints.
● L2 - Database Level Relevance: re-ranking of the top 25 documents based on features available only during the final DB query for user access checks.
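The three tiers form a cascade in which each later tier re-scores only the head of the list produced by the tier before it. A minimal Java sketch of that idea follows. The class `TieredRerank`, the `Doc` type, and both scorers are hypothetical, and the cut-offs are shrunk from the 250 and 25 on the slide so the example stays small.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.function.ToDoubleFunction;

/** Hypothetical sketch of a re-ranking cascade; not the production ranker. */
final class TieredRerank {

    /** Illustrative scored document. */
    static final class Doc {
        final String id;
        final double score;

        Doc(String id, double score) {
            this.id = id;
            this.score = score;
        }

        @Override
        public String toString() {
            return id + "=" + score;
        }
    }

    /**
     * Re-scores only the first topN documents of an already ranked list and
     * re-sorts that head. The tail keeps its original order and scores.
     */
    static List<Doc> rerankTop(List<Doc> ranked, int topN, ToDoubleFunction<Doc> scorer) {
        int cut = Math.min(topN, ranked.size());
        List<Doc> head = new ArrayList<>();
        for (Doc d : ranked.subList(0, cut)) {
            head.add(new Doc(d.id, scorer.applyAsDouble(d)));
        }
        head.sort(Comparator.comparingDouble((Doc d) -> d.score).reversed());
        List<Doc> out = new ArrayList<>(head);
        out.addAll(ranked.subList(cut, ranked.size()));
        return out;
    }

    public static void main(String[] args) {
        // L0 output, already sorted by similarity score.
        List<Doc> l0 = List.of(new Doc("a", 3.0), new Doc("b", 2.0), new Doc("c", 1.0));
        // Placeholder L1 scorer (boosts "b"), applied to the top 2; the slide uses 250.
        List<Doc> l1 = rerankTop(l0, 2, d -> d.id.equals("b") ? d.score + 5.0 : d.score);
        // Placeholder L2 scorer (identity), applied to the top 1; the slide uses 25.
        List<Doc> l2 = rerankTop(l1, 1, d -> d.score);
        System.out.println(l2); // prints [b=7.0, a=3.0, c=1.0]
    }
}
```

Documents below a tier's cut-off never see that tier's features, which is the limitation the later slides set out to relax.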
16. Moving Search Relevance to Solr
Intent
● Have all 3 tiers of relevance co-hosted and abstracted out of the application tier.
Motivation
● Apply the static rank to a wider set of documents rather than a limited set
● Create the ability to run more complex models
● Provide additional flexibility to the multi-layered machine learning ranking framework.
17. Relevance @ Salesforce - Original Architecture
[Diagram: Search Server and App Server, with DB and Index]
● L0 Ranker (tf, idf, coefs, field boost)
● L1 Ranker (features: popularity, inbound links)
● L2 Ranker (result aggregation + re-ranking)
● Config (coefs)
● Data flow labels: Query, coefs; Id, score; Query; Id, score
18. Relevance @ Salesforce - Original Architecture
[Diagram: Search Server and App Server, with DB and Index]
● L0 Ranker (tf, idf, coefs, field boosts)
● L1 Ranker (features, freshness, popularity)
● L2 Ranker (result aggregation)
● Config (coefs)
● Data flow labels: Query, coefs; Id, score; Query; Id, score; Id, score, features
19. L2 Ranker
The document aggregator lives in the app.
In the application tier, results from multiple Solr cores are merged together.
Scores are normalized over the maximum score, and cross-core documents are re-ranked based on the final Solr score.
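The merge step described on this slide can be sketched in Java as follows. This is an illustration only: the slide does not say whether the maximum is taken per core or globally, so the sketch assumes a per-core maximum, and the names `CrossCoreMerge` and `Hit` are hypothetical.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of cross-core merging; assumes per-core max normalization. */
final class CrossCoreMerge {

    /** Illustrative search hit returned by one core. */
    static final class Hit {
        final String id;
        final double score;

        Hit(String id, double score) {
            this.id = id;
            this.score = score;
        }
    }

    /** Divides each score by its core's maximum, then sorts all hits by the result. */
    static List<Hit> merge(Map<String, List<Hit>> hitsByCore) {
        List<Hit> merged = new ArrayList<>();
        for (List<Hit> hits : hitsByCore.values()) {
            double max = 0.0;
            for (Hit h : hits) {
                max = Math.max(max, h.score);
            }
            for (Hit h : hits) {
                // Guard against an empty or all-zero core.
                merged.add(new Hit(h.id, max > 0.0 ? h.score / max : 0.0));
            }
        }
        merged.sort(Comparator.comparingDouble((Hit h) -> h.score).reversed());
        return merged;
    }

    public static void main(String[] args) {
        Map<String, List<Hit>> hitsByCore = new LinkedHashMap<>();
        hitsByCore.put("core1", List.of(new Hit("a", 10.0), new Hit("b", 5.0)));
        hitsByCore.put("core2", List.of(new Hit("c", 2.0), new Hit("d", 1.5)));
        for (Hit h : merge(hitsByCore)) {
            System.out.println(h.id + " " + h.score);
        }
    }
}
```

Normalizing first keeps a core with large raw scores from crowding out every hit from a core with small ones: here "d" (raw 1.5) ends up ahead of "b" (raw 5.0).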
20. L0 Ranker
Basic tf-idf Similarity Score
● Leverages relevance-related features provided by Solr out of the box:
○ Boost specific fields of the document, if matched.
■ Title/Name field, document owner id field
○ De-boost documents on specific fields.
■ Is record inactive
○ Use function queries to apply custom linear functions on select features.
■ product(8.429,floor(div(max(0,log(floor(product(pow(0.98,div(sub(ms(),feature.pageViewsLastUpdated),84600000)),feature.pageViews)))),log(2.718))))
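The nested function query above is easier to read when unrolled. The Java sketch below mirrors it term by term as a reading aid, not as code from the talk. It assumes that Solr's `log` is base 10 (so dividing by `log(2.718)` approximates a natural logarithm) and that `feature.pageViewsLastUpdated` holds epoch milliseconds. The constant 84600000 is copied from the slide as written. The class and method names are hypothetical.

```java
/** Hypothetical unrolling of the slide's page-view function query. */
final class DecayedPageViews {

    /**
     * Mirrors: product(8.429, floor(div(max(0, log(floor(product(
     *   pow(0.98, div(sub(ms(), lastUpdated), 84600000)), pageViews)))), log(2.718))))
     */
    static double score(long nowMs, long lastUpdatedMs, double pageViews) {
        // Age of the page-view counter, in units of the slide's constant.
        double age = (nowMs - lastUpdatedMs) / 84_600_000.0;
        // Exponential decay: the count loses 2% per unit of age.
        double decayed = Math.floor(Math.pow(0.98, age) * pageViews);
        // log10(0) is -Infinity, so max(0, ...) clamps empty counters to zero.
        double logDecayed = Math.max(0.0, Math.log10(decayed));
        // Change of base to (approximately) natural log, then bucket and scale.
        return 8.429 * Math.floor(logDecayed / Math.log10(2.718));
    }

    public static void main(String[] args) {
        long now = System.currentTimeMillis();
        System.out.println(score(now, now, 1000.0)); // fresh counter
        System.out.println(score(now, now, 0.0));    // no page views
    }
}
```

The outer `floor` turns the boost into a step function of the decayed count, so documents with roughly similar popularity receive the same boost.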
21. L1 Ranker (DeepRanker)
● Re-ranker: allows running more complex/expensive models on a subset of matched documents
● Consumes features from stored fields and doc value fields
● Enables usage of the same feature vector across different ranking layers
● Features consumed and cached during L0 ranker execution may be used in the L1 ranker as well
● Makes it easy to plug in different kinds of relevance models (boosted decision trees, polynomials, etc.)
23. Model Execution/Deep Ranker (Solr SearchComponent)
[Diagram: DeepRanker extends SearchComponent. It receives a ResponseBuilder with the model id in the query params, selects among Model A, Model B, and Model C (each with a run() method and a Feature Extractor), and returns scored results.]
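The per-request model selection shown in the diagram can be sketched without any Solr dependency. In the Java sketch below, the parameter name `modelId`, the class `ModelRegistry`, and the reduction of a model to a function over named features are all assumptions made for illustration. The real component reads its parameters from Solr's ResponseBuilder.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.ToDoubleFunction;

/** Hypothetical sketch of per-request model selection; not the DeepRanker source. */
final class ModelRegistry {

    // Each model is reduced to a function from named feature values to a score.
    private final Map<String, ToDoubleFunction<Map<String, Double>>> models = new HashMap<>();

    void register(String modelId, ToDoubleFunction<Map<String, Double>> model) {
        models.put(modelId, model);
    }

    /** Looks up the model named by the assumed "modelId" parameter and runs it. */
    double score(Map<String, String> queryParams, Map<String, Double> features) {
        String modelId = queryParams.get("modelId");
        ToDoubleFunction<Map<String, Double>> model = models.get(modelId);
        if (model == null) {
            throw new IllegalArgumentException("Unknown model id: " + modelId);
        }
        return model.applyAsDouble(features);
    }

    public static void main(String[] args) {
        ModelRegistry registry = new ModelRegistry();
        registry.register("A", f -> 2.0 * f.get("popularity"));
        registry.register("B", f -> 1.0 + f.get("popularity"));

        Map<String, Double> features = Map.of("popularity", 3.0);
        System.out.println(registry.score(Map.of("modelId", "A"), features)); // prints 6.0
        System.out.println(registry.score(Map.of("modelId", "B"), features)); // prints 4.0
    }
}
```

Keying the lookup on a request parameter is what lets different organizations or object types use different models against the same index.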
24. Online (Query Time) Feature Extraction
Once the user query is received and parsed, the feature vector is extracted,
which will feed into the model.
Typical ranker features include TF score, IDF score, phrase match score, document recency, popularity score, etc.
Feature extraction is triggered from a custom Solr SearchComponent, which loads all document- and query-related features into a feature vector.
25. Model Execution
Once the features are extracted and loaded into a feature vector, they are passed to the model executor.
public abstract class Model {
    // Scores one document from its extracted feature vector.
    public abstract double run(@NonNull final FeatureContainer features) throws Exception;

    // Default extractor; a model may override this to supply its own.
    public FeatureExtractor getFeatureExtractorForModel() {
        return new FieldLevelExtractorImpl(false);
    }
}
The model may be customized per organization or object type.
After model execution, the document ids and their scores are passed to the Query Front End.
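To make the `Model` contract concrete, here is a minimal Java sketch of one possible subclass: a linear model that returns a weighted sum of named features. The talk does not show `FeatureContainer`, so the version below is a stand-in, and the simplified `Model` omits the annotation and the extractor hook. `LinearModel` and the feature names are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of a concrete model; types here are stand-ins. */
final class LinearModelSketch {

    /** Stand-in for the slide's FeatureContainer, whose real API is not shown. */
    static final class FeatureContainer {
        private final Map<String, Double> values = new HashMap<>();

        FeatureContainer put(String name, double value) {
            values.put(name, value);
            return this;
        }

        /** Missing features default to 0.0 in this sketch. */
        double get(String name) {
            return values.getOrDefault(name, 0.0);
        }
    }

    /** Simplified copy of the slide's contract (annotation and extractor omitted). */
    abstract static class Model {
        abstract double run(FeatureContainer features) throws Exception;
    }

    /** Weighted sum of named features. */
    static final class LinearModel extends Model {
        private final Map<String, Double> weights;

        LinearModel(Map<String, Double> weights) {
            this.weights = weights;
        }

        @Override
        double run(FeatureContainer features) {
            double score = 0.0;
            for (Map.Entry<String, Double> w : weights.entrySet()) {
                score += w.getValue() * features.get(w.getKey());
            }
            return score;
        }
    }

    public static void main(String[] args) {
        LinearModel model = new LinearModel(Map.of("tfIdf", 1.0, "popularity", 0.5));
        FeatureContainer features = new FeatureContainer().put("tfIdf", 2.0).put("popularity", 4.0);
        System.out.println(model.run(features)); // prints 4.0
    }
}
```

A boosted-tree or polynomial model would implement the same `run` method, which is what allows models to be swapped per organization or object type.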
27. Design Considerations and Limitations
DeepRanker as Solr SearchComponent
● Last search component to run in the Solr search query pipeline
Doc Values vs Stored Values
● Stored values are better suited for reading multiple values.
● Doc values are ideal for storing per-document values, with support for primitive types (int, long).
Number of documents a model can run on
Feature extraction/model execution timeouts
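One common way to enforce a model execution timeout in Java is to run the scoring call on an executor and bound the wait on its Future. The sketch below illustrates that general technique. The talk does not describe how DeepRanker handles timeouts, so the fallback score and the names `TimedScoring` and `scoreWithTimeout` are assumptions.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical sketch of time-bounded model execution. */
final class TimedScoring {

    /**
     * Runs the model on the pool and waits at most timeoutMs. On timeout the
     * task is cancelled and fallbackScore (for example, the L0 score) is returned.
     */
    static double scoreWithTimeout(ExecutorService pool, Callable<Double> model,
                                   long timeoutMs, double fallbackScore) throws Exception {
        Future<Double> pending = pool.submit(model);
        try {
            return pending.get(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            pending.cancel(true); // interrupt the worker so the thread is freed
            return fallbackScore;
        }
    }

    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            double fast = scoreWithTimeout(pool, () -> 0.9, 1000, 0.1);
            double slow = scoreWithTimeout(pool, () -> {
                Thread.sleep(5000); // simulates an expensive model
                return 0.9;
            }, 50, 0.1);
            System.out.println(fast + " " + slow);
        } finally {
            pool.shutdownNow();
        }
    }
}
```

Falling back to an earlier tier's score keeps the query responsive when a model runs long, at the cost of a less refined ranking for that request.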
28. Future Plans for Deep Ranker
Find a way to run the deep ranker near the beginning of the pipeline so that it covers the full corpus (integrating the deep ranker models with the Similarity class).
Move the L2 Ranker (document aggregator) out of the application tier into a separate aggregation service and add social signals to this ranker.
Ingest additional signals into the Solr rankers that may not be part of the search index.
29. Summary
Search at Salesforce
● Enterprise search must cater to a variety of use cases and types of data
Deep Ranker in Solr offers solutions
● Easily and dynamically use different models for different situations
● Run on more results than previously possible
● Use the same features across other ranking layers
30. We are Hiring!
ML Engineers
Engineering Managers
Software Engineers
Data Scientists
Join Salesforce Search Cloud
Mining Intent @ Work