Alexis R

Using AI To Provide Insights And
Recommendations From Activity Data
Alexis Roos
Director of Machine Learning
@alexisroos

Agenda
• Introduction
• Activity & AI
• Building Classifiers
• Generating Insights
• Demo
• Relationship Insights
• Wrap-up and Q&A

This presentation may contain forward-looking statements that involve risks, uncertainties, and assumptions. If any such uncertainties materialize or if any of the
assumptions proves incorrect, the results of salesforce.com, inc. could differ materially from the results expressed or implied by the forward-looking statements we
make. All statements other than statements of historical fact could be deemed forward-looking, including any projections of product or service availability, subscriber
growth, earnings, revenues, or other financial items and any statements regarding strategies or plans of management for future operations, statements of belief, any
statements concerning new, planned, or upgraded services or technology developments and customer contracts or use of our services.
The risks and uncertainties referred to above include – but are not limited to – risks associated with developing and delivering new functionality for our service, new
products and services, our new business model, our past operating losses, possible fluctuations in our operating results and rate of growth, interruptions or delays in
our Web hosting, breach of our security measures, the outcome of any litigation, risks associated with completed and any possible mergers and acquisitions, the
immature market in which we operate, our relatively limited operating history, our ability to expand, retain, and motivate our employees and manage our growth, new
releases of our service and successful customer deployment, our limited history reselling non-salesforce.com products, and utilization and selling to larger enterprise
customers. Further information on potential factors that could affect the financial results of salesforce.com, inc. is included in our annual report on Form 10-K for the
most recent fiscal year and in our quarterly report on Form 10-Q for the most recent fiscal quarter. These documents and others containing important disclosures are
available on the SEC Filings section of the Investor Information section of our Web site.
Any unreleased services or features referenced in this or other presentations, press releases or public statements are not currently available and may not be
delivered on time or at all. Customers who purchase our services should make the purchase decisions based upon features that are currently available.
Salesforce.com, inc. assumes no obligation and does not intend to update these forward-looking statements.
Statement under the Private Securities Litigation Reform Act of 1995
Forward-Looking Statement

Doing Well and Doing Good
#1 World’s Most
Innovative Companies
Best Places to Work
for LGBTQ Equality
#1 The World’s Best
Workplaces
#1 Workplace for
Giving Back
#1 Top 50 Companies
that Care
The World’s Most
Innovative Companies
#1 The Future 50

Salesforce Keeps Getting Smarter with Einstein
Guide Marketers
Einstein Engagement Scoring
Einstein Segmentation (pilot)
Einstein Vision for Social
Assist Service Agents
Einstein Bots (pilot)
Einstein Agent (pilot)
Einstein Vision for Field Service (pilot)
Coach Sales Reps
Einstein Forecasting (pilot)
Einstein Lead & Opportunity Scoring
Einstein Activity Capture
Advise Retailers
Einstein Product Recommendations
Einstein Search Dictionaries
Einstein Predictive Sort
Empower Admins & Developers
Einstein Prediction Builder (pilot)
Einstein Vision & Language
Einstein Discovery
Help Community Members
Einstein Answers (pilot)
Community Sentiment (pilot)
Einstein Recommendations
Austin Buchan
CEO, College Forward

Activities are the R in CRM
• Timestamped data
Emails, meetings, tasks, phone calls, etc
• User centric
User is the initiator or owner
• Defines relationship
Who is connected to whom, frequency of touchpoints, reciprocity, meetings
• High volume, high potential, extremely rich
Lots of traffic, content contains lots of information and signals
• Historical context is important
Length of relationship, how the relationship evolves over time
• Can be used for many different use cases
Email Insights, timelines, opportunity scoring, search, etc

Platform challenges
• Large-scale, distributed, real-time streaming platforms are hard!
Unique use case, multiple products and services
• Large volume of activities
Tens of thousands of orgs connected through Inbox and Einstein Activity Capture
• Automatic capture is really important
Maintains high fidelity
• Generate accurate intelligence
Some events are rare
• Speed
Emails must be processed within seconds
• Data privacy and security
Security, auditable access, data retention, privacy, GDPR, etc

Augment CRM experience using AI and activity
Suggest
Action(s)
Email Insights:
Pricing discussed, Executive
involved, Scheduling Requested, etc.
AI Inbox
Timelines
Other Salesforce
Apps
…
Automatic
activity
capture
Extract
Insights
Emails,
meetings,
tasks,
calls, etc
Generate
Context
Reply with price list
Insert free time
Involve Executive
etc
Contextual services:
Recommended connections, Best time
to email, Suggested recipients, etc.

Challenge 1: get to relevant emails
● Right language
● Automated vs non-automated
● Inbound / outbound
● Within or outside the organization
● etc
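A minimal sketch of how such a relevance filter might look. The `Email` record, the header heuristics, and the keep/drop policy (supported language, human-written, crossing the org boundary) are all assumptions for illustration, not the production filter.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Email:
    """Hypothetical, simplified email record."""
    sender: str
    recipients: List[str]
    language: str = "en"
    headers: Dict[str, str] = field(default_factory=dict)


def domain_of(address: str) -> str:
    return address.rsplit("@", 1)[-1].lower()


def is_automated(email: Email) -> bool:
    # Heuristic assumption: bulk or machine-generated mail usually carries
    # one of these headers or comes from a no-reply mailbox.
    headers = {name.lower(): value.lower() for name, value in email.headers.items()}
    if headers.get("auto-submitted", "no") != "no":
        return True
    if "list-unsubscribe" in headers:
        return True
    if headers.get("precedence") in {"bulk", "list", "junk"}:
        return True
    local_part = email.sender.split("@")[0].lower()
    return local_part in {"noreply", "no-reply", "donotreply"}


def is_relevant(email: Email, org_domain: str, languages=("en",)) -> bool:
    # Assumed policy: keep supported-language, human-written emails that
    # involve at least one address outside the organization.
    if email.language not in languages:
        return False
    if is_automated(email):
        return False
    domains = {domain_of(email.sender)} | {domain_of(r) for r in email.recipients}
    return any(d != org_domain for d in domains)
```

Whether internal-only emails should be dropped depends on the insight being generated; the policy above is only one possible choice.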

Challenge 2: structure of an email
INTRO
SIGNATURE
CONFIDENTIALITY NOTICE
REPLY CHAIN
BODY
Hey Alexis,
Let’s meet with Ascander on Friday to discuss
the $10,000/year rate. Ascander’s phone
number is (123) 456-7890.
Thanks,
Noah Bergman
Engineer at Salesforce
(123) 456-7890
The contents of this email and any attachments
are confidential and are intended solely for
addressee…
From: Alexis alexis@salesforce.com
Date: April 1, 2017
Subject: Important Document
Noah, how much does your product cost?
HEADER INFORMATION ...
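The segments labeled on this slide can be separated with simple line-level heuristics. The sketch below is an assumption about one way to do it; the regular expressions are illustrative and would need to be far more robust (or learned) for real mail.

```python
import re

# Illustrative patterns only; real email needs many more variants.
REPLY_START = re.compile(r"^(From:\s|On .+ wrote:$|-{2,}\s*Original Message)", re.IGNORECASE)
NOTICE_START = re.compile(
    r"^(the contents of this email|this (e-?mail|message) .*confidential)", re.IGNORECASE
)
SIGNOFF = re.compile(r"^(thanks|thank you|regards|best|cheers)[,!.]?$", re.IGNORECASE)
GREETING = re.compile(r"^(hi|hey|hello|dear)\b.*[,!]?$", re.IGNORECASE)


def segment_email(text: str) -> dict:
    """Split an email into intro, body, signature, notice and reply chain."""
    segments = {
        "intro": [], "body": [], "signature": [],
        "confidentiality_notice": [], "reply_chain": [],
    }
    current = "body"
    for line in text.splitlines():
        stripped = line.strip()
        if current != "reply_chain":
            if REPLY_START.match(stripped):
                current = "reply_chain"
            elif NOTICE_START.match(stripped):
                current = "confidentiality_notice"
            elif current == "body" and SIGNOFF.match(stripped):
                current = "signature"
            elif (current == "body" and not segments["body"]
                  and not segments["intro"] and GREETING.match(stripped)):
                segments["intro"].append(stripped)
                continue
        if stripped:
            segments[current].append(stripped)
    return {name: "\n".join(lines) for name, lines in segments.items()}


sample = """Hey Alexis,
Let’s meet with Ascander on Friday to discuss
the $10,000/year rate. Ascander’s phone
number is (123) 456-7890.
Thanks,
Noah Bergman
Engineer at Salesforce
(123) 456-7890
The contents of this email and any attachments
are confidential and are intended solely for
addressee…
From: Alexis alexis@salesforce.com
Date: April 1, 2017
Subject: Important Document
Noah, how much does your product cost?"""

segments = segment_email(sample)
```

Only the body segment would then be passed to the classifiers, so that a phone number in a signature or a price in a quoted reply does not trigger an insight.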

Challenge 3: many insights require labeling data
• No labels, and currently no mechanism to infer labels
• Pricing discussions are important, but relatively rare events
How can we get a higher yield of positive labels when labeling by hand?
-> Use filters + Word2Vec
. Train Word2Vec on unlabeled email
. Find words close in distance to “price”, “cost”, “license”, etc
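A sketch of the seed-expansion idea: take a few seed words, find their nearest neighbours in embedding space, and send only emails that mention the expanded vocabulary to the labeling tool. The tiny hand-written vectors below are placeholders for embeddings that would come from Word2Vec trained on unlabeled email; the function names are hypothetical.

```python
import math


def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def expand_seed_terms(embeddings, seeds, top_k=2):
    """Add the top_k nearest non-seed words of every seed word."""
    expanded = set(seeds)
    for seed in seeds:
        if seed not in embeddings:
            continue
        neighbours = sorted(
            (word for word in embeddings if word not in seeds),
            key=lambda word: cosine(embeddings[seed], embeddings[word]),
            reverse=True,
        )
        expanded.update(neighbours[:top_k])
    return expanded


def worth_labeling(email_text, vocabulary):
    """Cheap pre-filter: does the email mention any expanded term?"""
    tokens = {token.strip(".,!?$").lower() for token in email_text.split()}
    return bool(tokens & vocabulary)


# Toy vectors (assumption); real ones come from the trained Word2Vec model.
toy_embeddings = {
    "price": [0.9, 0.1, 0.0],
    "cost": [0.85, 0.2, 0.05],
    "quote": [0.8, 0.15, 0.1],
    "discount": [0.75, 0.3, 0.0],
    "meeting": [0.1, 0.9, 0.2],
    "lunch": [0.0, 0.1, 0.95],
}
pricing_vocabulary = expand_seed_terms(toy_embeddings, {"price"}, top_k=2)
```

Sampling from the emails that pass `worth_labeling` should raise the share of positive examples a human labeler sees, which matters when the target event is rare.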

Data labeling pipeline
Word2Vec
GraphX
Filtering /
Sampling
Labeling tool
Emails
Labeled
Training
Data

Word Embedding (e.g., Word2Vec) for Feature Generation
● Word embeddings for individual tokens
capture their semantics.
● Aggregating word embeddings provides
powerful vectorized representation for a
body of text (e.g., email).
● Aggregated word embeddings are
incorporated as part of the feature vector
used in machine learning model training.
Unsupervised Learning for Better Representation of Text
Word Vectors Machine Learning
Models
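One common way to aggregate word embeddings into a single vector for an email is to average them. The sketch below assumes that choice; the slide does not say which aggregation is used.

```python
def average_embedding(tokens, embeddings, dim):
    """Mean of the vectors of all known tokens; zero vector if none are known."""
    vectors = [embeddings[token] for token in tokens if token in embeddings]
    if not vectors:
        return [0.0] * dim
    return [sum(component) / len(vectors) for component in zip(*vectors)]


# Toy two-dimensional embeddings (assumption, for illustration only).
toy = {"price": [1.0, 0.0], "list": [0.0, 1.0]}
email_vector = average_embedding(["price", "list", "unknownword"], toy, dim=2)
```

Out-of-vocabulary tokens are simply skipped here; weighting each word by its TF-IDF score is a frequent refinement.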

Latent Dirichlet Allocation (LDA)
A document is a probability distribution over topics
Boeing: mixture of topics 4 and 5
Air Force One: mixture of topics 1 and 5
https://databricks.com/blog/2015/09/22/large-scale-topic-modeling-improvements-to-lda-on-apache-spark.html
-> Use entire topic distribution in the feature vector

Generating feature vectors and model training
Feature
Engineering
Model
Training
Model
Evaluation
LDA
Text
Processing
/ TF-IDF
Labeled
Training
Data
Emails
Word2Vec

Scoring pipeline
Email
Stream
Feature
Vector
Generation
Filtering, validation
and
normalization
Scoring
LDA
n-grams
TF-IDF
Word
vectors
. Filters: human, inbound, in-org, etc
. Validation: well formed, language, etc
. Normalization: email signature and confidentiality parsing, tokenization, email, URL, phone, website, etc normalizers
Language, etc
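The normalizers mentioned above can be sketched as regular-expression substitutions that replace specific values with placeholder tokens before tokenization, so the model sees "a phone number" rather than a particular number. The patterns and placeholder names are assumptions for illustration.

```python
import re

# Order matters: email addresses are replaced before URLs and phone numbers.
NORMALIZERS = [
    ("<EMAIL>", re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b")),
    ("<URL>", re.compile(r"\b(?:https?://|www\.)\S+", re.IGNORECASE)),
    ("<PHONE>", re.compile(r"\(?\b\d{3}\)?[\s.-]?\d{3}[\s.-]?\d{4}\b")),
]
TOKEN = re.compile(r"<[A-Z]+>|[A-Za-z0-9']+")


def normalize(text: str) -> str:
    for placeholder, pattern in NORMALIZERS:
        text = pattern.sub(placeholder, text)
    return text


def tokenize(text: str) -> list:
    """Normalize, then split into lowercase tokens, keeping placeholders intact."""
    return [
        token if token.startswith("<") else token.lower()
        for token in TOKEN.findall(normalize(text))
    ]


tokens = tokenize(
    "Call Noah at (123) 456-7890 or noah@example.com, see www.example.com/pricing"
)
```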

Feeding Deep Learning Model with Word Embedding
Training Recurrent Neural Networks with LSTM for State-of-the-Art
● LSTM networks are capable of capturing subtleties in natural human language.
● Using pre-trained word embeddings reduces demand for large quantity of labeled data.
● The combination opens up possibilities of advanced intelligence beyond classification.

How We Generate Insights from Activity
● Each classifier runs and extracts metadata
○ Scheduling: the date you want the meeting
○ Out of Office: the person’s return date
○ Executive Involved: the name and title of the exec
● Assign Actions based on which classifier is true
○ Create event on <Date>
○ Reply with Template
○ View Profile for <Exec>
● Classifier + Extracted Metadata + Actions = Insight
○ Classifier: “Scheduling Requested”
○ Extracted Metadata: “next Tuesday”
○ Actions: Create Event, Send Times, View Calendar
● Do it all in < 2 seconds
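The "Classifier + Extracted Metadata + Actions = Insight" formula can be sketched as a small data model. The class names and the action lookup table are hypothetical; the action lists reuse the examples from this slide.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Illustrative mapping based on the examples on the slide.
ACTIONS_BY_CLASSIFIER: Dict[str, List[str]] = {
    "Scheduling Requested": ["Create Event", "Send Times", "View Calendar"],
    "Executive Involved": ["View Profile"],
}


@dataclass
class ClassifierResult:
    name: str
    is_positive: bool
    metadata: Dict[str, str] = field(default_factory=dict)


@dataclass
class Insight:
    classifier: str
    metadata: Dict[str, str]
    actions: List[str]


def build_insights(results: List[ClassifierResult]) -> List[Insight]:
    """Turn every positive classifier result into an actionable insight."""
    return [
        Insight(r.name, r.metadata, ACTIONS_BY_CLASSIFIER.get(r.name, []))
        for r in results
        if r.is_positive
    ]


insights = build_insights([
    ClassifierResult("Scheduling Requested", True, {"date": "next Tuesday"}),
    ClassifierResult("Pricing Discussed", False),
])
```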

Collect and Filter
● Gather as much activity as possible
● Filter out spam, marketing emails, etc.
Collect Filter

Score and Extract
● The Spark Structured Streaming “Scoring Pipeline” portion you saw earlier
● Identifies which emails contain important moments
● Extracts the relevant metadata
Filtered
Email
Stream
Scheduling
Requested
Pricing
Discussed
etc...
SCHED_REQ
date: 2018-04-04

Make Insights Actionable
● Each Insight Type has actions or “next steps”
● Actions consume the extracted metadata and make it readily available for clients
SCHED_REQ
date: 2018-04-04
Insight
Publisher
(Context-Free)
Scheduling
Requested
● Create Event
● Send Times
● View Calendar

AI & Context
What do all those apps have in common? User context
Data + Algorithms + Compute = Killer Apps
Google

Consumer vs Enterprise Context
User isn’t the product but the customer
• Retention, privacy, GDPR, security, auditing, etc
Context has to be scoped
• Cannot be used globally: organization, team, user levels
Very rich
• Goes way beyond user context: organizations/groups/teams, products and services, companies,
different types of activities across many different products, etc
Very dynamic
• Fast-arriving data with many interaction points

Context enables us to deliver deeper insights.
Go beyond using a single email for classification and action recommendations
This sender looks familiar, how well should I know him / her?
• Are we strongly connected? Is he or she important to my accounts or opportunities? etc
Is this email discussing products or services that my company sells?
Is this email discussing competitors?
Who, in my org, can help me sell to an individual or company?
• Supply relevant background information on a particular individual or company
• Identify who is the key decision maker
• Give me historical information for that individual or company
• Make an introduction for me
etc

A graph is an efficient means for encoding relationships.
An org can have thousands of contacts
• These contacts exist within the org itself (e.g.,
sales rep, account exec)
• Perhaps more importantly, contacts extend
beyond the org (e.g., buyers)
That same org can have millions of
events per week
• Events (e.g., meetings, emails, phone calls)
connect contacts and indicate a relationship
• The number and nature of events between
contacts can indicate strength of connection /
relationship
15 Jan Email - Sylvia to Andrea: introduction
20 Jan Meeting - Created by Andrea with Sylvia
31 Jan Email - Andrea to Sylvia & Mark: info request
01 Feb Email - Sylvia to Andrea & Mark: product info
04 Feb Email - Andrea to Sylvia & Joe
17 Feb Meeting created by Andrea with Alex and Joe
…
Andrea
Buyer
Mark
Evaluator
Alex
Sponsor
Joe
Acct mngr
Sylvia
Sales
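The event list on this slide can be turned into a weighted contact graph. In the sketch below, each event adds weight to the edge between its initiator and every other participant. The per-type weights (a meeting counting more than an email) and the initiator-to-participant rule are assumptions.

```python
from collections import defaultdict

# Assumed weights per event type.
EVENT_WEIGHTS = {"email": 1.0, "meeting": 2.0}

# (date, type, initiator, other participants), taken from the slide.
events = [
    ("15 Jan", "email", "Sylvia", ["Andrea"]),
    ("20 Jan", "meeting", "Andrea", ["Sylvia"]),
    ("31 Jan", "email", "Andrea", ["Sylvia", "Mark"]),
    ("01 Feb", "email", "Sylvia", ["Andrea", "Mark"]),
    ("04 Feb", "email", "Andrea", ["Sylvia", "Joe"]),
    ("17 Feb", "meeting", "Andrea", ["Alex", "Joe"]),
]


def build_contact_graph(events):
    """Return {(contact_a, contact_b): strength} with names sorted in each key."""
    strengths = defaultdict(float)
    for _date, kind, initiator, participants in events:
        for other in participants:
            edge = tuple(sorted((initiator, other)))
            strengths[edge] += EVENT_WEIGHTS[kind]
    return dict(strengths)


contact_graph = build_contact_graph(events)
```

A fuller version would likely also account for recency and reciprocity, both of which the deck names as relationship signals.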

Name: Joe Roos
Email: joe@salesforceuser.com
Title: Account manager
Company: Salesforce user
Recommended Connections:
{(Joe, 7.1), (Sylvia, 5), (…)}
Recommended Connections
Problem: John, a salesperson, needs more
information to polish his strategy with Andrea, an
important lead.
To whom should he turn?
Solution: recommended connections uses the
contact graph to identify Joe as the best person to
turn to for that information. Joe already knows
Andrea and has shared connections.
John
Sales Rep
Andrea
Buyer
Mark
Evaluator
Alex
Sponsor
Joe
Acct
mngr
Sylvia
Sales
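A sketch of how recommended connections could be ranked from the contact graph. The scoring rule (connection strength to the target plus a bonus per shared contact) and the edge weights are hypothetical; this is not the scoring that produced the numbers shown on the slide.

```python
def build_graph(edges):
    """Symmetric adjacency map {node: {neighbour: strength}}."""
    graph = {}
    for a, b, weight in edges:
        graph.setdefault(a, {})[b] = weight
        graph.setdefault(b, {})[a] = weight
    return graph


def recommend_connections(graph, colleagues, target, shared_bonus=0.5):
    """Rank colleagues who already know the target contact."""
    target_contacts = set(graph.get(target, {}))
    scored = []
    for colleague in colleagues:
        strength = graph.get(colleague, {}).get(target, 0.0)
        if strength == 0.0:
            continue  # no existing relationship with the target
        shared = (set(graph.get(colleague, {})) & target_contacts) - {colleague, target}
        scored.append((colleague, strength + shared_bonus * len(shared)))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical edge strengths for the people on the slide.
graph = build_graph([
    ("Joe", "Andrea", 3.0), ("Joe", "Alex", 2.0), ("Joe", "Mark", 1.0),
    ("Sylvia", "Andrea", 2.0), ("Sylvia", "Mark", 1.0),
    ("Andrea", "Mark", 1.0), ("Andrea", "Alex", 2.0),
    ("John", "Joe", 1.0), ("John", "Sylvia", 1.0),
])
recommendations = recommend_connections(graph, colleagues=["Joe", "Sylvia"], target="Andrea")
```

With these toy weights Joe ranks first because he has both the strongest link to Andrea and the most contacts in common with her.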

Coupled with AI models, our graph delivers Contextual Services.
ContextGraph Models
• Pricing discussed
• Scheduling requested
• Exec involved
• etc.
• Identify hot leads
• Best time to email
• Recommend connections
• Updated contact info notification
• Suggest recipients, or rooms, for meetings
• Identify contact’s role: economic buyer, evaluator, influencer, etc.
• Relationship with contact: e.g., strength of connection, communication topics
Who is a particular email from and why should I care?
Role, latest communication, meeting history, mutual
friends, contact info, etc.

Graph and ML/Deep Learning
Context free insights
• Aka pricing discussed vs pricing request: does not require context of products or services
• Allows composition: pricing discussed can be combined with product mention or contact strength
AI insights enrich the graph
• Feedback
DL on graph
• Ex: DeepWalk to find who is influencer, colleagues, similar profiles, etc
DL on Graph: convnets, RNNs
• Challenges: non-Euclidean data, invariance (node ordering), non-fixed (dynamic), directed, etc
Still an area of research
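DeepWalk learns node embeddings by generating truncated random walks over the graph and treating each walk as a sentence for a skip-gram model such as Word2Vec. The sketch below covers only the walk-generation step, on a small hypothetical contact graph; training the embeddings on the resulting walks is left out.

```python
import random


def random_walk(graph, start, length, rng):
    """Uniform random walk of at most `length` nodes starting at `start`."""
    walk = [start]
    while len(walk) < length:
        neighbours = sorted(graph.get(walk[-1], ()))
        if not neighbours:
            break
        walk.append(rng.choice(neighbours))
    return walk


def generate_walks(graph, walks_per_node, length, seed=0):
    rng = random.Random(seed)
    nodes = sorted(graph)
    walks = []
    for _ in range(walks_per_node):
        rng.shuffle(nodes)  # visit start nodes in a fresh order on every pass
        for node in nodes:
            walks.append(random_walk(graph, node, length, rng))
    return walks


# Hypothetical unweighted contact graph as adjacency sets.
contacts = {
    "Andrea": {"Sylvia", "Joe", "Mark", "Alex"},
    "Sylvia": {"Andrea", "Mark"},
    "Joe": {"Andrea", "Alex"},
    "Mark": {"Andrea", "Sylvia"},
    "Alex": {"Andrea", "Joe"},
}
walks = generate_walks(contacts, walks_per_node=3, length=5)
```

Nodes that appear in similar walk contexts end up with similar embeddings, which is what makes the approach usable for finding colleagues or similar profiles.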

Overview of graph architecture
[Diagram labels: Activity Stream; Activity store; S3; bootstrap; persist/load; Store; SAS; REST interface; clients. Stages: Processing in Spark; Delivery of Graph Services]

Takeaways
• Activity data can tremendously reinforce Salesforce applications
• Context changes the meaning of data
• Streaming, batch, ML and Graph are complementary
• Privacy has a wide impact
• Large scale activity processing and near real time

salesforce.com/careers
Alexis Roos
Director of Machine Learning
@alexisroos
