SlideShare a Scribd company logo
1 of 42
Download to read offline
Fast Parallel Similarity Calculations
with FPGA Hardware
Dan McCreary, Distinguished Engineer, Optum
Kumar Deepak, Distinguished Engineer, Xilinx
Graph + AI World 2020
September 29, 2020
 
Talk Description
The foundation of recommendation is finding similar customers and their
purchasing patterns. Yet, if you have 100 million customers it can take
hours to do similarity calculations on just 200 features. However, since
these calculations can be done in parallel, we show that using an FPGA
can allow these calculations to be done in under 30 msec. This session
will show how using TigerGraph User Defined Functions (UDF), similarity
calculations, and therefore product recommendations can be done in
real-time as customers visit your web site.
2
 
Overview
Dan:
• What is graph similarity?
• Why is it critical in recommendation systems?
• Serial vs. Parallel algorithms
• Cosine similarity and graph embeddings
Kumar:
• What is an FPGA?
• FPGA configuration used in our benchmarks
• Calling FPGA from TigerGraph using a User Defined Function
• Benchmark Results
Summary: Both
3
 
About Dan
4
• Distinguished Engineer at Optum Healthcare
(330K employees and 32K technical staff)
• Focused on AI enterprise knowledge graphs
• Help create the worlds largest healthcare
graph
• Coauthor of book "Making Sense of NoSQL”
• Worked at Bell Labs as a VLSI circuit designer
• Worked for Steve Jobs at NeXT
 
Why Are Similarity Calculations Critical?
Similarity is at the foundation of recommendation engines
Recommendation engines power sites like:
• Google – recommend a document
• NetFlix™ – recommend a movie
• Amazon - recommend a product
• Pintrest™ – recommend an interest
• Healthcare – recommend a care path
Recommendations must take into account many factors
including recent searches
To be useful in interactive web sites we set a goal of
response times of under 200 milliseconds
5
Similarity
Recommendation
Next Best Action
 
Real-Time Patient Similarity
• Given a new patient that arrives a clinical setting, how can we quickly find the most similar patients?
• Assumption: we have 10M clinical records of our population of 235 million members
• Can we find the 100 most similar patients in under 200 milliseconds?
6
New Patient
Arrives in ER
Sample Patient Populations (10s of millions)
Which ones are
the most similar
to this patient?
 
Similarity Score – A scaled measure of “alikeness” for a context
• A single scaled dimension of comparison for a given setting or context
• Comparing a patient to itself would have a similarity score of 1.0
• Patients that have few common characteristics would have a score of 0.1
7
.43
.92
.1
 
Graph Representation of Patients – Includes Structure
8
Sample Patient Population (10M)
Target Patient
 
Graph Representation of Patients – Includes Structure
9
Sample Patient Population (10M)
Target Patient
How can I quickly compare these graphs and find the most similar patients?
Similarity score = .5
Similarity score = .9
Similarity score = .8
 
Serial vs. Parallel Graph Algorithms
• One task cannot begin before the prior task is
complete
• Task order is important
• Serial algorithms work well on traditional CPUs
• Many tasks can be done independently
• Task order in not relevent
• Tasks can usually be done faster on GPU or
FPGAs
10
Serial Graph Algorithms Parallel Graph Algorithms
Task Task TaskStart End
Task
Task
Task
Start End
 
The Human Brain Does Both Serial and Parallel Computation
Let’s use our brains as a demo!
The following slide has photos of two people
•One is a famous actor
•The other is a synthetically generated image of a person (generated by a AI
program)
How long will it take for you to recognize which one the famous actor?
11
 
Which photo is the famous actor?
12
 
What just happened?
1. Your visual cortex received the images as electrical signals from your eyes
2. Your brain identified key features of each face from the images - in parallel
3. Your brain sent these features as electrical signals to your memories of people’s faces
4. Your brain compared these features to every memory you have ever had of a person’s face – in
parallel
5. Your brain sent their recognition scores to a control center of your brain
6. Your brain’s speech center vocalized the word “right” – in series
13
Key Questions:
1. How does the brain know to pay attention to specific features of a face?
2. What portions of real-time clinical decision support systems can be done cost effectively in parallel?
Answer: The human brain, comprised of around 84B neurons, does both parallel and series calculations
 
Property-based Similarity Example
14
Used to find the most similar items in a graph by comparing properties and structure
Ideal when you a can compare individual features of an item numerically
Algorithms return a ranking of similarity between a target an a population based on the counts and
weights of properties that are similar
Target
Patient
Population of 10M Patients
 
Vector Similarity
15
Vectors are similar in two dimensional space if they have the same length and direction
Compare all the “x” lengths and the “y” lengths and rank by the sum of the totals of the
difference
y
x
Population Vectors
Target
Vector 1st
last
 
Vector Representation
16
Each item can be represented by a series of “feature” vectors
The numbers are scalers
x=8
y=10
16
16
8
6
4
3
7
9
-4
6
-4
6
-12
-8
Target
Vector
Population
Vectors
Most similar
Least similar
 
Cosine Used in Comparing Direction
17
The dot product of two vectors a and b (sometimes called the inner product, or, since its
result is a scalar, the scalar product) is denoted by a ∙ b and is defined as:
If the lines are exactly in the same direction, then theta is 0, cos(0) = 1
If the lines are 90 or -90 degrees apart they are in orthogonal directions cos(90) = 0
a
b
 
Why Vector Conversion
18
Computers are very good at comparing numeric values
Comparison of vectors is a well studied problem (weighted cosine similarity)
Comparing a target vector to many other vectors (50M) is a class of “Embarrassingly Parallel” type problem that is perfect
for hardware acceleration using FPGAs
1
10
5
8
14
3
6
57
34
15
5
66
Input Target Vector
1
32
5
8
14
3
6
57
34
15
5
66
0
45
5
8
14
3
6
57
34
15
5
66
1
10
5
8
14
3
6
57
34
15
5
66
1
10
5
8
14
3
6
57
34
15
5
66 Top 100 Patient IDs
(Tigergraph 64 bit vertex IDs)
Returned in 100msec
Vector Comparison Hardware Service
…50M…
1234
3467
5546
8234
1423
…100
PIDs…
PIDs
200 32-bit integers
 
Can Machine Learning Tell us What Features are Important?
19
Old way: manually create a program that will extract 200 integers
for each customer that classifies their behavior
• Age
• Gender
• Location
• Responsiveness to e-mail survey
• How proactive are they about their health?
• Likely to recommend your company
• Slow process requiring manual coding of feature extraction
rules
New algorithms such as node2vec use a random walk algorithms
to automatically create the 200 integers that will help use
differentiate patients
Embedding: 200 32-bit integers
 
The Rise of Automatic Feature Engineering
20
Recent years have seen a surge in approaches that
automatically learn to encode graph structure into
low-dimensional embeddings.
The central problem in machine learning on graphs is finding a
way to incorporate information about the structure of the graph
into the machine learning model.
From Representation Learning on Graphs: Methods and Applications by Hamilton (et. al.)
 
Example of Graph Embedding – Encode and Decode
21
From Representation Learning on Graphs: Methods and Applications
 
Cosine Used in Comparing Direction
22
The dot product of two vectors a and b (sometimes called the inner product, or, since its
result is a scalar, the scalar product) is denoted by a ∙ b and is defined as:
If the lines are exactly in the same direction, then theta is 0, cos(0) = 1
If the lines are 90 or -90 degrees apart they are in orthogonal directions cos(90) = 0
a
b
 
The Rise of Automatic Feature Engineering
23
Recent years have seen a surge in approaches that automatically learn
to encode graph structure into low-dimensional embeddings.
The central problem in machine learning on graphs is finding a way to
incorporate information about the structure of the graph into the machine
learning model.
From Representation Learning on Graphs: Methods and Applications by Hamilton et. El.
 
Example of Graph Embedding – Encode and Decode
24
From Representation Learning on Graphs: Methods and Applications by Hamilton et. El.
© Copyright 2020 Xilinx
About Kumar
• Distinguished Engineer at Xilinx Inc.
• Focused on Data Analytics Acceleration
• Architected and co-developed Xilinx Simulator, Vitis Profiler
and Debugger from scratch in prior assignments
• 20+ US patents
• Xilinx Inc:
• Inventor of FPGAs
• Leader in Adaptive Compute Acceleration
• ~4.8K employees, ~3B revenue, ~25 B market cap
© Copyright 2020 Xilinx
• Logic blocks
• Look-up tables – combinatorial logic
• Flip flops – sequential logic
• DSP (Digital Signal Processing)
‒ Pre-adder, Multiplier, Accumulator
‒ And, OR, NOT, NAND, NOR, XOR, XNOR
‒ Pattern Detector
• Writable Memory
• LUTRAM (Look-up table RAM)
• BRAM (Block RAM)
• URAM (Ultra RAM)
• Communication
• I/O, Transceiver, PCIe, Ethernet
• Programmable Interconnect
What is an FPGA (Field Programmable Gate Array)?
Credit: https://towardsdatascience.com/introduction-to-fpga-and-its-architecture-20a62c14421c
LUTs: 1.2 M
Flip-Flops: 2.4M
Writable Memory: 47 MB
DSP Units: 6800
Xilinx VU9P FPGA has:
© Copyright 2020 Xilinx
Configuring an FPGA
‘Unprogrammed’
configuration memory (SRAM cells)
Unconfigured
logic circuit
‘Programmed’
configuration memory (SRAM cells)
‘Configure
d’
logic circuitCredit: ‘Bebop to the Boolean Boogie: An Unconventional Guide to Electronics’
Credit: https://www.researchgate.net/publication/288835032_FPGA_Implementation_of_CORDIC_Processor
SRAM Driving a pass transistors to
make connections
© Copyright 2020 Xilinx
High-Performance FPGA Applications: Think “Parallel”
Data-level parallelism
Processing different blocks of a data set in parallel
Task-level parallelism
Executing different tasks in parallel
Executing different tasks in a pipelined fashion
Instruction-level parallelism
Parallel instructions (superscalar)
Pipelined instructions
Bit-level parallelism
Custom word width
Task CTask B
Task A
Task D
© Copyright 2020 Xilinx
>> 29
Computing Devices
CPU GPU FPGA ASIC
Example AMD EPYC 7702 NVIDIA A100 Xilinx Alveo U50 Google TPU
Architecture Instruction Set Instruction Set Domain Specific Domain Specific
Purpose General Purpose General Purpose Domain Specific Domain Specific
Workload Types Serialized
Workloads
Parallel
Workloads
Any workload Single Workload
Ease of
Programming
Easy Medium Medium No
programmability
Energy Efficiency Low Medium High Very High
© Copyright 2020 Xilinx
Using C, C++ or OpenCL to Program FPGAs
˃ Xilinx pioneered C to FPGA compilation technology (aka “HLS”) in 2011
˃ No need for low-level hardware description languages
˃ FPGAs are “Software Programmable”
loop_main:for(int j=0;j<NUM_SIMGROUPS;j+=2) {
loop_share:for(uint k=0;k<NUM_SIMS;k++) {
loop_parallel:for(int i=0;i<NUM_RNGS;i++) {
mt_rng[i].BOX_MULLER(&num1[i][k],&num2[i][k],ratio4,ratio3);
float payoff1 = expf(num1[i][k])-1.0f;
float payoff2 = expf(num2[i][k])-1.0f;
if(num1[i][k]>0.0f)
pCall1[i][k]+= payoff1;
else
pPut1[i][k]-=payoff1;
if(num2[i][k]>0.0f)
pCall2[i][k]+=payoff2;
else
pPut2[i][k]-=payoff2;
}
}
}
FPGAVitis Compiler (v++)
© Copyright 2020 Xilinx
Software Programmability: FPGA Development in C/C++
PCI
e
x86 CPU
Host
Application
Runtime and Drivers
Acceleration API
FPGA
Accelerated
Functions
DMA Engine
AXI Interfaces
User
Application
Code
Xilinx
Acceleration
Platform
C/C++ code
Synthesizable
C/C++
GCC VITIS
© Copyright 2020 Xilinx
U50 U200 U280U250
Cloud On-premise
Cosine
Similarity
HLS (C++)
TigerGraph
Xilinx Accelerated TigerGraph
>> 32
Vitis core
development kit
compilers
Math, Linear Algebra,
Database, Vision, AI,
Security Libraries
Vitis accelerated
libraries
Vitis drivers & runtime (XRT)
analyzers debuggers
Vitis target platforms
Graph Algorithms
and User Defined
Functions (UDFs)
© Copyright 2020 Xilinx
Benchmark: Similarity for 1.5 million patients
HPE DL385 Gen10+ server
2x AMD EPYC 7742 @ 2.2GHz
128 cores, 256 GB RAM
Xilinx Alveo U50 PCIE Accelerator Card
8GB HBM, 75W
Dataset: Synthetic patient data generated by “Synthea” (https://synthetichealth.github.io/synthea/)
Algorithm: Cosine Similarity (cos theta between property vectors)
Property Vector: 197 int properties for each patient
© Copyright 2020 Xilinx
Cosine Similarity: Accelerated Function
extern "C" void topKCosSim(uint32_t p_n, …, KVResType* p_res) {
…
for (int i = 0; i < SPATIAL_numChannels; i++) {
#pragma HLS UNROLL
patientInfoParse<>(p_n, p_m, l_strXs[i], l_dataX[i], l_normX[i]);
patientInfoParse<>(p_n, p_m, l_strY[i], l_dataY[i], l_id[i], l_normY[i]);
cos<SPATIAL_logParEntries, SPATIAL_macDataType,
SPATIAL_indexType>(p_n, p_m, l_dataX[i], l_dataY[i], l_normX[i], l_normY[i],
l_dis[i]);
addKey<>(p_m, l_id[i], l_dis[i], l_pair[i]);
}
mergeStream<SPATIAL_numChannels>(p_m, l_pair, l_merge);
maxK<SPATIAL_maxK>(p_m * SPATIAL_numChannels, p_k, l_merge, l_res);
stream2mem<>(p_k, l_res, p_res);
}
v++ similarity.xclbin
© Copyright 2020 Xilinx
Cosine Similarity: Host Application
…
for (unsigned int di = 0;di<deviceCount; di++){
KVResType* l_res0_tmp;
KVResType* l_res1_tmp;
posix_memalign((void**)&l_res0_tmp, 4096, l_k * sizeof(KVResType));
memset(l_res0_tmp, 0, l_k * sizeof(KVResType));
posix_memalign((void**)&l_res1_tmp, 4096, l_k * sizeof(KVResType));
memset(l_res1_tmp, 0, l_k * sizeof(KVResType));
l_res0[di]=l_res0_tmp;
l_res1[di]=l_res1_tmp;
xfspatialSendMat(l_res0[di], l_k * sizeof(KVResType), 35, 1, di);
xfspatialSendMat(l_res1[di], l_k * sizeof(KVResType), 32, 0, di);
xfspatialTopKCosSim(appContext->m_vecX0, l_res0[di], l_n, l_m, l_k, 1, di);
xfspatialTopKCosSim(appContext->m_vecX1, l_res1[di], l_n, l_m, l_k, 0, di);
}
xfspatialExecuteKernelAsync(2, deviceCount);
…
Compile (g++) libxilinxsimilarity.so
© Copyright 2020 Xilinx
Integration with TigerGraph
<TG Install Dir>/dev/gdk/gsql/src/QueryUdf/ExprFunctions.hpp:
inline string udf_open_alveo(int mode) { … }
inline bool udf_close_alveo(int mode) { … }
inline bool udf_write_device(int mode) { … }
inline ListAccum<testResults>
udf_cosinesim_ss_alveo(ListAccum<int64_t>& patient_vector,
uint64_t topK) { … }
similarity.gsql:
open_alveo() { …; udf_open_alveo(1); …}
load_alveo() {…; udf_write_device(1); …}
close_alveo() { …; udf_close_alveo(1); …}
cosinesim_ss_alveo(newPatient, topK) { …; udf_cosinesim_ss_alveo(newPatient, topK); …}
libxilinxsimilarity.so (code to manage client requests)
Xilinx Runtime (PCIE driver and card management)
similarity.xclbin
© Copyright 2020 Xilinx
Demo: Similarity for 1.5 million patients
© Copyright 2020 Xilinx
Time (milli-seconds) to get top 100 similar patients
40x faster than CPU
Using one Alveo U50 Using 5 Alveo U50’s
400x faster than CPU
© Copyright 2020 Xilinx
Scaling with number of patients
 
Three General REST Services to Support Similarity
40
Bulk Upload Data: input - millions of vectors; output – success/failure code
Update Vector: input - vertex ID, 198 integers; output – success/failure code
Find Similar: input - vertex ID, 198 integers; output – 100 vertex IDs (64 bits)
Similarity Server
REST GET Results (csv, JSON)
 
Onward to the Hardware Graph!
41
Single Node
Graph
Enterprise
Knowledge
Graph
Hardware
Graph
Data Hubs
Data Lake
Algorithmic
Richness
Scale out
 
Related Use Cases
42
Recommendation Engines for Healthcare
• For any person calling in for a recommended provider or senior living facility, can we find similar
recommendations in the past?
Incident Reporting
• When trouble ticket is reported, what are the most similar problems and what were their solutions?
Errors in Log Files
• When there are are error messages in log files, how can we find similar errors and their solutions?
Learning Content
• Can we recommend learning content for employees that have similar goals?
Schema Mapping
• Automate the process of creating data transformation maps for new data to existing schemas

More Related Content

What's hot

What's hot (19)

Graph+AI for Fin. Services
Graph+AI for Fin. ServicesGraph+AI for Fin. Services
Graph+AI for Fin. Services
 
Graph Gurus Episode 1: Enterprise Graph
Graph Gurus Episode 1: Enterprise GraphGraph Gurus Episode 1: Enterprise Graph
Graph Gurus Episode 1: Enterprise Graph
 
Graph Gurus Episode 27: Using Graph Algorithms for Advanced Analytics Part 2
Graph Gurus Episode 27: Using Graph Algorithms for Advanced Analytics Part 2Graph Gurus Episode 27: Using Graph Algorithms for Advanced Analytics Part 2
Graph Gurus Episode 27: Using Graph Algorithms for Advanced Analytics Part 2
 
Hardware Accelerated Machine Learning Solution for Detecting Fraud and Money ...
Hardware Accelerated Machine Learning Solution for Detecting Fraud and Money ...Hardware Accelerated Machine Learning Solution for Detecting Fraud and Money ...
Hardware Accelerated Machine Learning Solution for Detecting Fraud and Money ...
 
Graph-Based Identity Resolution at Scale
Graph-Based Identity Resolution at ScaleGraph-Based Identity Resolution at Scale
Graph-Based Identity Resolution at Scale
 
Graph Gurus Episode 37: Modeling for Kaggle COVID-19 Dataset
Graph Gurus Episode 37: Modeling for Kaggle COVID-19 DatasetGraph Gurus Episode 37: Modeling for Kaggle COVID-19 Dataset
Graph Gurus Episode 37: Modeling for Kaggle COVID-19 Dataset
 
Using Graph Algorithms for Advanced Analytics - Part 2 Centrality
Using Graph Algorithms for Advanced Analytics - Part 2 CentralityUsing Graph Algorithms for Advanced Analytics - Part 2 Centrality
Using Graph Algorithms for Advanced Analytics - Part 2 Centrality
 
Machine Learning Feature Design with TigerGraph 3.0 No-Code GUI
Machine Learning Feature Design with TigerGraph 3.0 No-Code GUIMachine Learning Feature Design with TigerGraph 3.0 No-Code GUI
Machine Learning Feature Design with TigerGraph 3.0 No-Code GUI
 
Using Graph Algorithms for Advanced Analytics - Part 2 Centrality
Using Graph Algorithms for Advanced Analytics - Part 2 CentralityUsing Graph Algorithms for Advanced Analytics - Part 2 Centrality
Using Graph Algorithms for Advanced Analytics - Part 2 Centrality
 
Graph Gurus 15: Introducing TigerGraph 2.4
Graph Gurus 15: Introducing TigerGraph 2.4 Graph Gurus 15: Introducing TigerGraph 2.4
Graph Gurus 15: Introducing TigerGraph 2.4
 
Using Graph Algorithms for Advanced Analytics - Part 5 Classification
Using Graph Algorithms for Advanced Analytics - Part 5 ClassificationUsing Graph Algorithms for Advanced Analytics - Part 5 Classification
Using Graph Algorithms for Advanced Analytics - Part 5 Classification
 
Graph Gurus Episode 28: In-Database Machine Learning Solution for Real-Time R...
Graph Gurus Episode 28: In-Database Machine Learning Solution for Real-Time R...Graph Gurus Episode 28: In-Database Machine Learning Solution for Real-Time R...
Graph Gurus Episode 28: In-Database Machine Learning Solution for Real-Time R...
 
Better Together: How Graph database enables easy data integration with Spark ...
Better Together: How Graph database enables easy data integration with Spark ...Better Together: How Graph database enables easy data integration with Spark ...
Better Together: How Graph database enables easy data integration with Spark ...
 
Knowledge graphs, meet Deep Learning
Knowledge graphs, meet Deep LearningKnowledge graphs, meet Deep Learning
Knowledge graphs, meet Deep Learning
 
Supply Chain and Logistics Management with Graph & AI
Supply Chain and Logistics Management with Graph & AISupply Chain and Logistics Management with Graph & AI
Supply Chain and Logistics Management with Graph & AI
 
Plume - A Code Property Graph Extraction and Analysis Library
Plume - A Code Property Graph Extraction and Analysis LibraryPlume - A Code Property Graph Extraction and Analysis Library
Plume - A Code Property Graph Extraction and Analysis Library
 
Graphs for AI & ML, Jim Webber, Neo4j
Graphs for AI & ML, Jim Webber, Neo4jGraphs for AI & ML, Jim Webber, Neo4j
Graphs for AI & ML, Jim Webber, Neo4j
 
Graph Gurus Episode 12: Tiger Graph v2.3 Overview
Graph Gurus Episode 12: Tiger Graph v2.3 OverviewGraph Gurus Episode 12: Tiger Graph v2.3 Overview
Graph Gurus Episode 12: Tiger Graph v2.3 Overview
 
Graph Gurus Episode 3: Anti Fraud and AML Part 1
Graph Gurus Episode 3: Anti Fraud and AML Part 1Graph Gurus Episode 3: Anti Fraud and AML Part 1
Graph Gurus Episode 3: Anti Fraud and AML Part 1
 

Similar to Fast Parallel Similarity Calculations with FPGA Hardware

Questions On The Equation For Regression
Questions On The Equation For RegressionQuestions On The Equation For Regression
Questions On The Equation For Regression
Tiffany Sandoval
 
Design Patterns for Machine Learning in Production - Sergei Izrailev, Chief D...
Design Patterns for Machine Learning in Production - Sergei Izrailev, Chief D...Design Patterns for Machine Learning in Production - Sergei Izrailev, Chief D...
Design Patterns for Machine Learning in Production - Sergei Izrailev, Chief D...
Sri Ambati
 
ICLR 2020 Recap
ICLR 2020 RecapICLR 2020 Recap
ICLR 2020 Recap
Sri Ambati
 

Similar to Fast Parallel Similarity Calculations with FPGA Hardware (20)

Meetup 21/9/2017 - Image Recogonition: onmisbaar voor een slimme stad?
Meetup 21/9/2017 - Image Recogonition: onmisbaar voor een slimme stad?Meetup 21/9/2017 - Image Recogonition: onmisbaar voor een slimme stad?
Meetup 21/9/2017 - Image Recogonition: onmisbaar voor een slimme stad?
 
Automated Testing of Autonomous Driving Assistance Systems
Automated Testing of Autonomous Driving Assistance SystemsAutomated Testing of Autonomous Driving Assistance Systems
Automated Testing of Autonomous Driving Assistance Systems
 
Qualcomm Webinar: Solving Unsolvable Combinatorial Problems with AI
Qualcomm Webinar: Solving Unsolvable Combinatorial Problems with AIQualcomm Webinar: Solving Unsolvable Combinatorial Problems with AI
Qualcomm Webinar: Solving Unsolvable Combinatorial Problems with AI
 
Keynote at IWLS 2017
Keynote at IWLS 2017Keynote at IWLS 2017
Keynote at IWLS 2017
 
Era ofdataeconomyv4short
Era ofdataeconomyv4shortEra ofdataeconomyv4short
Era ofdataeconomyv4short
 
1440 track 2 boire_using our laptop
1440 track 2 boire_using our laptop1440 track 2 boire_using our laptop
1440 track 2 boire_using our laptop
 
AI for Software Engineering
AI for Software EngineeringAI for Software Engineering
AI for Software Engineering
 
WSO2 Machine Learner - Product Overview
WSO2 Machine Learner - Product OverviewWSO2 Machine Learner - Product Overview
WSO2 Machine Learner - Product Overview
 
Questions On The Equation For Regression
Questions On The Equation For RegressionQuestions On The Equation For Regression
Questions On The Equation For Regression
 
Ai in finance
Ai in financeAi in finance
Ai in finance
 
Image Analytics for Retail
Image Analytics for RetailImage Analytics for Retail
Image Analytics for Retail
 
BMDSE v1 - Data Scientist Deck
BMDSE v1 - Data Scientist DeckBMDSE v1 - Data Scientist Deck
BMDSE v1 - Data Scientist Deck
 
Design Patterns for Machine Learning in Production - Sergei Izrailev, Chief D...
Design Patterns for Machine Learning in Production - Sergei Izrailev, Chief D...Design Patterns for Machine Learning in Production - Sergei Izrailev, Chief D...
Design Patterns for Machine Learning in Production - Sergei Izrailev, Chief D...
 
Makine Öğrenmesi, Yapay Zeka ve Veri Bilimi Süreçlerinin Otomatikleştirilmesi...
Makine Öğrenmesi, Yapay Zeka ve Veri Bilimi Süreçlerinin Otomatikleştirilmesi...Makine Öğrenmesi, Yapay Zeka ve Veri Bilimi Süreçlerinin Otomatikleştirilmesi...
Makine Öğrenmesi, Yapay Zeka ve Veri Bilimi Süreçlerinin Otomatikleştirilmesi...
 
MODEL-DRIVEN ENGINEERING (MDE) in Practice
MODEL-DRIVEN ENGINEERING (MDE) in PracticeMODEL-DRIVEN ENGINEERING (MDE) in Practice
MODEL-DRIVEN ENGINEERING (MDE) in Practice
 
Analytics in Online Retail
Analytics in Online RetailAnalytics in Online Retail
Analytics in Online Retail
 
AI In Actuarial Science
AI In Actuarial ScienceAI In Actuarial Science
AI In Actuarial Science
 
Big Data Spain 2018: How to build Weighted XGBoost ML model for Imbalance dat...
Big Data Spain 2018: How to build Weighted XGBoost ML model for Imbalance dat...Big Data Spain 2018: How to build Weighted XGBoost ML model for Imbalance dat...
Big Data Spain 2018: How to build Weighted XGBoost ML model for Imbalance dat...
 
ICLR 2020 Recap
ICLR 2020 RecapICLR 2020 Recap
ICLR 2020 Recap
 
DSDT meetup July 2021
DSDT meetup July 2021DSDT meetup July 2021
DSDT meetup July 2021
 

More from TigerGraph

More from TigerGraph (20)

MAXIMIZING THE VALUE OF SCIENTIFIC INFORMATION TO ACCELERATE INNOVATION
MAXIMIZING THE VALUE OF SCIENTIFIC INFORMATION TO ACCELERATE INNOVATIONMAXIMIZING THE VALUE OF SCIENTIFIC INFORMATION TO ACCELERATE INNOVATION
MAXIMIZING THE VALUE OF SCIENTIFIC INFORMATION TO ACCELERATE INNOVATION
 
Building an accurate understanding of consumers based on real-world signals
Building an accurate understanding of consumers based on real-world signalsBuilding an accurate understanding of consumers based on real-world signals
Building an accurate understanding of consumers based on real-world signals
 
Care Intervention Assistant - Omaha Clinical Data Information System
Care Intervention Assistant - Omaha Clinical Data Information SystemCare Intervention Assistant - Omaha Clinical Data Information System
Care Intervention Assistant - Omaha Clinical Data Information System
 
Correspondent Banking Networks
Correspondent Banking NetworksCorrespondent Banking Networks
Correspondent Banking Networks
 
Delivering Large Scale Real-time Graph Analytics with Dell Infrastructure and...
Delivering Large Scale Real-time Graph Analytics with Dell Infrastructure and...Delivering Large Scale Real-time Graph Analytics with Dell Infrastructure and...
Delivering Large Scale Real-time Graph Analytics with Dell Infrastructure and...
 
Deploying an End-to-End TigerGraph Enterprise Architecture using Kafka, Maria...
Deploying an End-to-End TigerGraph Enterprise Architecture using Kafka, Maria...Deploying an End-to-End TigerGraph Enterprise Architecture using Kafka, Maria...
Deploying an End-to-End TigerGraph Enterprise Architecture using Kafka, Maria...
 
Fraud Detection and Compliance with Graph Learning
Fraud Detection and Compliance with Graph LearningFraud Detection and Compliance with Graph Learning
Fraud Detection and Compliance with Graph Learning
 
Fraudulent credit card cash-out detection On Graphs
Fraudulent credit card cash-out detection On GraphsFraudulent credit card cash-out detection On Graphs
Fraudulent credit card cash-out detection On Graphs
 
FROM DATAFRAMES TO GRAPH Data Science with pyTigerGraph
FROM DATAFRAMES TO GRAPH Data Science with pyTigerGraphFROM DATAFRAMES TO GRAPH Data Science with pyTigerGraph
FROM DATAFRAMES TO GRAPH Data Science with pyTigerGraph
 
Customer Experience Management
Customer Experience ManagementCustomer Experience Management
Customer Experience Management
 
Davraz - A graph visualization and exploration software.
Davraz - A graph visualization and exploration software.Davraz - A graph visualization and exploration software.
Davraz - A graph visualization and exploration software.
 
TigerGraph.js
TigerGraph.jsTigerGraph.js
TigerGraph.js
 
GRAPHS FOR THE FUTURE ENERGY SYSTEMS
GRAPHS FOR THE FUTURE ENERGY SYSTEMSGRAPHS FOR THE FUTURE ENERGY SYSTEMS
GRAPHS FOR THE FUTURE ENERGY SYSTEMS
 
How to Build An AI Based Customer Data Platform: Learn the design patterns fo...
How to Build An AI Based Customer Data Platform: Learn the design patterns fo...How to Build An AI Based Customer Data Platform: Learn the design patterns fo...
How to Build An AI Based Customer Data Platform: Learn the design patterns fo...
 
Recommendation Engine with In-Database Machine Learning
Recommendation Engine with In-Database Machine LearningRecommendation Engine with In-Database Machine Learning
Recommendation Engine with In-Database Machine Learning
 
The key to creating a Golden Thread: the power of Graph Databases for Entity ...
The key to creating a Golden Thread: the power of Graph Databases for Entity ...The key to creating a Golden Thread: the power of Graph Databases for Entity ...
The key to creating a Golden Thread: the power of Graph Databases for Entity ...
 
Training Graph Convolutional Neural Networks in Graph Database
Training Graph Convolutional Neural Networks in Graph DatabaseTraining Graph Convolutional Neural Networks in Graph Database
Training Graph Convolutional Neural Networks in Graph Database
 
Fraud prevention is better with TigerGraph inside
Fraud prevention is better with  TigerGraph insideFraud prevention is better with  TigerGraph inside
Fraud prevention is better with TigerGraph inside
 
Deep Link Analytics Empowered by AI + Graph + Verticals
Deep Link Analytics Empowered by AI + Graph + VerticalsDeep Link Analytics Empowered by AI + Graph + Verticals
Deep Link Analytics Empowered by AI + Graph + Verticals
 
Graph + AI World Opening Keynote
Graph + AI World Opening KeynoteGraph + AI World Opening Keynote
Graph + AI World Opening Keynote
 

Recently uploaded

Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
amitlee9823
 
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts ServiceCall Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
9953056974 Low Rate Call Girls In Saket, Delhi NCR
 
Abortion pills in Jeddah | +966572737505 | Get Cytotec
Abortion pills in Jeddah | +966572737505 | Get CytotecAbortion pills in Jeddah | +966572737505 | Get Cytotec
Abortion pills in Jeddah | +966572737505 | Get Cytotec
Abortion pills in Riyadh +966572737505 get cytotec
 
Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...
only4webmaster01
 
👉 Amritsar Call Girl 👉📞 6367187148 👉📞 Just📲 Call Ruhi Call Girl Phone No Amri...
👉 Amritsar Call Girl 👉📞 6367187148 👉📞 Just📲 Call Ruhi Call Girl Phone No Amri...👉 Amritsar Call Girl 👉📞 6367187148 👉📞 Just📲 Call Ruhi Call Girl Phone No Amri...
👉 Amritsar Call Girl 👉📞 6367187148 👉📞 Just📲 Call Ruhi Call Girl Phone No Amri...
karishmasinghjnh
 
➥🔝 7737669865 🔝▻ malwa Call-girls in Women Seeking Men 🔝malwa🔝 Escorts Ser...
➥🔝 7737669865 🔝▻ malwa Call-girls in Women Seeking Men  🔝malwa🔝   Escorts Ser...➥🔝 7737669865 🔝▻ malwa Call-girls in Women Seeking Men  🔝malwa🔝   Escorts Ser...
➥🔝 7737669865 🔝▻ malwa Call-girls in Women Seeking Men 🔝malwa🔝 Escorts Ser...
amitlee9823
 
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
amitlee9823
 
Call Girls In Bellandur ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Bellandur ☎ 7737669865 🥵 Book Your One night StandCall Girls In Bellandur ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Bellandur ☎ 7737669865 🥵 Book Your One night Stand
amitlee9823
 
Call Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service Bangalore
Call Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service BangaloreCall Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service Bangalore
Call Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service Bangalore
amitlee9823
 
➥🔝 7737669865 🔝▻ Bangalore Call-girls in Women Seeking Men 🔝Bangalore🔝 Esc...
➥🔝 7737669865 🔝▻ Bangalore Call-girls in Women Seeking Men  🔝Bangalore🔝   Esc...➥🔝 7737669865 🔝▻ Bangalore Call-girls in Women Seeking Men  🔝Bangalore🔝   Esc...
➥🔝 7737669865 🔝▻ Bangalore Call-girls in Women Seeking Men 🔝Bangalore🔝 Esc...
amitlee9823
 
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
amitlee9823
 
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
amitlee9823
 
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...
amitlee9823
 

Recently uploaded (20)

Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
 
Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% SecureCall me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
 
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts ServiceCall Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
 
Abortion pills in Jeddah | +966572737505 | Get Cytotec
Abortion pills in Jeddah | +966572737505 | Get CytotecAbortion pills in Jeddah | +966572737505 | Get Cytotec
Abortion pills in Jeddah | +966572737505 | Get Cytotec
 
Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...
 
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
 
👉 Amritsar Call Girl 👉📞 6367187148 👉📞 Just📲 Call Ruhi Call Girl Phone No Amri...
👉 Amritsar Call Girl 👉📞 6367187148 👉📞 Just📲 Call Ruhi Call Girl Phone No Amri...👉 Amritsar Call Girl 👉📞 6367187148 👉📞 Just📲 Call Ruhi Call Girl Phone No Amri...
👉 Amritsar Call Girl 👉📞 6367187148 👉📞 Just📲 Call Ruhi Call Girl Phone No Amri...
 
Detecting Credit Card Fraud: A Machine Learning Approach
Detecting Credit Card Fraud: A Machine Learning ApproachDetecting Credit Card Fraud: A Machine Learning Approach
Detecting Credit Card Fraud: A Machine Learning Approach
 
➥🔝 7737669865 🔝▻ malwa Call-girls in Women Seeking Men 🔝malwa🔝 Escorts Ser...
➥🔝 7737669865 🔝▻ malwa Call-girls in Women Seeking Men  🔝malwa🔝   Escorts Ser...➥🔝 7737669865 🔝▻ malwa Call-girls in Women Seeking Men  🔝malwa🔝   Escorts Ser...
➥🔝 7737669865 🔝▻ malwa Call-girls in Women Seeking Men 🔝malwa🔝 Escorts Ser...
 
(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7
(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7
(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7
 
hybrid Seed Production In Chilli & Capsicum.pptx
hybrid Seed Production In Chilli & Capsicum.pptxhybrid Seed Production In Chilli & Capsicum.pptx
hybrid Seed Production In Chilli & Capsicum.pptx
 
Capstone Project on IBM Data Analytics Program
Capstone Project on IBM Data Analytics ProgramCapstone Project on IBM Data Analytics Program
Capstone Project on IBM Data Analytics Program
 
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
 
Call Girls In Bellandur ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Bellandur ☎ 7737669865 🥵 Book Your One night StandCall Girls In Bellandur ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Bellandur ☎ 7737669865 🥵 Book Your One night Stand
 
Call Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service Bangalore
Call Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service BangaloreCall Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service Bangalore
Call Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service Bangalore
 
➥🔝 7737669865 🔝▻ Bangalore Call-girls in Women Seeking Men 🔝Bangalore🔝 Esc...
➥🔝 7737669865 🔝▻ Bangalore Call-girls in Women Seeking Men  🔝Bangalore🔝   Esc...➥🔝 7737669865 🔝▻ Bangalore Call-girls in Women Seeking Men  🔝Bangalore🔝   Esc...
➥🔝 7737669865 🔝▻ Bangalore Call-girls in Women Seeking Men 🔝Bangalore🔝 Esc...
 
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
 
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
 
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
 
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...
 

Fast Parallel Similarity Calculations with FPGA Hardware

  • 1. Fast Parallel Similarity Calculations with FPGA Hardware Dan McCreary, Distinguished Engineer, Optum Kumar Deepak, Distinguished Engineer, Xilinx Graph + AI World 2020 September 29, 2020
  • 2.   Talk Description The foundation of recommendation is finding similar customers and their purchasing patterns. Yet, if you have 100 million customers it can take hours to do similarity calculations on just 200 features. However, since these calculations can be done in parallel, we show that using an FPGA can allow these calculations to be done in under 30 msec. This session will show how using TigerGraph User Defined Functions (UDF), similarity calculations, and therefore product recommendations can be done in real-time as customers visit your web site. 2
  • 3.   Overview Dan: • What is graph similarity? • Why is it critical in recommendation systems? • Serial vs. Parallel algorithms • Cosine similarity and graph embeddings Kumar: • What is an FPGA? • FPGA configuration used in our benchmarks • Calling FPGA from TigerGraph using a User Defined Function • Benchmark Results Summary: Both 3
  • 4.   About Dan 4 • Distinguished Engineer at Optum Healthcare (330K employees and 32K technical staff) • Focused on AI enterprise knowledge graphs • Help create the worlds largest healthcare graph • Coauthor of book "Making Sense of NoSQL” • Worked at Bell Labs as a VLSI circuit designer • Worked for Steve Jobs at NeXT
  • 5.   Why Are Similarity Calculations Critical? Similarity is at the foundation of recommendation engines Recommendation engines power sites like: • Google – recommend a document • NetFlix™ – recommend a movie • Amazon - recommend a product • Pintrest™ – recommend an interest • Healthcare – recommend a care path Recommendations must take into account many factors including recent searches To be useful in interactive web sites we set a goal of response times of under 200 milliseconds 5 Similarity Recommendation Next Best Action
  • 6.   Real-Time Patient Similarity • Given a new patient that arrives a clinical setting, how can we quickly find the most similar patients? • Assumption: we have 10M clinical records of our population of 235 million members • Can we find the 100 most similar patients in under 200 milliseconds? 6 New Patient Arrives in ER Sample Patient Populations (10s of millions) Which ones are the most similar to this patient?
  • 7.   Similarity Score – A scaled measure of “alikeness” for a context • A single scaled dimension of comparison for a given setting or context • Comparing a patient to itself would have a similarity score of 1.0 • Patients that have few common characteristics would have a score of 0.1 7 .43 .92 .1
  • 8.   Graph Representation of Patients – Includes Structure 8 Sample Patient Population (10M) Target Patient
  • 9.   Graph Representation of Patients – Includes Structure 9 Sample Patient Population (10M) Target Patient How can I quickly compare these graphs and find the most similar patients? Similarity score = .5 Similarity score = .9 Similarity score = .8
  • 10.   Serial vs. Parallel Graph Algorithms • One task cannot begin before the prior task is complete • Task order is important • Serial algorithms work well on traditional CPUs • Many tasks can be done independently • Task order in not relevent • Tasks can usually be done faster on GPU or FPGAs 10 Serial Graph Algorithms Parallel Graph Algorithms Task Task TaskStart End Task Task Task Start End
  • 11.   The Human Brain Does Both Serial and Parallel Computation Let’s use our brains as a demo! The following slide has photos of two people •One is a famous actor •The other is a synthetically generated image of a person (generated by a AI program) How long will it take for you to recognize which one the famous actor? 11
  • 12.   Which photo is the famous actor? 12
  • 13.   What just happened? 1. Your visual cortex received the images as electrical signals from your eyes 2. Your brain identified key features of each face from the images - in parallel 3. Your brain sent these features as electrical signals to your memories of people’s faces 4. Your brain compared these features to every memory you have ever had of a person’s face – in parallel 5. Your brain sent their recognition scores to a control center of your brain 6. Your brain’s speech center vocalized the word “right” – in series 13 Key Questions: 1. How does the brain know to pay attention to specific features of a face? 2. What portions of real-time clinical decision support systems can be done cost effectively in parallel? Answer: The human brain, comprised of around 84B neurons, does both parallel and series calculations
  • 14.   Property-based Similarity Example 14 Used to find the most similar items in a graph by comparing properties and structure Ideal when you a can compare individual features of an item numerically Algorithms return a ranking of similarity between a target an a population based on the counts and weights of properties that are similar Target Patient Population of 10M Patients
  • 15.   Vector Similarity 15 Vectors are similar in two dimensional space if they have the same length and direction Compare all the “x” lengths and the “y” lengths and rank by the sum of the totals of the difference y x Population Vectors Target Vector 1st last
  • 16.   Vector Representation 16 Each item can be represented by a series of “feature” vectors The numbers are scalers x=8 y=10 16 16 8 6 4 3 7 9 -4 6 -4 6 -12 -8 Target Vector Population Vectors Most similar Least similar
  • 17.   Cosine Used in Comparing Direction 17 The dot product of two vectors a and b (sometimes called the inner product, or, since its result is a scalar, the scalar product) is denoted by a ∙ b and is defined as: If the lines are exactly in the same direction, then theta is 0, cos(0) = 1 If the lines are 90 or -90 degrees apart they are in orthogonal directions cos(90) = 0 a b
  • 18.   Why Vector Conversion 18 Computers are very good at comparing numeric values Comparison of vectors is a well studied problem (weighted cosine similarity) Comparing a target vector to many other vectors (50M) is a class of “Embarrassingly Parallel” type problem that is perfect for hardware acceleration using FPGAs 1 10 5 8 14 3 6 57 34 15 5 66 Input Target Vector 1 32 5 8 14 3 6 57 34 15 5 66 0 45 5 8 14 3 6 57 34 15 5 66 1 10 5 8 14 3 6 57 34 15 5 66 1 10 5 8 14 3 6 57 34 15 5 66 Top 100 Patient IDs (Tigergraph 64 bit vertex IDs) Returned in 100msec Vector Comparison Hardware Service …50M… 1234 3467 5546 8234 1423 …100 PIDs… PIDs 200 32-bit integers
  • 19.   Can Machine Learning Tell us What Features are Important? 19 Old way: manually create a program that will extract 200 integers for each customer that classifies their behavior • Age • Gender • Location • Responsiveness to e-mail survey • How proactive are they about their health? • Likely to recommend your company • Slow process requiring manual coding of feature extraction rules New algorithms such as node2vec use a random walk algorithms to automatically create the 200 integers that will help use differentiate patients Embedding: 200 32-bit integers
  • 20.   The Rise of Automatic Feature Engineering 20 Recent years have seen a surge in approaches that automatically learn to encode graph structure into low-dimensional embeddings. The central problem in machine learning on graphs is finding a way to incorporate information about the structure of the graph into the machine learning model. From Representation Learning on Graphs: Methods and Applications by Hamilton (et. al.)
  • 21.   Example of Graph Embedding – Encode and Decode 21 From Representation Learning on Graphs: Methods and Applications
  • 22.   Cosine Used in Comparing Direction 22 The dot product of two vectors a and b (sometimes called the inner product, or, since its result is a scalar, the scalar product) is denoted by a ∙ b and is defined as: If the lines are exactly in the same direction, then theta is 0, cos(0) = 1 If the lines are 90 or -90 degrees apart they are in orthogonal directions cos(90) = 0 a b
  • 23.   The Rise of Automatic Feature Engineering 23 Recent years have seen a surge in approaches that automatically learn to encode graph structure into low-dimensional embeddings. The central problem in machine learning on graphs is finding a way to incorporate information about the structure of the graph into the machine learning model. From Representation Learning on Graphs: Methods and Applications by Hamilton et. El.
  • 24.   Example of Graph Embedding – Encode and Decode 24 From Representation Learning on Graphs: Methods and Applications by Hamilton et. El.
  • 25. © Copyright 2020 Xilinx About Kumar • Distinguished Engineer at Xilinx Inc. • Focused on Data Analytics Acceleration • Architected and co-developed Xilinx Simulator, Vitis Profiler and Debugger from scratch in prior assignments • 20+ US patents • Xilinx Inc: • Inventor of FPGAs • Leader in Adaptive Compute Acceleration • ~4.8K employees, ~3B revenue, ~25 B market cap
  • 26. © Copyright 2020 Xilinx • Logic blocks • Look-up tables – combinatorial logic • Flip flops – sequential logic • DSP (Digital Signal Processing) ‒ Pre-adder, Multiplier, Accumulator ‒ And, OR, NOT, NAND, NOR, XOR, XNOR ‒ Pattern Detector • Writable Memory • LUTRAM (Look-up table RAM) • BRAM (Block RAM) • URAM (Ultra RAM) • Communication • I/O, Transceiver, PCIe, Ethernet • Programmable Interconnect What is an FPGA (Field Programmable Gate Array)? Credit: https://towardsdatascience.com/introduction-to-fpga-and-its-architecture-20a62c14421c LUTs: 1.2 M Flip-Flops: 2.4M Writable Memory: 47 MB DSP Units: 6800 Xilinx VU9P FPGA has:
  • 27. © Copyright 2020 Xilinx Configuring an FPGA ‘Unprogrammed’ configuration memory (SRAM cells) Unconfigured logic circuit ‘Programmed’ configuration memory (SRAM cells) ‘Configure d’ logic circuitCredit: ‘Bebop to the Boolean Boogie: An Unconventional Guide to Electronics’ Credit: https://www.researchgate.net/publication/288835032_FPGA_Implementation_of_CORDIC_Processor SRAM Driving a pass transistors to make connections
  • 28. © Copyright 2020 Xilinx High-Performance FPGA Applications: Think “Parallel” Data-level parallelism Processing different blocks of a data set in parallel Task-level parallelism Executing different tasks in parallel Executing different tasks in a pipelined fashion Instruction-level parallelism Parallel instructions (superscalar) Pipelined instructions Bit-level parallelism Custom word width Task CTask B Task A Task D
  • 29. © Copyright 2020 Xilinx >> 29 Computing Devices CPU GPU FPGA ASIC Example AMD EPYC 7702 NVIDIA A100 Xilinx Alveo U50 Google TPU Architecture Instruction Set Instruction Set Domain Specific Domain Specific Purpose General Purpose General Purpose Domain Specific Domain Specific Workload Types Serialized Workloads Parallel Workloads Any workload Single Workload Ease of Programming Easy Medium Medium No programmability Energy Efficiency Low Medium High Very High
  • 30. © Copyright 2020 Xilinx Using C, C++ or OpenCL to Program FPGAs ˃ Xilinx pioneered C to FPGA compilation technology (aka “HLS”) in 2011 ˃ No need for low-level hardware description languages ˃ FPGAs are “Software Programmable” loop_main:for(int j=0;j<NUM_SIMGROUPS;j+=2) { loop_share:for(uint k=0;k<NUM_SIMS;k++) { loop_parallel:for(int i=0;i<NUM_RNGS;i++) { mt_rng[i].BOX_MULLER(&num1[i][k],&num2[i][k],ratio4,ratio3); float payoff1 = expf(num1[i][k])-1.0f; float payoff2 = expf(num2[i][k])-1.0f; if(num1[i][k]>0.0f) pCall1[i][k]+= payoff1; else pPut1[i][k]-=payoff1; if(num2[i][k]>0.0f) pCall2[i][k]+=payoff2; else pPut2[i][k]-=payoff2; } } } FPGAVitis Compiler (v++)
  • 31. © Copyright 2020 Xilinx Software Programmability: FPGA Development in C/C++ PCI e x86 CPU Host Application Runtime and Drivers Acceleration API FPGA Accelerated Functions DMA Engine AXI Interfaces User Application Code Xilinx Acceleration Platform C/C++ code Synthesizable C/C++ GCC VITIS
  • 32. © Copyright 2020 Xilinx U50 U200 U280U250 Cloud On-premise Cosine Similarity HLS (C++) TigerGraph Xilinx Accelerated TigerGraph >> 32 Vitis core development kit compilers Math, Linear Algebra, Database, Vision, AI, Security Libraries Vitis accelerated libraries Vitis drivers & runtime (XRT) analyzers debuggers Vitis target platforms Graph Algorithms and User Defined Functions (UDFs)
  • 33. © Copyright 2020 Xilinx Benchmark: Similarity for 1.5 million patients HPE DL385 Gen10+ server 2x AMD EPYC 7742 @ 2.2GHz 128 cores, 256 GB RAM Xilinx Alveo U50 PCIE Accelerator Card 8GB HBM, 75W Dataset: Synthetic patient data generated by “Synthea” (https://synthetichealth.github.io/synthea/) Algorithm: Cosine Similarity (cos theta between property vectors) Property Vector: 197 int properties for each patient
  • 34. © Copyright 2020 Xilinx Cosine Similarity: Accelerated Function extern "C" void topKCosSim(uint32_t p_n, …, KVResType* p_res) { … for (int i = 0; i < SPATIAL_numChannels; i++) { #pragma HLS UNROLL patientInfoParse<>(p_n, p_m, l_strXs[i], l_dataX[i], l_normX[i]); patientInfoParse<>(p_n, p_m, l_strY[i], l_dataY[i], l_id[i], l_normY[i]); cos<SPATIAL_logParEntries, SPATIAL_macDataType, SPATIAL_indexType>(p_n, p_m, l_dataX[i], l_dataY[i], l_normX[i], l_normY[i], l_dis[i]); addKey<>(p_m, l_id[i], l_dis[i], l_pair[i]); } mergeStream<SPATIAL_numChannels>(p_m, l_pair, l_merge); maxK<SPATIAL_maxK>(p_m * SPATIAL_numChannels, p_k, l_merge, l_res); stream2mem<>(p_k, l_res, p_res); } v++ similarity.xclbin
  • 35. © Copyright 2020 Xilinx Cosine Similarity: Host Application … for (unsigned int di = 0;di<deviceCount; di++){ KVResType* l_res0_tmp; KVResType* l_res1_tmp; posix_memalign((void**)&l_res0_tmp, 4096, l_k * sizeof(KVResType)); memset(l_res0_tmp, 0, l_k * sizeof(KVResType)); posix_memalign((void**)&l_res1_tmp, 4096, l_k * sizeof(KVResType)); memset(l_res1_tmp, 0, l_k * sizeof(KVResType)); l_res0[di]=l_res0_tmp; l_res1[di]=l_res1_tmp; xfspatialSendMat(l_res0[di], l_k * sizeof(KVResType), 35, 1, di); xfspatialSendMat(l_res1[di], l_k * sizeof(KVResType), 32, 0, di); xfspatialTopKCosSim(appContext->m_vecX0, l_res0[di], l_n, l_m, l_k, 1, di); xfspatialTopKCosSim(appContext->m_vecX1, l_res1[di], l_n, l_m, l_k, 0, di); } xfspatialExecuteKernelAsync(2, deviceCount); … Compile (g++) libxilinxsimilarity.so
  • 36. © Copyright 2020 Xilinx Integration with TigerGraph <TG Install Dir>/dev/gdk/gsql/src/QueryUdf/ExprFunctions.hpp: inline string udf_open_alveo(int mode) { … } inline bool udf_close_alveo(int mode) { … } inline bool udf_write_device(int mode) { … } inline ListAccum<testResults> udf_cosinesim_ss_alveo(ListAccum<int64_t>& patient_vector, uint64_t topK) { … } similarity.gsql: open_alveo() { …; udf_open_alveo(1); …} load_alveo() {…; udf_write_device(1); …} close_alveo() { …; udf_close_alveo(1); …} cosinesim_ss_alveo(newPatient, topK) { …; udf_cosinesim_ss_alveo(newPatient, topK); …} libxilinxsimilarity.so (code to manage client requests) Xilinx Runtime (PCIE driver and card management) similarity.xclbin
  • 37. © Copyright 2020 Xilinx Demo: Similarity for 1.5 million patients
  • 38. © Copyright 2020 Xilinx Time (milli-seconds) to get top 100 similar patients 40x faster than CPU Using one Alveo U50 Using 5 Alveo U50’s 400x faster than CPU
  • 39. © Copyright 2020 Xilinx Scaling with number of patients
  • 40.   Three General REST Services to Support Similarity 40 Bulk Upload Data: input - millions of vectors; output – success/failure code Update Vector: input - vertex ID, 198 integers; output – success/failure code Find Similar: input - vertex ID, 198 integers; output – 100 vertex IDs (64 bits) Similarity Server REST GET Results (csv, JSON)
  • 41.   Onward to the Hardware Graph! 41 Single Node Graph Enterprise Knowledge Graph Hardware Graph Data Hubs Data Lake Algorithmic Richness Scale out
  • 42.   Related Use Cases 42 Recommendation Engines for Healthcare • For any person calling in for a recommended provider or senior living facility, can we find similar recommendations in the past? Incident Reporting • When trouble ticket is reported, what are the most similar problems and what were their solutions? Errors in Log Files • When there are are error messages in log files, how can we find similar errors and their solutions? Learning Content • Can we recommend learning content for employees that have similar goals? Schema Mapping • Automate the process of creating data transformation maps for new data to existing schemas