SlideShare a Scribd company logo
Large Scale Graph Processing & Machine
Learning for Fraud Prevention
1
Disclaimer: views expressed here are my own and do not
necessarily represent the views of PayPal, its affiliates or
subsidiaries.
© 2016 PayPal Inc. Confidential and proprietary.
Agenda
• Graph Processing
• Machine Learning
2
© 2016 PayPal Inc. Confidential and proprietary.
Graph Processing - Motivation
• (Big) Data is not flat
• Data is multi-relational, spatial, temporal,
multi-modal
• Machine learning algorithms assume i.i.d
(independent & identically distributed)
• Need Data Science for Graphs
3
Software
(Java/JVM)
Source: Lise Getoor, UCSC
© 2016 PayPal Inc. Confidential and proprietary.
Graph Processing - Motivation
• Much harder to get performance lift on
our flagship models
• Need to re-look at all aspects of
traditional model building
• Need out-of-the-box thinking
4
Area we are missing (AUC 0.98)
© 2016 PayPal Inc. Confidential and proprietary.
Graph Processing With Giraph (Current)
• Pregel (Google’2010)
• Vertex Centric
• Giraph (Apache Open Source)
5
© 2016 PayPal Inc. Confidential and proprietary.
Graph Processing With Giraph (Current)
• Used to compute 2-step neighbhors – IP,VID
• Data
• 5 events (login, logout, transaction, fso, signup)
• Stampy HDFS
• 1 year
• Workflow
• Series of (10) map-reduce jobs to join/pre-process
construct graph
• Neighorhood computation uses Giraph
• Output pushed out online as RADD
• Issues
• 2 days for end-to-end process
• Lots of intermediate read/write
• Hadoop cluster stability
• Unsupported by Hortonworks
6
Hadoop HDFS
(Stampy)
RADD
M/R Jobs
M/R Jobs
Giraph Jobs
Giraph Job
Online
Offline
© 2016 PayPal Inc. Confidential and proprietary.
Graph Processing (Future)
• Leverage graph representation to understand, analyze, visualize
relationships – their strengths, direction & influence and to
predict future relationships and interactions between entities
(consumers, merchant, IP address, device IDs, nodes in a
network…).
• Algorithmic convergence of graph theory & machine learning
(statistical relations learning & probabilistic graph modeling)
• Multi-layered full fledged specialized stack, hardware to
software.
7
GPU, FPGA, TPU., SSD
(specialized hardware)
Graph DB
(specialized software)
Graph Algorithms
(Deterministic &
Probabilistic)
Graph
Analytics &
Visualization
Graph
Modeling
(Prediction)
© 2016 PayPal Inc. Confidential and proprietary.
Graph Processing (Future)
• Solve new problems that traditional approaches can not solve:
• What is the likelihood that particular consumer (node in graph) colluding
(hidden relationship in graph) with a particular merchant (node in graph)?
• How fraudsters (series of nodes & edges) move money through our network
i.e., anti-money laundering?
• If a consumer (node) is buying (relation) product X (node) from merchant
(node), what else can we recommend to the consumer?
• If a server (node) in a data center goes down/unhealthy, what would would be
the impact to other critical components?
• Who (what) are the central influential people (assets) in our network?
• Graph theory & algorithms (link analysis, page rank, belief propagation, markov
random field) can help to uncover influencing nodes & hidden relationships.
• Graph databases are much more efficient in unearthing graph properties, have flexible
schema that is amenable to changing data, scale naturally to large data sets as they do
not require expensive join operations.
8
© 2016 PayPal Inc. Confidential and proprietary.
Graph Processing POC
• POC with new stack
• Single powerful machine in-lieu of 1200+ node
stampy cluster
• 160 core
• State-of-the-art NVidia Pascal GPU (4 P100)
• 1 TB RAM
• 2 TB SSD
• Graph DB optimized for scale-up architecture (Neo4j)
• Graph computing frameworks optimized for single
server/GPUs
9
Hadoop HDFS
(Stampy)
RADD
Graph DB
(Neo4J)
Online
Offline
Hardware
P100 GPU; SSD
Graph
Algorithms
(Neo4J)
Graph
Algorithms
(GunRock)
© 2016 PayPal Inc. Confidential and proprietary.
Graph Processing POC
• Graph
• 1.5 billion edges; 0.5 billion vertices
• Static graph (1 year); No time dimension
• Vertices - Customer ID, Merchant ID, IP;
Edges – payment transaction
10
Degree 2^0: 58104040 (58.10%)
Degree 2^1: 23034683 (23.03%)
Degree 2^2: 11711920 (11.71%)
Degree 2^3: 5098768 (5.10%)
Degree 2^4: 1575787 (1.58%)
Degree 2^5: 366774 (0.37%)
Degree 2^6: 79654 (0.08%)
Degree 2^7: 19796 (0.02%)
Degree 2^8: 5861 (0.01%)
Degree 2^9: 1886 (0.00%)
Degree 2^10: 581 (0.00%)
Degree 2^11: 179 (0.00%)
Degree 2^12: 52 (0.00%)
Degree 2^13: 12 (0.00%)
Degree 2^14: 4 (0.00%)
Degree 2^15: 1 (0.00%)
Degree Distribution
© 2016 PayPal Inc. Confidential and proprietary.
Graph Processing POC
• GraphAlgorithms
• Page Rank
• Connected Components
• find sets of connected
nodes
• Each node is reachable
from any other node in the
set
• Community Detection
• Label Propagation – Group
nodes with similar
properties
11
© 2016 PayPal Inc. Confidential and proprietary.
Graph Processing POC
• Results
• Neo4j
• GunRock, NvGraph, SNAP – in
progress
12
Algorithm Compute
Time (ms)
Write Time
(ms)
Page Rank 77 124
Connected
Components
156 337
Community
Detection
384 361
© 2016 PayPal Inc. Confidential and proprietary.
Graph Processing - Key Learnings
• Scale-up showing promise
• 1 year graph ~70GB
• Can fit entire graph in 1 machine’s memory
• Significant reduction in processing time
• ~2 days -> ~2 hours
• Compute ~77 milliseconds
• Significant value in Graph DB – sub-graph creation,
interactive exploration, visualization, savings in
computation
13
Source: Lise Getoor, UCSC
Machine Learning - Current Research Workflow
Hardware
-
Research,
Procure,
Provision
Modeling
Algorithm
Research
Tool -
Evaluate,
Benchmar
k
Data &
Variable
Collection
and
Selection
Model
Training -
Developm
ent,
Performan
ce Eval,
Spec
(offline)
Scoring
Engine
Developm
ent
(online)
Deployme
nt - One
Box &
Load test
Auditing -
Variable &
Score
Go Live
©2015 PayPal Inc. Confidential and proprietary. 14
• Manual Development (Spec -> Code)
• ML Tool Specific Software Stack
• Lack of research infrastructure/self-service tools
• Serial pipeline; No early involvement of relevant teams
© 2016 PayPal Inc. Confidential and proprietary.
Machine Learning - Future Modeling/Prediction Stack
• Algorithm
• Deep Learning
• Offline to Online Learning
• Human to Auto Feature Engineering
• Model Spec
• Language Agnostic Platform agnostic Spec
(ex: protobufs)
• Tools
• Scalable & Flexible (ex: TensorFlow)
• Software
• Language Agnostic Software Stack
• Hardware
• Accelerators (GPUs, TPUs,..)
15
SS
Hardware
CPU + GPU/TPU
Model Spec
(protobufs)
Tools
(TensorFlow)
Algorithm
(Deep Learning)
Software
(Language Agnostic)
© 2016 PayPal Inc. Confidential and proprietary.
Conclusions
• Graph algorithms & Deep Learning shows promising
analytical results
• Scale-up architecture shows promise for our use cases
• Probabilistic Graph Modeling + Deep Learning =>
Interpretable Models
• Scratched surface; lot left to accomplish
16
Software
(Java/JVM)
Source: Lise Getoor, UCSC

More Related Content

What's hot

Apache Spark-Based Stratification Library for Machine Learning Use Cases at N...
Apache Spark-Based Stratification Library for Machine Learning Use Cases at N...Apache Spark-Based Stratification Library for Machine Learning Use Cases at N...
Apache Spark-Based Stratification Library for Machine Learning Use Cases at N...
Databricks
 
Sanjeev Satheesj, Research Scientist, Baidu at The AI Conference 2017
Sanjeev Satheesj, Research Scientist, Baidu at The AI Conference 2017Sanjeev Satheesj, Research Scientist, Baidu at The AI Conference 2017
Sanjeev Satheesj, Research Scientist, Baidu at The AI Conference 2017
MLconf
 
Bighead: Airbnb’s End-to-End Machine Learning Platform with Krishna Puttaswa...
 Bighead: Airbnb’s End-to-End Machine Learning Platform with Krishna Puttaswa... Bighead: Airbnb’s End-to-End Machine Learning Platform with Krishna Puttaswa...
Bighead: Airbnb’s End-to-End Machine Learning Platform with Krishna Puttaswa...
Databricks
 
A Tale of Three Deep Learning Frameworks: TensorFlow, Keras, & Deep Learning ...
A Tale of Three Deep Learning Frameworks: TensorFlow, Keras, & Deep Learning ...A Tale of Three Deep Learning Frameworks: TensorFlow, Keras, & Deep Learning ...
A Tale of Three Deep Learning Frameworks: TensorFlow, Keras, & Deep Learning ...
Databricks
 
Kaz Sato, Evangelist, Google at MLconf ATL 2016
Kaz Sato, Evangelist, Google at MLconf ATL 2016Kaz Sato, Evangelist, Google at MLconf ATL 2016
Kaz Sato, Evangelist, Google at MLconf ATL 2016
MLconf
 
Auto-Pilot for Apache Spark Using Machine Learning
Auto-Pilot for Apache Spark Using Machine LearningAuto-Pilot for Apache Spark Using Machine Learning
Auto-Pilot for Apache Spark Using Machine Learning
Databricks
 
Graph Gurus 15: Introducing TigerGraph 2.4
Graph Gurus 15: Introducing TigerGraph 2.4 Graph Gurus 15: Introducing TigerGraph 2.4
Graph Gurus 15: Introducing TigerGraph 2.4
TigerGraph
 
Augmenting Machine Learning with Databricks Labs AutoML Toolkit
Augmenting Machine Learning with Databricks Labs AutoML ToolkitAugmenting Machine Learning with Databricks Labs AutoML Toolkit
Augmenting Machine Learning with Databricks Labs AutoML Toolkit
Databricks
 
Conversational AI with Transformer Models
Conversational AI with Transformer ModelsConversational AI with Transformer Models
Conversational AI with Transformer Models
Databricks
 
The Potential of GPU-driven High Performance Data Analytics in Spark
The Potential of GPU-driven High Performance Data Analytics in SparkThe Potential of GPU-driven High Performance Data Analytics in Spark
The Potential of GPU-driven High Performance Data Analytics in Spark
Spark Summit
 
Using BigDL on Apache Spark to Improve the MLS Real Estate Search Experience ...
Using BigDL on Apache Spark to Improve the MLS Real Estate Search Experience ...Using BigDL on Apache Spark to Improve the MLS Real Estate Search Experience ...
Using BigDL on Apache Spark to Improve the MLS Real Estate Search Experience ...
Databricks
 
ML at the Edge: Building Your Production Pipeline with Apache Spark and Tens...
 ML at the Edge: Building Your Production Pipeline with Apache Spark and Tens... ML at the Edge: Building Your Production Pipeline with Apache Spark and Tens...
ML at the Edge: Building Your Production Pipeline with Apache Spark and Tens...
Databricks
 
Application and Challenges of Streaming Analytics and Machine Learning on Mu...
 Application and Challenges of Streaming Analytics and Machine Learning on Mu... Application and Challenges of Streaming Analytics and Machine Learning on Mu...
Application and Challenges of Streaming Analytics and Machine Learning on Mu...
Databricks
 
Lessons Learned Replatforming A Large Machine Learning Application To Apache ...
Lessons Learned Replatforming A Large Machine Learning Application To Apache ...Lessons Learned Replatforming A Large Machine Learning Application To Apache ...
Lessons Learned Replatforming A Large Machine Learning Application To Apache ...
Databricks
 
Plume - A Code Property Graph Extraction and Analysis Library
Plume - A Code Property Graph Extraction and Analysis LibraryPlume - A Code Property Graph Extraction and Analysis Library
Plume - A Code Property Graph Extraction and Analysis Library
TigerGraph
 
Analytics Zoo: Building Analytics and AI Pipeline for Apache Spark and BigDL ...
Analytics Zoo: Building Analytics and AI Pipeline for Apache Spark and BigDL ...Analytics Zoo: Building Analytics and AI Pipeline for Apache Spark and BigDL ...
Analytics Zoo: Building Analytics and AI Pipeline for Apache Spark and BigDL ...
Databricks
 
Arun Rathinasabapathy, Senior Software Engineer, LexisNexis at MLconf ATL 2016
Arun Rathinasabapathy, Senior Software Engineer, LexisNexis at MLconf ATL 2016Arun Rathinasabapathy, Senior Software Engineer, LexisNexis at MLconf ATL 2016
Arun Rathinasabapathy, Senior Software Engineer, LexisNexis at MLconf ATL 2016
MLconf
 
SparkML: Easy ML Productization for Real-Time Bidding
SparkML: Easy ML Productization for Real-Time BiddingSparkML: Easy ML Productization for Real-Time Bidding
SparkML: Easy ML Productization for Real-Time Bidding
Databricks
 
Graph representation learning to prevent payment collusion fraud
Graph representation learning to prevent payment collusion fraudGraph representation learning to prevent payment collusion fraud
Graph representation learning to prevent payment collusion fraud
DataWorks Summit
 
Hardware Accelerated Machine Learning Solution for Detecting Fraud and Money ...
Hardware Accelerated Machine Learning Solution for Detecting Fraud and Money ...Hardware Accelerated Machine Learning Solution for Detecting Fraud and Money ...
Hardware Accelerated Machine Learning Solution for Detecting Fraud and Money ...
TigerGraph
 

What's hot (20)

Apache Spark-Based Stratification Library for Machine Learning Use Cases at N...
Apache Spark-Based Stratification Library for Machine Learning Use Cases at N...Apache Spark-Based Stratification Library for Machine Learning Use Cases at N...
Apache Spark-Based Stratification Library for Machine Learning Use Cases at N...
 
Sanjeev Satheesj, Research Scientist, Baidu at The AI Conference 2017
Sanjeev Satheesj, Research Scientist, Baidu at The AI Conference 2017Sanjeev Satheesj, Research Scientist, Baidu at The AI Conference 2017
Sanjeev Satheesj, Research Scientist, Baidu at The AI Conference 2017
 
Bighead: Airbnb’s End-to-End Machine Learning Platform with Krishna Puttaswa...
 Bighead: Airbnb’s End-to-End Machine Learning Platform with Krishna Puttaswa... Bighead: Airbnb’s End-to-End Machine Learning Platform with Krishna Puttaswa...
Bighead: Airbnb’s End-to-End Machine Learning Platform with Krishna Puttaswa...
 
A Tale of Three Deep Learning Frameworks: TensorFlow, Keras, & Deep Learning ...
A Tale of Three Deep Learning Frameworks: TensorFlow, Keras, & Deep Learning ...A Tale of Three Deep Learning Frameworks: TensorFlow, Keras, & Deep Learning ...
A Tale of Three Deep Learning Frameworks: TensorFlow, Keras, & Deep Learning ...
 
Kaz Sato, Evangelist, Google at MLconf ATL 2016
Kaz Sato, Evangelist, Google at MLconf ATL 2016Kaz Sato, Evangelist, Google at MLconf ATL 2016
Kaz Sato, Evangelist, Google at MLconf ATL 2016
 
Auto-Pilot for Apache Spark Using Machine Learning
Auto-Pilot for Apache Spark Using Machine LearningAuto-Pilot for Apache Spark Using Machine Learning
Auto-Pilot for Apache Spark Using Machine Learning
 
Graph Gurus 15: Introducing TigerGraph 2.4
Graph Gurus 15: Introducing TigerGraph 2.4 Graph Gurus 15: Introducing TigerGraph 2.4
Graph Gurus 15: Introducing TigerGraph 2.4
 
Augmenting Machine Learning with Databricks Labs AutoML Toolkit
Augmenting Machine Learning with Databricks Labs AutoML ToolkitAugmenting Machine Learning with Databricks Labs AutoML Toolkit
Augmenting Machine Learning with Databricks Labs AutoML Toolkit
 
Conversational AI with Transformer Models
Conversational AI with Transformer ModelsConversational AI with Transformer Models
Conversational AI with Transformer Models
 
The Potential of GPU-driven High Performance Data Analytics in Spark
The Potential of GPU-driven High Performance Data Analytics in SparkThe Potential of GPU-driven High Performance Data Analytics in Spark
The Potential of GPU-driven High Performance Data Analytics in Spark
 
Using BigDL on Apache Spark to Improve the MLS Real Estate Search Experience ...
Using BigDL on Apache Spark to Improve the MLS Real Estate Search Experience ...Using BigDL on Apache Spark to Improve the MLS Real Estate Search Experience ...
Using BigDL on Apache Spark to Improve the MLS Real Estate Search Experience ...
 
ML at the Edge: Building Your Production Pipeline with Apache Spark and Tens...
 ML at the Edge: Building Your Production Pipeline with Apache Spark and Tens... ML at the Edge: Building Your Production Pipeline with Apache Spark and Tens...
ML at the Edge: Building Your Production Pipeline with Apache Spark and Tens...
 
Application and Challenges of Streaming Analytics and Machine Learning on Mu...
 Application and Challenges of Streaming Analytics and Machine Learning on Mu... Application and Challenges of Streaming Analytics and Machine Learning on Mu...
Application and Challenges of Streaming Analytics and Machine Learning on Mu...
 
Lessons Learned Replatforming A Large Machine Learning Application To Apache ...
Lessons Learned Replatforming A Large Machine Learning Application To Apache ...Lessons Learned Replatforming A Large Machine Learning Application To Apache ...
Lessons Learned Replatforming A Large Machine Learning Application To Apache ...
 
Plume - A Code Property Graph Extraction and Analysis Library
Plume - A Code Property Graph Extraction and Analysis LibraryPlume - A Code Property Graph Extraction and Analysis Library
Plume - A Code Property Graph Extraction and Analysis Library
 
Analytics Zoo: Building Analytics and AI Pipeline for Apache Spark and BigDL ...
Analytics Zoo: Building Analytics and AI Pipeline for Apache Spark and BigDL ...Analytics Zoo: Building Analytics and AI Pipeline for Apache Spark and BigDL ...
Analytics Zoo: Building Analytics and AI Pipeline for Apache Spark and BigDL ...
 
Arun Rathinasabapathy, Senior Software Engineer, LexisNexis at MLconf ATL 2016
Arun Rathinasabapathy, Senior Software Engineer, LexisNexis at MLconf ATL 2016Arun Rathinasabapathy, Senior Software Engineer, LexisNexis at MLconf ATL 2016
Arun Rathinasabapathy, Senior Software Engineer, LexisNexis at MLconf ATL 2016
 
SparkML: Easy ML Productization for Real-Time Bidding
SparkML: Easy ML Productization for Real-Time BiddingSparkML: Easy ML Productization for Real-Time Bidding
SparkML: Easy ML Productization for Real-Time Bidding
 
Graph representation learning to prevent payment collusion fraud
Graph representation learning to prevent payment collusion fraudGraph representation learning to prevent payment collusion fraud
Graph representation learning to prevent payment collusion fraud
 
Hardware Accelerated Machine Learning Solution for Detecting Fraud and Money ...
Hardware Accelerated Machine Learning Solution for Detecting Fraud and Money ...Hardware Accelerated Machine Learning Solution for Detecting Fraud and Money ...
Hardware Accelerated Machine Learning Solution for Detecting Fraud and Money ...
 

Viewers also liked

Matineh Shaker, Artificial Intelligence Scientist, Bonsai at MLconf SF 2017
Matineh Shaker, Artificial Intelligence Scientist, Bonsai at MLconf SF 2017Matineh Shaker, Artificial Intelligence Scientist, Bonsai at MLconf SF 2017
Matineh Shaker, Artificial Intelligence Scientist, Bonsai at MLconf SF 2017
MLconf
 
LN Renganarayana, Architect, ML Platform and Services and Madhura Dudhgaonkar...
LN Renganarayana, Architect, ML Platform and Services and Madhura Dudhgaonkar...LN Renganarayana, Architect, ML Platform and Services and Madhura Dudhgaonkar...
LN Renganarayana, Architect, ML Platform and Services and Madhura Dudhgaonkar...
MLconf
 
Jessica Rudd, PhD Student, Analytics and Data Science, Kennesaw State Univers...
Jessica Rudd, PhD Student, Analytics and Data Science, Kennesaw State Univers...Jessica Rudd, PhD Student, Analytics and Data Science, Kennesaw State Univers...
Jessica Rudd, PhD Student, Analytics and Data Science, Kennesaw State Univers...
MLconf
 
Jonas Schneider, Head of Engineering for Robotics, OpenAI
Jonas Schneider, Head of Engineering for Robotics, OpenAIJonas Schneider, Head of Engineering for Robotics, OpenAI
Jonas Schneider, Head of Engineering for Robotics, OpenAI
MLconf
 
Alexandra Johnson, Software Engineer, SigOpt at MLconf ATL 2017
Alexandra Johnson, Software Engineer, SigOpt at MLconf ATL 2017Alexandra Johnson, Software Engineer, SigOpt at MLconf ATL 2017
Alexandra Johnson, Software Engineer, SigOpt at MLconf ATL 2017
MLconf
 
Xavier Amatriain, Cofounder & CTO, Curai at MLconf SF 2017
Xavier Amatriain, Cofounder & CTO, Curai at MLconf SF 2017Xavier Amatriain, Cofounder & CTO, Curai at MLconf SF 2017
Xavier Amatriain, Cofounder & CTO, Curai at MLconf SF 2017
MLconf
 
Daniel Shank, Data Scientist, Talla at MLconf SF 2017
Daniel Shank, Data Scientist, Talla at MLconf SF 2017Daniel Shank, Data Scientist, Talla at MLconf SF 2017
Daniel Shank, Data Scientist, Talla at MLconf SF 2017
MLconf
 
Ashfaq Munshi, ML7 Fellow, Pepperdata
Ashfaq Munshi, ML7 Fellow, PepperdataAshfaq Munshi, ML7 Fellow, Pepperdata
Ashfaq Munshi, ML7 Fellow, Pepperdata
MLconf
 
Dr. Steve Liu, Chief Scientist, Tinder at MLconf SF 2017
Dr. Steve Liu, Chief Scientist, Tinder at MLconf SF 2017Dr. Steve Liu, Chief Scientist, Tinder at MLconf SF 2017
Dr. Steve Liu, Chief Scientist, Tinder at MLconf SF 2017
MLconf
 
Doug Eck, Research Scientist, Google Magenta, at MLconf SF 2017
Doug Eck, Research Scientist, Google Magenta, at MLconf SF 2017Doug Eck, Research Scientist, Google Magenta, at MLconf SF 2017
Doug Eck, Research Scientist, Google Magenta, at MLconf SF 2017
MLconf
 
Ashrith Barthur, Security Scientist, H2o.ai, at MLconf 2017
Ashrith Barthur, Security Scientist, H2o.ai, at MLconf 2017Ashrith Barthur, Security Scientist, H2o.ai, at MLconf 2017
Ashrith Barthur, Security Scientist, H2o.ai, at MLconf 2017
MLconf
 
Ryan West, Machine Learning Engineer, Nexosis at MLconf ATL 2017
Ryan West, Machine Learning Engineer, Nexosis at MLconf ATL 2017Ryan West, Machine Learning Engineer, Nexosis at MLconf ATL 2017
Ryan West, Machine Learning Engineer, Nexosis at MLconf ATL 2017
MLconf
 
Tamara G. Kolda, Distinguished Member of Technical Staff, Sandia National Lab...
Tamara G. Kolda, Distinguished Member of Technical Staff, Sandia National Lab...Tamara G. Kolda, Distinguished Member of Technical Staff, Sandia National Lab...
Tamara G. Kolda, Distinguished Member of Technical Staff, Sandia National Lab...
MLconf
 
Rushin Shah, Engineering Manager, Facebook at MLconf SF 2017
Rushin Shah, Engineering Manager, Facebook at MLconf SF 2017Rushin Shah, Engineering Manager, Facebook at MLconf SF 2017
Rushin Shah, Engineering Manager, Facebook at MLconf SF 2017
MLconf
 
Michael Alcorn, Sr. Software Engineer, Red Hat Inc. at MLconf SF 2017
Michael Alcorn, Sr. Software Engineer, Red Hat Inc. at MLconf SF 2017Michael Alcorn, Sr. Software Engineer, Red Hat Inc. at MLconf SF 2017
Michael Alcorn, Sr. Software Engineer, Red Hat Inc. at MLconf SF 2017
MLconf
 
Dr. June Andrews, Principal Data Scientist, Wise.io, From GE Digital at MLcon...
Dr. June Andrews, Principal Data Scientist, Wise.io, From GE Digital at MLcon...Dr. June Andrews, Principal Data Scientist, Wise.io, From GE Digital at MLcon...
Dr. June Andrews, Principal Data Scientist, Wise.io, From GE Digital at MLcon...
MLconf
 
Anima Anadkumar, Principal Scientist, Amazon Web Services, Endowed Professor,...
Anima Anadkumar, Principal Scientist, Amazon Web Services, Endowed Professor,...Anima Anadkumar, Principal Scientist, Amazon Web Services, Endowed Professor,...
Anima Anadkumar, Principal Scientist, Amazon Web Services, Endowed Professor,...
MLconf
 
Talha Obaid, Email Security, Symantec at MLconf ATL 2017
Talha Obaid, Email Security, Symantec at MLconf ATL 2017Talha Obaid, Email Security, Symantec at MLconf ATL 2017
Talha Obaid, Email Security, Symantec at MLconf ATL 2017
MLconf
 

Viewers also liked (18)

Matineh Shaker, Artificial Intelligence Scientist, Bonsai at MLconf SF 2017
Matineh Shaker, Artificial Intelligence Scientist, Bonsai at MLconf SF 2017Matineh Shaker, Artificial Intelligence Scientist, Bonsai at MLconf SF 2017
Matineh Shaker, Artificial Intelligence Scientist, Bonsai at MLconf SF 2017
 
LN Renganarayana, Architect, ML Platform and Services and Madhura Dudhgaonkar...
LN Renganarayana, Architect, ML Platform and Services and Madhura Dudhgaonkar...LN Renganarayana, Architect, ML Platform and Services and Madhura Dudhgaonkar...
LN Renganarayana, Architect, ML Platform and Services and Madhura Dudhgaonkar...
 
Jessica Rudd, PhD Student, Analytics and Data Science, Kennesaw State Univers...
Jessica Rudd, PhD Student, Analytics and Data Science, Kennesaw State Univers...Jessica Rudd, PhD Student, Analytics and Data Science, Kennesaw State Univers...
Jessica Rudd, PhD Student, Analytics and Data Science, Kennesaw State Univers...
 
Jonas Schneider, Head of Engineering for Robotics, OpenAI
Jonas Schneider, Head of Engineering for Robotics, OpenAIJonas Schneider, Head of Engineering for Robotics, OpenAI
Jonas Schneider, Head of Engineering for Robotics, OpenAI
 
Alexandra Johnson, Software Engineer, SigOpt at MLconf ATL 2017
Alexandra Johnson, Software Engineer, SigOpt at MLconf ATL 2017Alexandra Johnson, Software Engineer, SigOpt at MLconf ATL 2017
Alexandra Johnson, Software Engineer, SigOpt at MLconf ATL 2017
 
Xavier Amatriain, Cofounder & CTO, Curai at MLconf SF 2017
Xavier Amatriain, Cofounder & CTO, Curai at MLconf SF 2017Xavier Amatriain, Cofounder & CTO, Curai at MLconf SF 2017
Xavier Amatriain, Cofounder & CTO, Curai at MLconf SF 2017
 
Daniel Shank, Data Scientist, Talla at MLconf SF 2017
Daniel Shank, Data Scientist, Talla at MLconf SF 2017Daniel Shank, Data Scientist, Talla at MLconf SF 2017
Daniel Shank, Data Scientist, Talla at MLconf SF 2017
 
Ashfaq Munshi, ML7 Fellow, Pepperdata
Ashfaq Munshi, ML7 Fellow, PepperdataAshfaq Munshi, ML7 Fellow, Pepperdata
Ashfaq Munshi, ML7 Fellow, Pepperdata
 
Dr. Steve Liu, Chief Scientist, Tinder at MLconf SF 2017
Dr. Steve Liu, Chief Scientist, Tinder at MLconf SF 2017Dr. Steve Liu, Chief Scientist, Tinder at MLconf SF 2017
Dr. Steve Liu, Chief Scientist, Tinder at MLconf SF 2017
 
Doug Eck, Research Scientist, Google Magenta, at MLconf SF 2017
Doug Eck, Research Scientist, Google Magenta, at MLconf SF 2017Doug Eck, Research Scientist, Google Magenta, at MLconf SF 2017
Doug Eck, Research Scientist, Google Magenta, at MLconf SF 2017
 
Ashrith Barthur, Security Scientist, H2o.ai, at MLconf 2017
Ashrith Barthur, Security Scientist, H2o.ai, at MLconf 2017Ashrith Barthur, Security Scientist, H2o.ai, at MLconf 2017
Ashrith Barthur, Security Scientist, H2o.ai, at MLconf 2017
 
Ryan West, Machine Learning Engineer, Nexosis at MLconf ATL 2017
Ryan West, Machine Learning Engineer, Nexosis at MLconf ATL 2017Ryan West, Machine Learning Engineer, Nexosis at MLconf ATL 2017
Ryan West, Machine Learning Engineer, Nexosis at MLconf ATL 2017
 
Tamara G. Kolda, Distinguished Member of Technical Staff, Sandia National Lab...
Tamara G. Kolda, Distinguished Member of Technical Staff, Sandia National Lab...Tamara G. Kolda, Distinguished Member of Technical Staff, Sandia National Lab...
Tamara G. Kolda, Distinguished Member of Technical Staff, Sandia National Lab...
 
Rushin Shah, Engineering Manager, Facebook at MLconf SF 2017
Rushin Shah, Engineering Manager, Facebook at MLconf SF 2017Rushin Shah, Engineering Manager, Facebook at MLconf SF 2017
Rushin Shah, Engineering Manager, Facebook at MLconf SF 2017
 
Michael Alcorn, Sr. Software Engineer, Red Hat Inc. at MLconf SF 2017
Michael Alcorn, Sr. Software Engineer, Red Hat Inc. at MLconf SF 2017Michael Alcorn, Sr. Software Engineer, Red Hat Inc. at MLconf SF 2017
Michael Alcorn, Sr. Software Engineer, Red Hat Inc. at MLconf SF 2017
 
Dr. June Andrews, Principal Data Scientist, Wise.io, From GE Digital at MLcon...
Dr. June Andrews, Principal Data Scientist, Wise.io, From GE Digital at MLcon...Dr. June Andrews, Principal Data Scientist, Wise.io, From GE Digital at MLcon...
Dr. June Andrews, Principal Data Scientist, Wise.io, From GE Digital at MLcon...
 
Anima Anadkumar, Principal Scientist, Amazon Web Services, Endowed Professor,...
Anima Anadkumar, Principal Scientist, Amazon Web Services, Endowed Professor,...Anima Anadkumar, Principal Scientist, Amazon Web Services, Endowed Professor,...
Anima Anadkumar, Principal Scientist, Amazon Web Services, Endowed Professor,...
 
Talha Obaid, Email Security, Symantec at MLconf ATL 2017
Talha Obaid, Email Security, Symantec at MLconf ATL 2017Talha Obaid, Email Security, Symantec at MLconf ATL 2017
Talha Obaid, Email Security, Symantec at MLconf ATL 2017
 

Similar to Venkatesh Ramanathan, Data Scientist, PayPal at MLconf ATL 2017

Big Stream Processing Systems, Big Graphs
Big Stream Processing Systems, Big GraphsBig Stream Processing Systems, Big Graphs
Big Stream Processing Systems, Big Graphs
Petr Novotný
 
High Performance Spatial-Temporal Trajectory Analysis with Spark
High Performance Spatial-Temporal Trajectory Analysis with Spark High Performance Spatial-Temporal Trajectory Analysis with Spark
High Performance Spatial-Temporal Trajectory Analysis with Spark
DataWorks Summit/Hadoop Summit
 
Danny Bickson - Python based predictive analytics with GraphLab Create
Danny Bickson - Python based predictive analytics with GraphLab Create Danny Bickson - Python based predictive analytics with GraphLab Create
Danny Bickson - Python based predictive analytics with GraphLab Create
PyData
 
Neo4j GraphTour New York_EY Presentation_Michael Moore
Neo4j GraphTour New York_EY Presentation_Michael MooreNeo4j GraphTour New York_EY Presentation_Michael Moore
Neo4j GraphTour New York_EY Presentation_Michael Moore
Neo4j
 
An Introduction to Apache Ignite - Mandhir Gidda - Codemotion Rome 2017
An Introduction to Apache Ignite - Mandhir Gidda - Codemotion Rome 2017An Introduction to Apache Ignite - Mandhir Gidda - Codemotion Rome 2017
An Introduction to Apache Ignite - Mandhir Gidda - Codemotion Rome 2017
Codemotion
 
Your Roadmap for An Enterprise Graph Strategy
Your Roadmap for An Enterprise Graph StrategyYour Roadmap for An Enterprise Graph Strategy
Your Roadmap for An Enterprise Graph Strategy
Neo4j
 
CEP - simplified streaming architecture - Strata Singapore 2016
CEP - simplified streaming architecture - Strata Singapore 2016CEP - simplified streaming architecture - Strata Singapore 2016
CEP - simplified streaming architecture - Strata Singapore 2016
Mathieu Dumoulin
 
Operational Intelligence Using Hadoop
Operational Intelligence Using HadoopOperational Intelligence Using Hadoop
Operational Intelligence Using HadoopDataWorks Summit
 
Evolving Beyond the Data Lake: A Story of Wind and Rain
Evolving Beyond the Data Lake: A Story of Wind and RainEvolving Beyond the Data Lake: A Story of Wind and Rain
Evolving Beyond the Data Lake: A Story of Wind and Rain
MapR Technologies
 
The Impact of SMACT on the Data Management Stack
The Impact of SMACT on the Data Management StackThe Impact of SMACT on the Data Management Stack
The Impact of SMACT on the Data Management Stack
SnapLogic
 
Hadoop Master Class : A concise overview
Hadoop Master Class : A concise overviewHadoop Master Class : A concise overview
Hadoop Master Class : A concise overview
Abhishek Roy
 
In memory computing principles by Mac Moore of GridGain
In memory computing principles by Mac Moore of GridGainIn memory computing principles by Mac Moore of GridGain
In memory computing principles by Mac Moore of GridGain
Data Con LA
 
Agile Data Engineering: Introduction to Data Vault 2.0 (2018)
Agile Data Engineering: Introduction to Data Vault 2.0 (2018)Agile Data Engineering: Introduction to Data Vault 2.0 (2018)
Agile Data Engineering: Introduction to Data Vault 2.0 (2018)
Kent Graziano
 
Advanced Analytics in Hadoop
Advanced Analytics in HadoopAdvanced Analytics in Hadoop
Advanced Analytics in Hadoop
AnalyticsWeek
 
Resume quaish abuzer
Resume quaish abuzerResume quaish abuzer
Resume quaish abuzer
quaish abuzer
 
Roadmap for Enterprise Graph Strategy
Roadmap for Enterprise Graph StrategyRoadmap for Enterprise Graph Strategy
Roadmap for Enterprise Graph Strategy
Neo4j
 
Making Hadoop Realtime by Dr. William Bain of Scaleout Software
Making Hadoop Realtime by Dr. William Bain of Scaleout SoftwareMaking Hadoop Realtime by Dr. William Bain of Scaleout Software
Making Hadoop Realtime by Dr. William Bain of Scaleout Software
Data Con LA
 
Srikanth hadoop hyderabad_3.4yeras - copy
Srikanth hadoop hyderabad_3.4yeras - copySrikanth hadoop hyderabad_3.4yeras - copy
Srikanth hadoop hyderabad_3.4yeras - copy
srikanth K
 
An Introduction to Graph: Database, Analytics, and Cloud Services
An Introduction to Graph:  Database, Analytics, and Cloud ServicesAn Introduction to Graph:  Database, Analytics, and Cloud Services
An Introduction to Graph: Database, Analytics, and Cloud Services
Jean Ihm
 
Getting Started with Apache Ignite as a Distributed Database
Getting Started with Apache Ignite as a Distributed DatabaseGetting Started with Apache Ignite as a Distributed Database
Getting Started with Apache Ignite as a Distributed Database
Roman Shtykh
 

Similar to Venkatesh Ramanathan, Data Scientist, PayPal at MLconf ATL 2017 (20)

Big Stream Processing Systems, Big Graphs
Big Stream Processing Systems, Big GraphsBig Stream Processing Systems, Big Graphs
Big Stream Processing Systems, Big Graphs
 
High Performance Spatial-Temporal Trajectory Analysis with Spark
High Performance Spatial-Temporal Trajectory Analysis with Spark High Performance Spatial-Temporal Trajectory Analysis with Spark
High Performance Spatial-Temporal Trajectory Analysis with Spark
 
Danny Bickson - Python based predictive analytics with GraphLab Create
Danny Bickson - Python based predictive analytics with GraphLab Create Danny Bickson - Python based predictive analytics with GraphLab Create
Danny Bickson - Python based predictive analytics with GraphLab Create
 
Neo4j GraphTour New York_EY Presentation_Michael Moore
Neo4j GraphTour New York_EY Presentation_Michael MooreNeo4j GraphTour New York_EY Presentation_Michael Moore
Neo4j GraphTour New York_EY Presentation_Michael Moore
 
An Introduction to Apache Ignite - Mandhir Gidda - Codemotion Rome 2017
An Introduction to Apache Ignite - Mandhir Gidda - Codemotion Rome 2017An Introduction to Apache Ignite - Mandhir Gidda - Codemotion Rome 2017
An Introduction to Apache Ignite - Mandhir Gidda - Codemotion Rome 2017
 
Your Roadmap for An Enterprise Graph Strategy
Your Roadmap for An Enterprise Graph StrategyYour Roadmap for An Enterprise Graph Strategy
Your Roadmap for An Enterprise Graph Strategy
 
CEP - simplified streaming architecture - Strata Singapore 2016
CEP - simplified streaming architecture - Strata Singapore 2016CEP - simplified streaming architecture - Strata Singapore 2016
CEP - simplified streaming architecture - Strata Singapore 2016
 
Operational Intelligence Using Hadoop
Operational Intelligence Using HadoopOperational Intelligence Using Hadoop
Operational Intelligence Using Hadoop
 
Evolving Beyond the Data Lake: A Story of Wind and Rain
Evolving Beyond the Data Lake: A Story of Wind and RainEvolving Beyond the Data Lake: A Story of Wind and Rain
Evolving Beyond the Data Lake: A Story of Wind and Rain
 
The Impact of SMACT on the Data Management Stack
The Impact of SMACT on the Data Management StackThe Impact of SMACT on the Data Management Stack
The Impact of SMACT on the Data Management Stack
 
Hadoop Master Class : A concise overview
Hadoop Master Class : A concise overviewHadoop Master Class : A concise overview
Hadoop Master Class : A concise overview
 
In memory computing principles by Mac Moore of GridGain
In memory computing principles by Mac Moore of GridGainIn memory computing principles by Mac Moore of GridGain
In memory computing principles by Mac Moore of GridGain
 
Agile Data Engineering: Introduction to Data Vault 2.0 (2018)
Agile Data Engineering: Introduction to Data Vault 2.0 (2018)Agile Data Engineering: Introduction to Data Vault 2.0 (2018)
Agile Data Engineering: Introduction to Data Vault 2.0 (2018)
 
Advanced Analytics in Hadoop
Advanced Analytics in HadoopAdvanced Analytics in Hadoop
Advanced Analytics in Hadoop
 
Resume quaish abuzer
Resume quaish abuzerResume quaish abuzer
Resume quaish abuzer
 
Roadmap for Enterprise Graph Strategy
Roadmap for Enterprise Graph StrategyRoadmap for Enterprise Graph Strategy
Roadmap for Enterprise Graph Strategy
 
Making Hadoop Realtime by Dr. William Bain of Scaleout Software
Making Hadoop Realtime by Dr. William Bain of Scaleout SoftwareMaking Hadoop Realtime by Dr. William Bain of Scaleout Software
Making Hadoop Realtime by Dr. William Bain of Scaleout Software
 
Srikanth hadoop hyderabad_3.4yeras - copy
Srikanth hadoop hyderabad_3.4yeras - copySrikanth hadoop hyderabad_3.4yeras - copy
Srikanth hadoop hyderabad_3.4yeras - copy
 
An Introduction to Graph: Database, Analytics, and Cloud Services
An Introduction to Graph:  Database, Analytics, and Cloud ServicesAn Introduction to Graph:  Database, Analytics, and Cloud Services
An Introduction to Graph: Database, Analytics, and Cloud Services
 
Getting Started with Apache Ignite as a Distributed Database
Getting Started with Apache Ignite as a Distributed DatabaseGetting Started with Apache Ignite as a Distributed Database
Getting Started with Apache Ignite as a Distributed Database
 

More from MLconf

Jamila Smith-Loud - Understanding Human Impact: Social and Equity Assessments...
Jamila Smith-Loud - Understanding Human Impact: Social and Equity Assessments...Jamila Smith-Loud - Understanding Human Impact: Social and Equity Assessments...
Jamila Smith-Loud - Understanding Human Impact: Social and Equity Assessments...
MLconf
 
Ted Willke - The Brain’s Guide to Dealing with Context in Language Understanding
Ted Willke - The Brain’s Guide to Dealing with Context in Language UnderstandingTed Willke - The Brain’s Guide to Dealing with Context in Language Understanding
Ted Willke - The Brain’s Guide to Dealing with Context in Language Understanding
MLconf
 
Justin Armstrong - Applying Computer Vision to Reduce Contamination in the Re...
Justin Armstrong - Applying Computer Vision to Reduce Contamination in the Re...Justin Armstrong - Applying Computer Vision to Reduce Contamination in the Re...
Justin Armstrong - Applying Computer Vision to Reduce Contamination in the Re...
MLconf
 
Igor Markov - Quantum Computing: a Treasure Hunt, not a Gold Rush
Igor Markov - Quantum Computing: a Treasure Hunt, not a Gold RushIgor Markov - Quantum Computing: a Treasure Hunt, not a Gold Rush
Igor Markov - Quantum Computing: a Treasure Hunt, not a Gold Rush
MLconf
 
Josh Wills - Data Labeling as Religious Experience
Josh Wills - Data Labeling as Religious ExperienceJosh Wills - Data Labeling as Religious Experience
Josh Wills - Data Labeling as Religious Experience
MLconf
 
Vinay Prabhu - Project GaitNet: Ushering in the ImageNet moment for human Gai...
Vinay Prabhu - Project GaitNet: Ushering in the ImageNet moment for human Gai...Vinay Prabhu - Project GaitNet: Ushering in the ImageNet moment for human Gai...
Vinay Prabhu - Project GaitNet: Ushering in the ImageNet moment for human Gai...
MLconf
 
Jekaterina Novikova - Machine Learning Methods in Detecting Alzheimer’s Disea...
Jekaterina Novikova - Machine Learning Methods in Detecting Alzheimer’s Disea...Jekaterina Novikova - Machine Learning Methods in Detecting Alzheimer’s Disea...
Jekaterina Novikova - Machine Learning Methods in Detecting Alzheimer’s Disea...
MLconf
 
Meghana Ravikumar - Optimized Image Classification on the Cheap
Meghana Ravikumar - Optimized Image Classification on the CheapMeghana Ravikumar - Optimized Image Classification on the Cheap
Meghana Ravikumar - Optimized Image Classification on the Cheap
MLconf
 
Noam Finkelstein - The Importance of Modeling Data Collection
Noam Finkelstein - The Importance of Modeling Data CollectionNoam Finkelstein - The Importance of Modeling Data Collection
Noam Finkelstein - The Importance of Modeling Data Collection
MLconf
 
June Andrews - The Uncanny Valley of ML
June Andrews - The Uncanny Valley of MLJune Andrews - The Uncanny Valley of ML
June Andrews - The Uncanny Valley of ML
MLconf
 
Sneha Rajana - Deep Learning Architectures for Semantic Relation Detection Tasks
Sneha Rajana - Deep Learning Architectures for Semantic Relation Detection TasksSneha Rajana - Deep Learning Architectures for Semantic Relation Detection Tasks
Sneha Rajana - Deep Learning Architectures for Semantic Relation Detection Tasks
MLconf
 
Anoop Deoras - Building an Incrementally Trained, Local Taste Aware, Global D...
Anoop Deoras - Building an Incrementally Trained, Local Taste Aware, Global D...Anoop Deoras - Building an Incrementally Trained, Local Taste Aware, Global D...
Anoop Deoras - Building an Incrementally Trained, Local Taste Aware, Global D...
MLconf
 
Vito Ostuni - The Voice: New Challenges in a Zero UI World
Vito Ostuni - The Voice: New Challenges in a Zero UI WorldVito Ostuni - The Voice: New Challenges in a Zero UI World
Vito Ostuni - The Voice: New Challenges in a Zero UI World
MLconf
 
Anna choromanska - Data-driven Challenges in AI: Scale, Information Selection...
Anna choromanska - Data-driven Challenges in AI: Scale, Information Selection...Anna choromanska - Data-driven Challenges in AI: Scale, Information Selection...
Anna choromanska - Data-driven Challenges in AI: Scale, Information Selection...
MLconf
 
Janani Kalyanam - Machine Learning to Detect Illegal Online Sales of Prescrip...
Janani Kalyanam - Machine Learning to Detect Illegal Online Sales of Prescrip...Janani Kalyanam - Machine Learning to Detect Illegal Online Sales of Prescrip...
Janani Kalyanam - Machine Learning to Detect Illegal Online Sales of Prescrip...
MLconf
 
Esperanza Lopez Aguilera - Using a Bayesian Neural Network in the Detection o...
Esperanza Lopez Aguilera - Using a Bayesian Neural Network in the Detection o...Esperanza Lopez Aguilera - Using a Bayesian Neural Network in the Detection o...
Esperanza Lopez Aguilera - Using a Bayesian Neural Network in the Detection o...
MLconf
 
Neel Sundaresan - Teaching a machine to code
Neel Sundaresan - Teaching a machine to codeNeel Sundaresan - Teaching a machine to code
Neel Sundaresan - Teaching a machine to code
MLconf
 
Rishabh Mehrotra - Recommendations in a Marketplace: Personalizing Explainabl...
Rishabh Mehrotra - Recommendations in a Marketplace: Personalizing Explainabl...Rishabh Mehrotra - Recommendations in a Marketplace: Personalizing Explainabl...
Rishabh Mehrotra - Recommendations in a Marketplace: Personalizing Explainabl...
MLconf
 
Soumith Chintala - Increasing the Impact of AI Through Better Software
Soumith Chintala - Increasing the Impact of AI Through Better SoftwareSoumith Chintala - Increasing the Impact of AI Through Better Software
Soumith Chintala - Increasing the Impact of AI Through Better Software
MLconf
 
Roy Lowrance - Predicting Bond Prices: Regime Changes
Roy Lowrance - Predicting Bond Prices: Regime ChangesRoy Lowrance - Predicting Bond Prices: Regime Changes
Roy Lowrance - Predicting Bond Prices: Regime Changes
MLconf
 

More from MLconf (20)

Jamila Smith-Loud - Understanding Human Impact: Social and Equity Assessments...
Jamila Smith-Loud - Understanding Human Impact: Social and Equity Assessments...Jamila Smith-Loud - Understanding Human Impact: Social and Equity Assessments...
Jamila Smith-Loud - Understanding Human Impact: Social and Equity Assessments...
 
Ted Willke - The Brain’s Guide to Dealing with Context in Language Understanding
Ted Willke - The Brain’s Guide to Dealing with Context in Language UnderstandingTed Willke - The Brain’s Guide to Dealing with Context in Language Understanding
Ted Willke - The Brain’s Guide to Dealing with Context in Language Understanding
 
Justin Armstrong - Applying Computer Vision to Reduce Contamination in the Re...
Justin Armstrong - Applying Computer Vision to Reduce Contamination in the Re...Justin Armstrong - Applying Computer Vision to Reduce Contamination in the Re...
Justin Armstrong - Applying Computer Vision to Reduce Contamination in the Re...
 
Igor Markov - Quantum Computing: a Treasure Hunt, not a Gold Rush
Igor Markov - Quantum Computing: a Treasure Hunt, not a Gold RushIgor Markov - Quantum Computing: a Treasure Hunt, not a Gold Rush
Igor Markov - Quantum Computing: a Treasure Hunt, not a Gold Rush
 
Josh Wills - Data Labeling as Religious Experience
Josh Wills - Data Labeling as Religious ExperienceJosh Wills - Data Labeling as Religious Experience
Josh Wills - Data Labeling as Religious Experience
 
Vinay Prabhu - Project GaitNet: Ushering in the ImageNet moment for human Gai...
Vinay Prabhu - Project GaitNet: Ushering in the ImageNet moment for human Gai...Vinay Prabhu - Project GaitNet: Ushering in the ImageNet moment for human Gai...
Vinay Prabhu - Project GaitNet: Ushering in the ImageNet moment for human Gai...
 
Jekaterina Novikova - Machine Learning Methods in Detecting Alzheimer’s Disea...
Jekaterina Novikova - Machine Learning Methods in Detecting Alzheimer’s Disea...Jekaterina Novikova - Machine Learning Methods in Detecting Alzheimer’s Disea...
Jekaterina Novikova - Machine Learning Methods in Detecting Alzheimer’s Disea...
 
Meghana Ravikumar - Optimized Image Classification on the Cheap
Meghana Ravikumar - Optimized Image Classification on the CheapMeghana Ravikumar - Optimized Image Classification on the Cheap
Meghana Ravikumar - Optimized Image Classification on the Cheap
 
Noam Finkelstein - The Importance of Modeling Data Collection
Noam Finkelstein - The Importance of Modeling Data CollectionNoam Finkelstein - The Importance of Modeling Data Collection
Noam Finkelstein - The Importance of Modeling Data Collection
 
June Andrews - The Uncanny Valley of ML
June Andrews - The Uncanny Valley of MLJune Andrews - The Uncanny Valley of ML
June Andrews - The Uncanny Valley of ML
 
Sneha Rajana - Deep Learning Architectures for Semantic Relation Detection Tasks
Sneha Rajana - Deep Learning Architectures for Semantic Relation Detection TasksSneha Rajana - Deep Learning Architectures for Semantic Relation Detection Tasks
Sneha Rajana - Deep Learning Architectures for Semantic Relation Detection Tasks
 
Anoop Deoras - Building an Incrementally Trained, Local Taste Aware, Global D...
Anoop Deoras - Building an Incrementally Trained, Local Taste Aware, Global D...Anoop Deoras - Building an Incrementally Trained, Local Taste Aware, Global D...
Anoop Deoras - Building an Incrementally Trained, Local Taste Aware, Global D...
 
Vito Ostuni - The Voice: New Challenges in a Zero UI World
Vito Ostuni - The Voice: New Challenges in a Zero UI WorldVito Ostuni - The Voice: New Challenges in a Zero UI World
Vito Ostuni - The Voice: New Challenges in a Zero UI World
 
Anna choromanska - Data-driven Challenges in AI: Scale, Information Selection...
Anna choromanska - Data-driven Challenges in AI: Scale, Information Selection...Anna choromanska - Data-driven Challenges in AI: Scale, Information Selection...
Anna choromanska - Data-driven Challenges in AI: Scale, Information Selection...
 
Janani Kalyanam - Machine Learning to Detect Illegal Online Sales of Prescrip...
Janani Kalyanam - Machine Learning to Detect Illegal Online Sales of Prescrip...Janani Kalyanam - Machine Learning to Detect Illegal Online Sales of Prescrip...
Janani Kalyanam - Machine Learning to Detect Illegal Online Sales of Prescrip...
 
Esperanza Lopez Aguilera - Using a Bayesian Neural Network in the Detection o...
Esperanza Lopez Aguilera - Using a Bayesian Neural Network in the Detection o...Esperanza Lopez Aguilera - Using a Bayesian Neural Network in the Detection o...
Esperanza Lopez Aguilera - Using a Bayesian Neural Network in the Detection o...
 
Neel Sundaresan - Teaching a machine to code
Neel Sundaresan - Teaching a machine to codeNeel Sundaresan - Teaching a machine to code
Neel Sundaresan - Teaching a machine to code
 
Rishabh Mehrotra - Recommendations in a Marketplace: Personalizing Explainabl...
Rishabh Mehrotra - Recommendations in a Marketplace: Personalizing Explainabl...Rishabh Mehrotra - Recommendations in a Marketplace: Personalizing Explainabl...
Rishabh Mehrotra - Recommendations in a Marketplace: Personalizing Explainabl...
 
Soumith Chintala - Increasing the Impact of AI Through Better Software
Soumith Chintala - Increasing the Impact of AI Through Better SoftwareSoumith Chintala - Increasing the Impact of AI Through Better Software
Soumith Chintala - Increasing the Impact of AI Through Better Software
 
Roy Lowrance - Predicting Bond Prices: Regime Changes
Roy Lowrance - Predicting Bond Prices: Regime ChangesRoy Lowrance - Predicting Bond Prices: Regime Changes
Roy Lowrance - Predicting Bond Prices: Regime Changes
 

Recently uploaded

UiPath Community Day Dubai: AI at Work..
UiPath Community Day Dubai: AI at Work..UiPath Community Day Dubai: AI at Work..
UiPath Community Day Dubai: AI at Work..
UiPathCommunity
 
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdfObservability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Paige Cruz
 
Climate Impact of Software Testing at Nordic Testing Days
Climate Impact of Software Testing at Nordic Testing DaysClimate Impact of Software Testing at Nordic Testing Days
Climate Impact of Software Testing at Nordic Testing Days
Kari Kakkonen
 
Free Complete Python - A step towards Data Science
Free Complete Python - A step towards Data ScienceFree Complete Python - A step towards Data Science
Free Complete Python - A step towards Data Science
RinaMondal9
 
Pushing the limits of ePRTC: 100ns holdover for 100 days
Pushing the limits of ePRTC: 100ns holdover for 100 daysPushing the limits of ePRTC: 100ns holdover for 100 days
Pushing the limits of ePRTC: 100ns holdover for 100 days
Adtran
 
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
Sri Ambati
 
Leading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdfLeading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdf
OnBoard
 
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdfFIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance
 
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdfSmart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
91mobiles
 
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
UiPathCommunity
 
zkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex Proofs
zkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex ProofszkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex Proofs
zkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex Proofs
Alex Pruden
 
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...
James Anderson
 
Removing Uninteresting Bytes in Software Fuzzing
Removing Uninteresting Bytes in Software FuzzingRemoving Uninteresting Bytes in Software Fuzzing
Removing Uninteresting Bytes in Software Fuzzing
Aftab Hussain
 
Secstrike : Reverse Engineering & Pwnable tools for CTF.pptx
Secstrike : Reverse Engineering & Pwnable tools for CTF.pptxSecstrike : Reverse Engineering & Pwnable tools for CTF.pptx
Secstrike : Reverse Engineering & Pwnable tools for CTF.pptx
nkrafacyberclub
 
Quantum Computing: Current Landscape and the Future Role of APIs
Quantum Computing: Current Landscape and the Future Role of APIsQuantum Computing: Current Landscape and the Future Role of APIs
Quantum Computing: Current Landscape and the Future Role of APIs
Vlad Stirbu
 
The Future of Platform Engineering
The Future of Platform EngineeringThe Future of Platform Engineering
The Future of Platform Engineering
Jemma Hussein Allen
 
By Design, not by Accident - Agile Venture Bolzano 2024
By Design, not by Accident - Agile Venture Bolzano 2024By Design, not by Accident - Agile Venture Bolzano 2024
By Design, not by Accident - Agile Venture Bolzano 2024
Pierluigi Pugliese
 
The Metaverse and AI: how can decision-makers harness the Metaverse for their...
The Metaverse and AI: how can decision-makers harness the Metaverse for their...The Metaverse and AI: how can decision-makers harness the Metaverse for their...
The Metaverse and AI: how can decision-makers harness the Metaverse for their...
Jen Stirrup
 
Enhancing Performance with Globus and the Science DMZ
Enhancing Performance with Globus and the Science DMZEnhancing Performance with Globus and the Science DMZ
Enhancing Performance with Globus and the Science DMZ
Globus
 
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
James Anderson
 

Recently uploaded (20)

UiPath Community Day Dubai: AI at Work..
UiPath Community Day Dubai: AI at Work..UiPath Community Day Dubai: AI at Work..
UiPath Community Day Dubai: AI at Work..
 
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdfObservability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
 
Climate Impact of Software Testing at Nordic Testing Days
Climate Impact of Software Testing at Nordic Testing DaysClimate Impact of Software Testing at Nordic Testing Days
Climate Impact of Software Testing at Nordic Testing Days
 
Free Complete Python - A step towards Data Science
Free Complete Python - A step towards Data ScienceFree Complete Python - A step towards Data Science
Free Complete Python - A step towards Data Science
 
Pushing the limits of ePRTC: 100ns holdover for 100 days
Pushing the limits of ePRTC: 100ns holdover for 100 daysPushing the limits of ePRTC: 100ns holdover for 100 days
Pushing the limits of ePRTC: 100ns holdover for 100 days
 
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
 
Leading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdfLeading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdf
 
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdfFIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
 
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdfSmart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
 
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
 
zkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex Proofs
zkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex ProofszkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex Proofs
zkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex Proofs
 
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...
 
Removing Uninteresting Bytes in Software Fuzzing
Removing Uninteresting Bytes in Software FuzzingRemoving Uninteresting Bytes in Software Fuzzing
Removing Uninteresting Bytes in Software Fuzzing
 
Secstrike : Reverse Engineering & Pwnable tools for CTF.pptx
Secstrike : Reverse Engineering & Pwnable tools for CTF.pptxSecstrike : Reverse Engineering & Pwnable tools for CTF.pptx
Secstrike : Reverse Engineering & Pwnable tools for CTF.pptx
 
Quantum Computing: Current Landscape and the Future Role of APIs
Quantum Computing: Current Landscape and the Future Role of APIsQuantum Computing: Current Landscape and the Future Role of APIs
Quantum Computing: Current Landscape and the Future Role of APIs
 
The Future of Platform Engineering
The Future of Platform EngineeringThe Future of Platform Engineering
The Future of Platform Engineering
 
By Design, not by Accident - Agile Venture Bolzano 2024
By Design, not by Accident - Agile Venture Bolzano 2024By Design, not by Accident - Agile Venture Bolzano 2024
By Design, not by Accident - Agile Venture Bolzano 2024
 
The Metaverse and AI: how can decision-makers harness the Metaverse for their...
The Metaverse and AI: how can decision-makers harness the Metaverse for their...The Metaverse and AI: how can decision-makers harness the Metaverse for their...
The Metaverse and AI: how can decision-makers harness the Metaverse for their...
 
Enhancing Performance with Globus and the Science DMZ
Enhancing Performance with Globus and the Science DMZEnhancing Performance with Globus and the Science DMZ
Enhancing Performance with Globus and the Science DMZ
 
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
 

Venkatesh Ramanathan, Data Scientist, PayPal at MLconf ATL 2017

  • 1. Large Scale Graph Processing & Machine Learning for Fraud Prevention 1 Disclaimer: views expressed here are my own and do not necessarily represent the views of PayPal, its affiliates or subsidiaries.
  • 2. © 2016 PayPal Inc. Confidential and proprietary. Agenda • Graph Processing • Machine Learning 2
  • 3. © 2016 PayPal Inc. Confidential and proprietary. Graph Processing - Motivation • (Big) Data is not flat • Data is multi-relational, spatial, temporal, multi-modal • Machine learning algorithms assume i.i.d (independent & identically distributed) • Need Data Science for Graphs 3 Software (Java/JVM) Source: Lise Getoor, UCSC
  • 4. © 2016 PayPal Inc. Confidential and proprietary. Graph Processing - Motivation • Much harder to get performance lift on our flagship models • Need to re-look at all aspects of traditional model building • Need out-of-the-box thinking 4 Area we are missing (AUC 0.98)
  • 5. © 2016 PayPal Inc. Confidential and proprietary. Graph Processing With Giraph (Current) • Pregel (Google’2010) • Vertex Centric • Giraph (Apache Open Source) 5
  • 6. © 2016 PayPal Inc. Confidential and proprietary. Graph Processing With Giraph (Current) • Used to compute 2-step neighbhors – IP,VID • Data • 5 events (login, logout, transaction, fso, signup) • Stampy HDFS • 1 year • Workflow • Series of (10) map-reduce jobs to join/pre-process construct graph • Neighorhood computation uses Giraph • Output pushed out online as RADD • Issues • 2 days for end-to-end process • Lots of intermediate read/write • Hadoop cluster stability • Unsupported by Hortonworks 6 Hadoop HDFS (Stampy) RADD M/R Jobs M/R Jobs Giraph Jobs Giraph Job Online Offline
  • 7. © 2016 PayPal Inc. Confidential and proprietary. Graph Processing (Future) • Leverage graph representation to understand, analyze, visualize relationships – their strengths, direction & influence and to predict future relationships and interactions between entities (consumers, merchant, IP address, device IDs, nodes in a network…). • Algorithmic convergence of graph theory & machine learning (statistical relations learning & probabilistic graph modeling) • Multi-layered full fledged specialized stack, hardware to software. 7 GPU, FPGA, TPU., SSD (specialized hardware) Graph DB (specialized software) Graph Algorithms (Deterministic & Probabilistic) Graph Analytics & Visualization Graph Modeling (Prediction)
  • 8. © 2016 PayPal Inc. Confidential and proprietary. Graph Processing (Future) • Solve new problems that traditional approaches can not solve: • What is the likelihood that particular consumer (node in graph) colluding (hidden relationship in graph) with a particular merchant (node in graph)? • How fraudsters (series of nodes & edges) move money through our network i.e., anti-money laundering? • If a consumer (node) is buying (relation) product X (node) from merchant (node), what else can we recommend to the consumer? • If a server (node) in a data center goes down/unhealthy, what would would be the impact to other critical components? • Who (what) are the central influential people (assets) in our network? • Graph theory & algorithms (link analysis, page rank, belief propagation, markov random field) can help to uncover influencing nodes & hidden relationships. • Graph databases are much more efficient in unearthing graph properties, have flexible schema that is amenable to changing data, scale naturally to large data sets as they do not require expensive join operations. 8
  • 9. © 2016 PayPal Inc. Confidential and proprietary. Graph Processing POC • POC with new stack • Single powerful machine in-lieu of 1200+ node stampy cluster • 160 core • State-of-the-art NVidia Pascal GPU (4 P100) • 1 TB RAM • 2 TB SSD • Graph DB optimized for scale-up architecture (Neo4j) • Graph computing frameworks optimized for single server/GPUs 9 Hadoop HDFS (Stampy) RADD Graph DB (Neo4J) Online Offline Hardware P100 GPU; SSD Graph Algorithms (Neo4J) Graph Algorithms (GunRock)
  • 10. © 2016 PayPal Inc. Confidential and proprietary. Graph Processing POC • Graph • 1.5 billion edges; 0.5 billion vertices • Static graph (1 year); No time dimension • Vertices - Customer ID, Merchant ID, IP; Edges – payment transaction 10 Degree 2^0: 58104040 (58.10%) Degree 2^1: 23034683 (23.03%) Degree 2^2: 11711920 (11.71%) Degree 2^3: 5098768 (5.10%) Degree 2^4: 1575787 (1.58%) Degree 2^5: 366774 (0.37%) Degree 2^6: 79654 (0.08%) Degree 2^7: 19796 (0.02%) Degree 2^8: 5861 (0.01%) Degree 2^9: 1886 (0.00%) Degree 2^10: 581 (0.00%) Degree 2^11: 179 (0.00%) Degree 2^12: 52 (0.00%) Degree 2^13: 12 (0.00%) Degree 2^14: 4 (0.00%) Degree 2^15: 1 (0.00%) Degree Distribution
  • 11. © 2016 PayPal Inc. Confidential and proprietary. Graph Processing POC • GraphAlgorithms • Page Rank • Connected Components • find sets of connected nodes • Each node is reachable from any other node in the set • Community Detection • Label Propagation – Group nodes with similar properties 11
  • 12. © 2016 PayPal Inc. Confidential and proprietary. Graph Processing POC • Results • Neo4j • GunRock, NvGraph, SNAP – in progress 12 Algorithm Compute Time (ms) Write Time (ms) Page Rank 77 124 Connected Components 156 337 Community Detection 384 361
  • 13. © 2016 PayPal Inc. Confidential and proprietary. Graph Processing - Key Learnings • Scale-up showing promise • 1 year graph ~70GB • Can fit entire graph in 1 machine’s memory • Significant reduction in processing time • ~2 days -> ~2 hours • Compute ~77 milliseconds • Significant value in Graph DB – sub-graph creation, interactive exploration, visualization, savings in computation 13 Source: Lise Getoor, UCSC
  • 14. Machine Learning - Current Research Workflow Hardware - Research, Procure, Provision Modeling Algorithm Research Tool - Evaluate, Benchmar k Data & Variable Collection and Selection Model Training - Developm ent, Performan ce Eval, Spec (offline) Scoring Engine Developm ent (online) Deployme nt - One Box & Load test Auditing - Variable & Score Go Live ©2015 PayPal Inc. Confidential and proprietary. 14 • Manual Development (Spec -> Code) • ML Tool Specific Software Stack • Lack of research infrastructure/self-service tools • Serial pipeline; No early involvement of relevant teams
  • 15. © 2016 PayPal Inc. Confidential and proprietary. Machine Learning - Future Modeling/Prediction Stack • Algorithm • Deep Learning • Offline to Online Learning • Human to Auto Feature Engineering • Model Spec • Language Agnostic Platform agnostic Spec (ex: protobufs) • Tools • Scalable & Flexible (ex: TensorFlow) • Software • Language Agnostic Software Stack • Hardware • Accelerators (GPUs, TPUs,..) 15 SS Hardware CPU + GPU/TPU Model Spec (protobufs) Tools (TensorFlow) Algorithm (Deep Learning) Software (Language Agnostic)
  • 16. © 2016 PayPal Inc. Confidential and proprietary. Conclusions • Graph algorithms & Deep Learning shows promising analytical results • Scale-up architecture shows promise for our use cases • Probabilistic Graph Modeling + Deep Learning => Interpretable Models • Scratched surface; lot left to accomplish 16 Software (Java/JVM) Source: Lise Getoor, UCSC