"The world is full of seas of data. Some of these seas are created and exchanged by APIs; some of them are even about APIs. Since 2014, APImetrics has accumulated over 100 GB of data on API test calls made to over 5000 API endpoints by agents deployed in cloud locations on 5 continents.
There's a huge amount of insight trapped at the bottom of that sea of (in our case unlabelled) data. Getting at it would've been nearly impossible before the emergence of powerful open source deep learning libraries in the mid-2010s.
APImetrics will share how we chose a deep learning library and the data munging we did to get our data to work with the library. We will explain how we were able to carry out unsupervised and semi-supervised learning and discuss the insights on global API performance and quality we were able to dredge from the bottom of our sea of data. We'll provide pointers on how organizations, from startups like APImetrics to megacorporations, can use deep learning to create oceans of knowledge from their own seas of data."
LF_APIStrat17_Diving Deep into the API Ocean with Open Source Deep Learning Tools
2. Diving Deep into the API
Ocean with Open Source
Deep Learning Tools
Paul M. Cray, APImetrics
3. Who are APImetrics?
• Seattle-based startup
• Blue chip clients include banks, fintech, carriers, utilities
and vehicle IoT
• APImetrics makes individual functional API calls or sequences of them
• Synthetic test calls can be scheduled to be made from any location
in any of the 4 main clouds (AWS, Azure, Google, IBM)
• Codebase written in Python with JavaScript for UI
• Data is analyzed using ML and AI functionality we are developing
using open source tools
4. What does APImetrics do?
• to manage your APIs you need to understand how they actually behave
from the end-user’s perspective in the real world
• APImetrics is an API performance and quality monitoring system running as
Software-as-a-Service on Google App Engine
• we provide wizards that allow users to create authentications, test calls and
workflows (back-to-back calls) easily
• test calls can be deployed to more than 60 cloud locations on four continents to
make scheduled calls that exercise API endpoints
• we support our own API to facilitate deep integration into higher-level
management systems
6. APImetrics 4.7TB historical dataset
• Over 400M API call records made from multiple clouds and locations
• We retain all data associated with each call, including the
payload, to give a complete picture of API performance
– Timestamp of call
– API endpoint
– Call cloud location
– HTTP response code
– Payload
– Latency breakdown times
• DNS lookup, Connect, Handshake, Upload, Processing, Download
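As a rough illustration, one call record with the fields above might be modelled like this (the field names and types are hypothetical, not APImetrics' actual schema):

from dataclasses import dataclass

@dataclass
class CallRecord:
    """One synthetic API test call (illustrative schema only)."""
    timestamp: float      # Unix time the call was made
    endpoint: str         # API endpoint exercised
    cloud: str            # e.g. "AWS", "Azure", "Google", "IBM"
    location: str         # cloud location the agent ran in
    status_code: int      # HTTP response code
    payload: bytes        # full response body, retained for analysis
    # Latency breakdown, all in milliseconds
    dns_ms: float
    connect_ms: float
    handshake_ms: float
    upload_ms: float
    processing_ms: float
    download_ms: float

    @property
    def total_ms(self) -> float:
        """Overall latency as the sum of the breakdown phases."""
        return (self.dns_ms + self.connect_ms + self.handshake_ms
                + self.upload_ms + self.processing_ms + self.download_ms)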
7. APImetrics Insights CASC score
• What metric do you use to measure API performance?
– Latency? Availability? Pass rate?
• Too many variables to compare and contrast API quality easily
• We use our own magic sauce to combine metrics into a single
blended, credit-rating-like score
• The CASC score allows at-a-glance, like-for-like comparison and trend
analysis of the performance and quality of different API calls
• CASC scores are currently calculated on a weekly and monthly
basis, but daily scores are coming soon
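The CASC formula itself is proprietary, but a toy sketch shows the general idea of a blended, credit-rating-like score; the weights, the latency target and the 0-1000 scale below are invented for illustration and are not the real CASC parameters:

def blended_score(pass_rate, availability, median_latency_ms,
                  latency_target_ms=500.0):
    """Toy blended quality score on a 0-1000 scale.

    Weights and latency target are invented; the real CASC score
    combines its metrics differently.
    """
    # Normalise latency so that meeting the target scores 1.0
    latency_score = min(1.0, latency_target_ms / max(median_latency_ms, 1.0))
    # Weighted blend of three normalised metrics
    score = 0.4 * pass_rate + 0.3 * availability + 0.3 * latency_score
    return round(1000 * score)

# blended_score(0.99, 0.999, 350) -> one number comparable across APIs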
9. The CASC score and Machine Learning
CHALLENGE: How do we calculate CASC scores in real time? What
do we need?
• More robust (patent application in progress) method for calculating
CASC score that leverages our unrivalled historical dataset
• Uses supervised learning with linear regression to calculate the
CASC parameters
• Python's scikit-learn package, along with numpy, pandas, scipy and
statsmodels, is used in APImetrics Insights
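A minimal sketch of that kind of supervised fit with scikit-learn, using synthetic stand-in data (the real pipeline would read historical call records and scores):

import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in data: six latency phases per call as features,
# a historical quality score as the label
rng = np.random.default_rng(0)
X = rng.gamma(shape=2.0, scale=100.0, size=(10_000, 6))
y = 1000 - 0.5 * X.sum(axis=1) + rng.normal(0, 20, 10_000)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)
model = LinearRegression().fit(X_train, y_train)
print("R^2 on held-out calls:", model.score(X_test, y_test))
# model.coef_ holds the fitted per-metric weights, i.e. the learned
# score parameters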
11. The components to be looked at
• Outlier detection
• Handling multimodality
• Identifying clusters of related events
• Anomaly detection
12. Outlier detection
• Historically:
– Heuristic designated a record an outlier if overall latency exceeded a certain
number of standard deviations from the mean
• Outlier detection is a visual problem
– We can see (some/most of) the outliers by eye
• How to use deep learning techniques to detect outliers?
– Implement Recurrent Neural Net (RNN) to analyze time series data?
– Implement Convolutional Neural Net (CNN) to recognise outlier patterns?
– Use PyTorch as it is emerging as the leading Deep Learning framework and
supports an idiomatic Python approach
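For reference, the historical heuristic described above is essentially a z-score test; a minimal sketch (the three-sigma threshold is an arbitrary default):

import numpy as np

def flag_outliers(latencies_ms, n_sigma=3.0):
    """Mark calls whose overall latency lies more than n_sigma standard
    deviations from the mean (the simple historical heuristic)."""
    latencies = np.asarray(latencies_ms, dtype=float)
    mean, std = latencies.mean(), latencies.std()
    return np.abs(latencies - mean) > n_sigma * std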
14. Multimodality detection
• Latency distributions are typically neither unimodal nor normal
• Outlier detection heuristics that rely on them being so are flawed
• Reliable outlier detection must first determine modality
• Easy by eye, but sensitive to binning
– Use a CNN to detect modality?
– Use a clustering algorithm to assign modality? (see the sketch after this list)
– How to handle the binning problem?
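One binning-free possibility, sketched here, is to fit Gaussian mixture models to the raw latencies and pick the number of components by BIC; this is just one candidate approach, not a method the slides commit to:

import numpy as np
from sklearn.mixture import GaussianMixture

def estimate_modality(latencies_ms, max_modes=5):
    """Estimate the number of modes by choosing the mixture size
    that minimises BIC; fitting raw values avoids histogram binning."""
    # Log-transform: latency distributions are heavy-tailed
    X = np.log(np.asarray(latencies_ms, dtype=float)).reshape(-1, 1)
    bics = [GaussianMixture(n_components=k, random_state=0).fit(X).bic(X)
            for k in range(1, max_modes + 1)]
    return int(np.argmin(bics)) + 1   # best-supported number of modes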
16. Cluster detection
• Currently using a heuristic to construct clusters of outliers
– Much too simplistic
• Exploring algorithms like k-means implemented in a package such
as scikit-learn
• But a result is more likely to be an outlier if it is close to other outliers,
i.e. if it is in a cluster (see the sketch after this list)
• We believe outlier and cluster detection should be done
simultaneously
– Investigating if an RNN can identify whether a record is an outlier and whether it
belongs to a cluster
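As a sketch of doing both at once with an off-the-shelf scikit-learn algorithm, DBSCAN labels clusters and noise points in a single pass (the eps and min_samples values are placeholders that would need tuning against real call data):

import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.preprocessing import StandardScaler

def cluster_outliers(timestamps, latencies_ms, eps=0.5, min_samples=5):
    """Cluster calls in (time, latency) space; DBSCAN assigns label -1
    to points in no cluster, so groups of nearby outliers form clusters
    while isolated ones are flagged as noise simultaneously."""
    X = np.column_stack([timestamps, latencies_ms])
    X = StandardScaler().fit_transform(X)  # put both axes on one scale
    return DBSCAN(eps=eps, min_samples=min_samples).fit_predict(X)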
18. APIs and AIPIs
• APImetrics has 4.7TB of (semi-)structured data packed with
actionable intelligence
– If we can discover it
• We know what we can look for, but what is hidden in the data
ocean?
• An experienced API support engineer can extrapolate from an issue
with one API to a similar issue with a completely different API
– Ultimate goal is a domain-specific AI that does this automagically: an Artificially
Intelligent Programming Interface (AIPI) that can capture, generate and
manipulate API-related knowledge
19. Diving Deep into the API
Ocean with Open Source
Deep Learning Tools
Paul M. Cray, APImetrics