SlideShare a Scribd company logo
1 of 32
Download to read offline
Juan Salas // Celerative CTO
TODAY
AWS LAMBDA
°
+
x
*
FOR DATA SCIENCE
OUR GOAL FOR TODAY
- What is Amazon Lambda?
- Why is it useful for data scientists?
- How can we get started? (Tutorial)
DISCLAIMER: We’ll prioritize practical usage and intuition (let’s make it fun!)
x
°
+
+
WHAT IS AWS LAMBDA?
°
+
x
*
WHAT IS AWS LAMBDA?
Stateless compute service for
event-driven microservices
HOW DOES AWS LAMBDA WORK?
SUPPORTS:
EXAMPLE: MOBILE BACKEND
EXAMPLE: ETL
EXAMPLE: SOCIAL MEDIA STREAMS
IT’S PRETTY COOL FOR DATA SCIENTISTS
- Supports Python.
- No infrastructure to manage, takes care of:
- Scaling, monitoring, deployments, sys updates, etc.
- High availability and scalability by default
- Easy parallelization
- Fine grained pricing. Don’t pay for idle!
THERE ARE SOME DRAWBACKS
- You give up some control (e.g. GPU, threading, filesystem, etc.).
- Dependencies not available with pip can be tricky.
- Package size constraints:
- Scikit-learn depends on numpy and scipy, which in turn require
C and Fortran (!!!) libraries. The package is too big.
(We’ll see how to overcome this…)
x
°
+
+
LET’S GET STARTED!
TAKING LAMBDA FOR A RIDE
We will:
1. Deploy a basic API.
2. Collect some data from the Internet.
3. Use sklearn to make predictions.
CHALICE FRAMEWORK
Python Serverless Microframework for AWS.
Makes our life easier:
- CLI for deployment and management
- Easy interface to create APIs
- Automatic IAM policy generation
- Easy to maintain code structure
More info:
- https://github.com/aws/chalice
- http://chalice.readthedocs.io/en/latest/
Also check out:
- Zappa (Flask)
- Serverless (Node.js)
OPTIONAL: SETTING UP CREDENTIALS
Set up your AWS credentials (skip if you already installed AWS cli or
used boto):
$ mkdir ~/.aws
$ cat >> ~/.aws/config
[default]
aws_access_key_id=YOUR_ACCESS_KEY_HERE
aws_secret_access_key=YOUR_SECRET_ACCESS_KEY
region=YOUR_REGION (such as us-west-2, us-west-1, etc)
FIRST OF ALL...
Install Chalice and create a new project:
$ pip install chalice
$ chalice new-project demo && cd demo
IN TERMINAL:
$ chalice deploy
Creating role: demo-dev
Creating deployment package.
Creating lambda function: demo-dev
Initiating first time deployment.
Deploying to API Gateway stage: api
https://[…].amazonaws.com/api/
HELLO WORLD!
APP.PY:
from chalice import Chalice
app = Chalice(app_name='demo')
@app.route('/')
def index():
return {'hello': 'world'}
CONFIG.JSON:
{
"version": "1.0",
"app_name": "demo",
"stages": {
"dev": {
"api_gateway_stage": "api"
}
}
}
HELLO WORLD!
$ http GET https://[…].amazonaws.com/api/
{
“hello”: “world”
}
Note:
HTTPie is pretty cool
https://httpie.org/
HELLO WORLD!
Also:
- Environment variables
- Logs
- Versoning
- Memory resources
- Timeouts
x
°
+
+
DATA COLLECTION
import logging
import requests
# We schedule this function to run every minute
@app.schedule('rate(1 minute)')
def log_ip(event):
r = requests.get('http://httpbin.org/ip').json()
# We will use the logs, but we could save in a DB.
app.log.debug("IP: " + r['origin'])
return {"status": "ok"}
DATA COLLECTION
EASTER EGG: Ask me about my capstone project!
DATA COLLECTION
x
°
+
+
DEPLOYING A MODEL
GENERATING PREDICTIONS
Let’s say we trained a ML model offline:
sklearn’s digits dataset:
1797 8x8 handwritten digits
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
import pickle
digits = datasets.load_digits()
X, y = digits.data, digits.target
X_train, X_test, y_train, y_test = train_test_split(X,y)
model = LogisticRegression()
model.fit(X_train, y_train)
pickle.dump(model, open("model.pkl", 'wb'))
USING SKLEARN WITH LAMBDA
Remember the Scikit-learn package size?
Docker to the rescue!
- Option 1: Create a container in ECS and send requests
from Lambda (recommended):
See: https://hub.docker.com/r/frolvlad/alpine-python-machinelearning/
- Option 2: Use Amazon Linux container to build a smaller
version of sklearn.
- See: https://github.com/ryansb/sklearn-build-lambda
WHAT IF MY MODEL IS TOO BIG?
We can put our model in a S3 bucket and load it in the function using boto3:
import boto3
import pickle
S3 = boto3.client('s3', region_name='us-east-1')
BUCKET = 'bucket-name'
KEY = 'key-name'
response = S3.get_object(Bucket=BUCKET, Key=KEY)
body_string = response['Body'].read()
model = pickle.loads(body_string)
x
°
+
+
RECAP
RECAP
- We learnt what AWS Lambda is and what it’s not.
- We used Lambda with sklearn to make predictions.
- We worked with pre-trained models on S3.
- We collected data using scheduled tasks.
GitHub: https://github.com/celerative/aws_lambda_for_data_science
x
}
+
*
°
THANK YOU!
44 Tehama St. / San Francisco / CA
us@celerative.com
celerative.com <<
x
42 nº 1389 / La Plata / Arg.
info@celerative.com
+54 (011) 527 56155
+juan.salas@celerative.com
jmsalas
APPENDIX: USING SKLEARN WITH CHALICE
1. Get the Amazon Linux Docker image:
$ docker pull amazonlinux:2016.09
2. Clone a script that will build sklearn for us:
$ git clone https://github.com/ryansb/sklearn-build-lambda
$ cd sklearn-build-lambda
3. Run the Amazon Linux container and use it to run the script:
$ docker run -v $(pwd):/outputs -it amazonlinux:2016.09 
/bin/bash /outputs/build.sh
4. Unzip the contents of the resulting venv.zip in the vendor directory.
APPENDIX: USING SKLEARN WITH CHALICE
Add to app.py:
import os
import ctypes
# Use ctypes to support C data types, required for sklearn and numpy
for d, _, files in os.walk('lib'):
for f in files:
if f.endswith('.a'):
continue
ctypes.cdll.LoadLibrary(os.path.join(d, f))
import sklearn
APPENDIX: USING SKLEARN WITH CHALICE
Now we have:
- requirements.txt: pip dependencies except sklearn
- chalicelib/model.pkl: our pre-trained model
- vendor/[venv.zip/*]: our compressed scikit-learn package
Everything we put in the chalicelib and vendor directories will be recursively
included in the deployment package (source code, json, binaries, etc.).
General rule: chalicelib for our own code and vendor for third-parties.
APPENDIX: USING SKLEARN WITH CHALICE

More Related Content

What's hot

How Ansible Makes Automation Easy
How Ansible Makes Automation EasyHow Ansible Makes Automation Easy
How Ansible Makes Automation EasyPeter Sankauskas
 
Automating aws infrastructure and code deployments using Ansible @WebEngage
Automating aws infrastructure and code deployments using Ansible @WebEngageAutomating aws infrastructure and code deployments using Ansible @WebEngage
Automating aws infrastructure and code deployments using Ansible @WebEngageVishal Uderani
 
(DEV201) AWS SDK For Go: Gophers Get Going with AWS
(DEV201) AWS SDK For Go: Gophers Get Going with AWS(DEV201) AWS SDK For Go: Gophers Get Going with AWS
(DEV201) AWS SDK For Go: Gophers Get Going with AWSAmazon Web Services
 
Terraform and cloud.ca
Terraform and cloud.caTerraform and cloud.ca
Terraform and cloud.caCloudOps2005
 
Setting up Kubernetes with tectonic
Setting up Kubernetes with tectonicSetting up Kubernetes with tectonic
Setting up Kubernetes with tectonicVishal Biyani
 
Building infrastructure as code using Terraform - DevOps Krakow
Building infrastructure as code using Terraform - DevOps KrakowBuilding infrastructure as code using Terraform - DevOps Krakow
Building infrastructure as code using Terraform - DevOps KrakowAnton Babenko
 
How to test infrastructure code: automated testing for Terraform, Kubernetes,...
How to test infrastructure code: automated testing for Terraform, Kubernetes,...How to test infrastructure code: automated testing for Terraform, Kubernetes,...
How to test infrastructure code: automated testing for Terraform, Kubernetes,...Yevgeniy Brikman
 
(APP202) Deploy, Manage, and Scale Your Apps with AWS OpsWorks and AWS Elasti...
(APP202) Deploy, Manage, and Scale Your Apps with AWS OpsWorks and AWS Elasti...(APP202) Deploy, Manage, and Scale Your Apps with AWS OpsWorks and AWS Elasti...
(APP202) Deploy, Manage, and Scale Your Apps with AWS OpsWorks and AWS Elasti...Amazon Web Services
 
Ansible 2.0 - How to use Ansible to automate your applications in AWS.
Ansible 2.0 - How to use Ansible to automate your applications in AWS.Ansible 2.0 - How to use Ansible to automate your applications in AWS.
Ansible 2.0 - How to use Ansible to automate your applications in AWS.Idan Tohami
 
Scaling Django for X Factor - DJUGL Oct 2012
Scaling Django for X Factor - DJUGL Oct 2012Scaling Django for X Factor - DJUGL Oct 2012
Scaling Django for X Factor - DJUGL Oct 2012Malcolm Box
 
Phoenix for Rails Devs
Phoenix for Rails DevsPhoenix for Rails Devs
Phoenix for Rails DevsDiacode
 
Ansible on aws - Pop-up Loft Tel Aviv
Ansible on aws - Pop-up Loft Tel AvivAnsible on aws - Pop-up Loft Tel Aviv
Ansible on aws - Pop-up Loft Tel AvivAmazon Web Services
 
Docker worshop @Twitter - How to use your own private registry
Docker worshop @Twitter - How to use your own private registryDocker worshop @Twitter - How to use your own private registry
Docker worshop @Twitter - How to use your own private registrydotCloud
 
Hosting a Rails App
Hosting a Rails AppHosting a Rails App
Hosting a Rails AppJosh Schramm
 
DevOps for Humans - Ansible for Drupal Deployment Victory!
DevOps for Humans - Ansible for Drupal Deployment Victory!DevOps for Humans - Ansible for Drupal Deployment Victory!
DevOps for Humans - Ansible for Drupal Deployment Victory!Jeff Geerling
 
Testing Ansible with Jenkins and Docker
Testing Ansible with Jenkins and DockerTesting Ansible with Jenkins and Docker
Testing Ansible with Jenkins and DockerDennis Rowe
 

What's hot (19)

How Ansible Makes Automation Easy
How Ansible Makes Automation EasyHow Ansible Makes Automation Easy
How Ansible Makes Automation Easy
 
Automating aws infrastructure and code deployments using Ansible @WebEngage
Automating aws infrastructure and code deployments using Ansible @WebEngageAutomating aws infrastructure and code deployments using Ansible @WebEngage
Automating aws infrastructure and code deployments using Ansible @WebEngage
 
Zabbix aws
Zabbix awsZabbix aws
Zabbix aws
 
(DEV201) AWS SDK For Go: Gophers Get Going with AWS
(DEV201) AWS SDK For Go: Gophers Get Going with AWS(DEV201) AWS SDK For Go: Gophers Get Going with AWS
(DEV201) AWS SDK For Go: Gophers Get Going with AWS
 
Terraform and cloud.ca
Terraform and cloud.caTerraform and cloud.ca
Terraform and cloud.ca
 
Setting up Kubernetes with tectonic
Setting up Kubernetes with tectonicSetting up Kubernetes with tectonic
Setting up Kubernetes with tectonic
 
Building infrastructure as code using Terraform - DevOps Krakow
Building infrastructure as code using Terraform - DevOps KrakowBuilding infrastructure as code using Terraform - DevOps Krakow
Building infrastructure as code using Terraform - DevOps Krakow
 
How to test infrastructure code: automated testing for Terraform, Kubernetes,...
How to test infrastructure code: automated testing for Terraform, Kubernetes,...How to test infrastructure code: automated testing for Terraform, Kubernetes,...
How to test infrastructure code: automated testing for Terraform, Kubernetes,...
 
(APP202) Deploy, Manage, and Scale Your Apps with AWS OpsWorks and AWS Elasti...
(APP202) Deploy, Manage, and Scale Your Apps with AWS OpsWorks and AWS Elasti...(APP202) Deploy, Manage, and Scale Your Apps with AWS OpsWorks and AWS Elasti...
(APP202) Deploy, Manage, and Scale Your Apps with AWS OpsWorks and AWS Elasti...
 
Ansible 2.0 - How to use Ansible to automate your applications in AWS.
Ansible 2.0 - How to use Ansible to automate your applications in AWS.Ansible 2.0 - How to use Ansible to automate your applications in AWS.
Ansible 2.0 - How to use Ansible to automate your applications in AWS.
 
Scaling Django for X Factor - DJUGL Oct 2012
Scaling Django for X Factor - DJUGL Oct 2012Scaling Django for X Factor - DJUGL Oct 2012
Scaling Django for X Factor - DJUGL Oct 2012
 
Phoenix for Rails Devs
Phoenix for Rails DevsPhoenix for Rails Devs
Phoenix for Rails Devs
 
Ansible on aws - Pop-up Loft Tel Aviv
Ansible on aws - Pop-up Loft Tel AvivAnsible on aws - Pop-up Loft Tel Aviv
Ansible on aws - Pop-up Loft Tel Aviv
 
Ansible Case Studies
Ansible Case StudiesAnsible Case Studies
Ansible Case Studies
 
Docker worshop @Twitter - How to use your own private registry
Docker worshop @Twitter - How to use your own private registryDocker worshop @Twitter - How to use your own private registry
Docker worshop @Twitter - How to use your own private registry
 
Hosting a Rails App
Hosting a Rails AppHosting a Rails App
Hosting a Rails App
 
Cyansible
CyansibleCyansible
Cyansible
 
DevOps for Humans - Ansible for Drupal Deployment Victory!
DevOps for Humans - Ansible for Drupal Deployment Victory!DevOps for Humans - Ansible for Drupal Deployment Victory!
DevOps for Humans - Ansible for Drupal Deployment Victory!
 
Testing Ansible with Jenkins and Docker
Testing Ansible with Jenkins and DockerTesting Ansible with Jenkins and Docker
Testing Ansible with Jenkins and Docker
 

Similar to AWS Lambda for Data Science @Celerative

Squeezing Machine Learning into Serverless for Image Recognition - AWS Meetup...
Squeezing Machine Learning into Serverless for Image Recognition - AWS Meetup...Squeezing Machine Learning into Serverless for Image Recognition - AWS Meetup...
Squeezing Machine Learning into Serverless for Image Recognition - AWS Meetup...Chris Shenton
 
Serverless cat detector workshop - cloudyna 2017 (16.12.2017)
Serverless cat detector   workshop - cloudyna 2017 (16.12.2017)Serverless cat detector   workshop - cloudyna 2017 (16.12.2017)
Serverless cat detector workshop - cloudyna 2017 (16.12.2017)Paweł Pikuła
 
Richard Cole of Amazon Gives Lightning Tallk at BigDataCamp
Richard Cole of Amazon Gives Lightning Tallk at BigDataCampRichard Cole of Amazon Gives Lightning Tallk at BigDataCamp
Richard Cole of Amazon Gives Lightning Tallk at BigDataCampBigDataCamp
 
Python in the serverless era (PyCon 2017)
Python in the serverless era (PyCon 2017)Python in the serverless era (PyCon 2017)
Python in the serverless era (PyCon 2017)Benny Bauer
 
Amazon Web Services for PHP Developers
Amazon Web Services for PHP DevelopersAmazon Web Services for PHP Developers
Amazon Web Services for PHP DevelopersJeremy Lindblom
 
Serverless Framework Workshop - Tyler Hendrickson, Chicago/burbs
 Serverless Framework Workshop - Tyler Hendrickson, Chicago/burbs Serverless Framework Workshop - Tyler Hendrickson, Chicago/burbs
Serverless Framework Workshop - Tyler Hendrickson, Chicago/burbsAWS Chicago
 
Just one-shade-of-openstack
Just one-shade-of-openstackJust one-shade-of-openstack
Just one-shade-of-openstackRoberto Polli
 
Alex Casalboni - Configuration management and service discovery - Codemotion ...
Alex Casalboni - Configuration management and service discovery - Codemotion ...Alex Casalboni - Configuration management and service discovery - Codemotion ...
Alex Casalboni - Configuration management and service discovery - Codemotion ...Codemotion
 
DevOps Fest 2019. Alex Casalboni. Configuration management and service discov...
DevOps Fest 2019. Alex Casalboni. Configuration management and service discov...DevOps Fest 2019. Alex Casalboni. Configuration management and service discov...
DevOps Fest 2019. Alex Casalboni. Configuration management and service discov...DevOps_Fest
 
AWS Lambda with Serverless Framework and Java
AWS Lambda with Serverless Framework and JavaAWS Lambda with Serverless Framework and Java
AWS Lambda with Serverless Framework and JavaManish Pandit
 
Serverless in production, an experience report (IWOMM)
Serverless in production, an experience report (IWOMM)Serverless in production, an experience report (IWOMM)
Serverless in production, an experience report (IWOMM)Yan Cui
 
10 things I’ve learnt In the clouds
10 things I’ve learnt In the clouds10 things I’ve learnt In the clouds
10 things I’ve learnt In the cloudsStuart Lodge
 
An Automated Laser Pointer for Your Dog : Aws IoT & Lambda
An Automated Laser Pointer for Your Dog : Aws IoT & Lambda An Automated Laser Pointer for Your Dog : Aws IoT & Lambda
An Automated Laser Pointer for Your Dog : Aws IoT & Lambda Lisa Kester
 
Introduction to Apache CloudStack by David Nalley
Introduction to Apache CloudStack by David NalleyIntroduction to Apache CloudStack by David Nalley
Introduction to Apache CloudStack by David Nalleybuildacloud
 
Serverless in production, an experience report (linuxing in london)
Serverless in production, an experience report (linuxing in london)Serverless in production, an experience report (linuxing in london)
Serverless in production, an experience report (linuxing in london)Yan Cui
 
Serverless in production, an experience report (Going Serverless)
Serverless in production, an experience report (Going Serverless)Serverless in production, an experience report (Going Serverless)
Serverless in production, an experience report (Going Serverless)Yan Cui
 
Puppetpreso
PuppetpresoPuppetpreso
Puppetpresoke4qqq
 
Cloud Computing in PHP With the Amazon Web Services
Cloud Computing in PHP With the Amazon Web ServicesCloud Computing in PHP With the Amazon Web Services
Cloud Computing in PHP With the Amazon Web ServicesAmazon Web Services
 
BDX 2015 - Scaling out big-data computation & machine learning using Pig, Pyt...
BDX 2015 - Scaling out big-data computation & machine learning using Pig, Pyt...BDX 2015 - Scaling out big-data computation & machine learning using Pig, Pyt...
BDX 2015 - Scaling out big-data computation & machine learning using Pig, Pyt...Ron Reiter
 

Similar to AWS Lambda for Data Science @Celerative (20)

Squeezing Machine Learning into Serverless for Image Recognition - AWS Meetup...
Squeezing Machine Learning into Serverless for Image Recognition - AWS Meetup...Squeezing Machine Learning into Serverless for Image Recognition - AWS Meetup...
Squeezing Machine Learning into Serverless for Image Recognition - AWS Meetup...
 
Serverless cat detector workshop - cloudyna 2017 (16.12.2017)
Serverless cat detector   workshop - cloudyna 2017 (16.12.2017)Serverless cat detector   workshop - cloudyna 2017 (16.12.2017)
Serverless cat detector workshop - cloudyna 2017 (16.12.2017)
 
Richard Cole of Amazon Gives Lightning Tallk at BigDataCamp
Richard Cole of Amazon Gives Lightning Tallk at BigDataCampRichard Cole of Amazon Gives Lightning Tallk at BigDataCamp
Richard Cole of Amazon Gives Lightning Tallk at BigDataCamp
 
AWS Serverless Workshop
AWS Serverless WorkshopAWS Serverless Workshop
AWS Serverless Workshop
 
Python in the serverless era (PyCon 2017)
Python in the serverless era (PyCon 2017)Python in the serverless era (PyCon 2017)
Python in the serverless era (PyCon 2017)
 
Amazon Web Services for PHP Developers
Amazon Web Services for PHP DevelopersAmazon Web Services for PHP Developers
Amazon Web Services for PHP Developers
 
Serverless Framework Workshop - Tyler Hendrickson, Chicago/burbs
 Serverless Framework Workshop - Tyler Hendrickson, Chicago/burbs Serverless Framework Workshop - Tyler Hendrickson, Chicago/burbs
Serverless Framework Workshop - Tyler Hendrickson, Chicago/burbs
 
Just one-shade-of-openstack
Just one-shade-of-openstackJust one-shade-of-openstack
Just one-shade-of-openstack
 
Alex Casalboni - Configuration management and service discovery - Codemotion ...
Alex Casalboni - Configuration management and service discovery - Codemotion ...Alex Casalboni - Configuration management and service discovery - Codemotion ...
Alex Casalboni - Configuration management and service discovery - Codemotion ...
 
DevOps Fest 2019. Alex Casalboni. Configuration management and service discov...
DevOps Fest 2019. Alex Casalboni. Configuration management and service discov...DevOps Fest 2019. Alex Casalboni. Configuration management and service discov...
DevOps Fest 2019. Alex Casalboni. Configuration management and service discov...
 
AWS Lambda with Serverless Framework and Java
AWS Lambda with Serverless Framework and JavaAWS Lambda with Serverless Framework and Java
AWS Lambda with Serverless Framework and Java
 
Serverless in production, an experience report (IWOMM)
Serverless in production, an experience report (IWOMM)Serverless in production, an experience report (IWOMM)
Serverless in production, an experience report (IWOMM)
 
10 things I’ve learnt In the clouds
10 things I’ve learnt In the clouds10 things I’ve learnt In the clouds
10 things I’ve learnt In the clouds
 
An Automated Laser Pointer for Your Dog : Aws IoT & Lambda
An Automated Laser Pointer for Your Dog : Aws IoT & Lambda An Automated Laser Pointer for Your Dog : Aws IoT & Lambda
An Automated Laser Pointer for Your Dog : Aws IoT & Lambda
 
Introduction to Apache CloudStack by David Nalley
Introduction to Apache CloudStack by David NalleyIntroduction to Apache CloudStack by David Nalley
Introduction to Apache CloudStack by David Nalley
 
Serverless in production, an experience report (linuxing in london)
Serverless in production, an experience report (linuxing in london)Serverless in production, an experience report (linuxing in london)
Serverless in production, an experience report (linuxing in london)
 
Serverless in production, an experience report (Going Serverless)
Serverless in production, an experience report (Going Serverless)Serverless in production, an experience report (Going Serverless)
Serverless in production, an experience report (Going Serverless)
 
Puppetpreso
PuppetpresoPuppetpreso
Puppetpreso
 
Cloud Computing in PHP With the Amazon Web Services
Cloud Computing in PHP With the Amazon Web ServicesCloud Computing in PHP With the Amazon Web Services
Cloud Computing in PHP With the Amazon Web Services
 
BDX 2015 - Scaling out big-data computation & machine learning using Pig, Pyt...
BDX 2015 - Scaling out big-data computation & machine learning using Pig, Pyt...BDX 2015 - Scaling out big-data computation & machine learning using Pig, Pyt...
BDX 2015 - Scaling out big-data computation & machine learning using Pig, Pyt...
 

Recently uploaded

Hyderabad Call Girls Khairatabad ✨ 7001305949 ✨ Cheap Price Your Budget
Hyderabad Call Girls Khairatabad ✨ 7001305949 ✨ Cheap Price Your BudgetHyderabad Call Girls Khairatabad ✨ 7001305949 ✨ Cheap Price Your Budget
Hyderabad Call Girls Khairatabad ✨ 7001305949 ✨ Cheap Price Your BudgetEnjoy Anytime
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsMemoori
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Patryk Bandurski
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 3652toLead Limited
 
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphSIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphNeo4j
 
Next-generation AAM aircraft unveiled by Supernal, S-A2
Next-generation AAM aircraft unveiled by Supernal, S-A2Next-generation AAM aircraft unveiled by Supernal, S-A2
Next-generation AAM aircraft unveiled by Supernal, S-A2Hyundai Motor Group
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):comworks
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationSafe Software
 
How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?XfilesPro
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersThousandEyes
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxOnBoard
 
Key Features Of Token Development (1).pptx
Key  Features Of Token  Development (1).pptxKey  Features Of Token  Development (1).pptx
Key Features Of Token Development (1).pptxLBM Solutions
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonetsnaman860154
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxMalak Abu Hammad
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
 
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...HostedbyConfluent
 

Recently uploaded (20)

Hyderabad Call Girls Khairatabad ✨ 7001305949 ✨ Cheap Price Your Budget
Hyderabad Call Girls Khairatabad ✨ 7001305949 ✨ Cheap Price Your BudgetHyderabad Call Girls Khairatabad ✨ 7001305949 ✨ Cheap Price Your Budget
Hyderabad Call Girls Khairatabad ✨ 7001305949 ✨ Cheap Price Your Budget
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial Buildings
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
 
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphSIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
 
Next-generation AAM aircraft unveiled by Supernal, S-A2
Next-generation AAM aircraft unveiled by Supernal, S-A2Next-generation AAM aircraft unveiled by Supernal, S-A2
Next-generation AAM aircraft unveiled by Supernal, S-A2
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
 
How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
 
The transition to renewables in India.pdf
The transition to renewables in India.pdfThe transition to renewables in India.pdf
The transition to renewables in India.pdf
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptx
 
Key Features Of Token Development (1).pptx
Key  Features Of Token  Development (1).pptxKey  Features Of Token  Development (1).pptx
Key Features Of Token Development (1).pptx
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
 

AWS Lambda for Data Science @Celerative

  • 1. Juan Salas // Celerative CTO TODAY AWS LAMBDA ° + x * FOR DATA SCIENCE
  • 2. OUR GOAL FOR TODAY - What is Amazon Lambda? - Why is it useful for data scientists? - How can we get started? (Tutorial) DISCLAIMER: We’ll prioritize practical usage and intuition (let’s make it fun!)
  • 4. ° + x * WHAT IS AWS LAMBDA? Stateless compute service for event-driven microservices
  • 5. HOW DOES AWS LAMBDA WORK? SUPPORTS:
  • 9. IT’S PRETTY COOL FOR DATA SCIENTISTS - Supports Python. - No infrastructure to manage, takes care of: - Scaling, monitoring, deployments, sys updates, etc. - High availability and scalability by default - Easy parallelization - Fine grained pricing. Don’t pay for idle!
  • 10. THERE ARE SOME DRAWBACKS - You give up some control (e.g. GPU, threading, filesystem, etc.). - Dependencies not available with pip can be tricky. - Package size constraints: - Scikit-learn depends on numpy and scipy, which in turn require C and Fortran (!!!) libraries. The package is too big. (We’ll see how to overcome this…)
  • 12. TAKING LAMBDA FOR A RIDE We will: 1. Deploy a basic API. 2. Collect some data from the Internet. 3. Use sklearn to make predictions.
  • 13. CHALICE FRAMEWORK Python Serverless Microframework for AWS. Makes our life easier: - CLI for deployment and management - Easy interface to create APIs - Automatic IAM policy generation - Easy to maintain code structure More info: - https://github.com/aws/chalice - http://chalice.readthedocs.io/en/latest/ Also check out: - Zappa (Flask) - Serverless (Node.js)
  • 14. OPTIONAL: SETTING UP CREDENTIALS Set up your AWS credentials (skip if you already installed AWS cli or used boto): $ mkdir ~/.aws $ cat >> ~/.aws/config [default] aws_access_key_id=YOUR_ACCESS_KEY_HERE aws_secret_access_key=YOUR_SECRET_ACCESS_KEY region=YOUR_REGION (such as us-west-2, us-west-1, etc)
  • 15. FIRST OF ALL... Install Chalice and create a new project: $ pip install chalice $ chalice new-project demo && cd demo
  • 16. IN TERMINAL: $ chalice deploy Creating role: demo-dev Creating deployment package. Creating lambda function: demo-dev Initiating first time deployment. Deploying to API Gateway stage: api https://[…].amazonaws.com/api/ HELLO WORLD! APP.PY: from chalice import Chalice app = Chalice(app_name='demo') @app.route('/') def index(): return {'hello': 'world'} CONFIG.JSON: { "version": "1.0", "app_name": "demo", "stages": { "dev": { "api_gateway_stage": "api" } } }
  • 17. HELLO WORLD! $ http GET https://[…].amazonaws.com/api/ { “hello”: “world” } Note: HTTPie is pretty cool https://httpie.org/
  • 18. HELLO WORLD! Also: - Environment variables - Logs - Versoning - Memory resources - Timeouts
  • 20. import logging import requests # We schedule this function to run every minute @app.schedule('rate(1 minute)') def log_ip(event): r = requests.get('http://httpbin.org/ip').json() # We will use the logs, but we could save in a DB. app.log.debug("IP: " + r['origin']) return {"status": "ok"} DATA COLLECTION EASTER EGG: Ask me about my capstone project!
  • 23. GENERATING PREDICTIONS Let’s say we trained a ML model offline: sklearn’s digits dataset: 1797 8x8 handwritten digits from sklearn import datasets from sklearn.model_selection import train_test_split from sklearn.linear_model import LogisticRegression import pickle digits = datasets.load_digits() X, y = digits.data, digits.target X_train, X_test, y_train, y_test = train_test_split(X,y) model = LogisticRegression() model.fit(X_train, y_train) pickle.dump(model, open("model.pkl", 'wb'))
  • 24. USING SKLEARN WITH LAMBDA Remember the Scikit-learn package size? Docker to the rescue! - Option 1: Create a container in ECS and send requests from Lambda (recommended): See: https://hub.docker.com/r/frolvlad/alpine-python-machinelearning/ - Option 2: Use Amazon Linux container to build a smaller version of sklearn. - See: https://github.com/ryansb/sklearn-build-lambda
  • 25. WHAT IF MY MODEL IS TOO BIG? We can put our model in a S3 bucket and load it in the function using boto3: import boto3 import pickle S3 = boto3.client('s3', region_name='us-east-1') BUCKET = 'bucket-name' KEY = 'key-name' response = S3.get_object(Bucket=BUCKET, Key=KEY) body_string = response['Body'].read() model = pickle.loads(body_string)
  • 27. RECAP - We learnt what AWS Lambda is and what it’s not. - We used Lambda with sklearn to make predictions. - We worked with pre-trained models on S3. - We collected data using scheduled tasks. GitHub: https://github.com/celerative/aws_lambda_for_data_science
  • 28. x } + * ° THANK YOU! 44 Tehama St. / San Francisco / CA us@celerative.com celerative.com << x 42 nº 1389 / La Plata / Arg. info@celerative.com +54 (011) 527 56155 +juan.salas@celerative.com jmsalas
  • 29. APPENDIX: USING SKLEARN WITH CHALICE 1. Get the Amazon Linux Docker image: $ docker pull amazonlinux:2016.09 2. Clone a script that will build sklearn for us: $ git clone https://github.com/ryansb/sklearn-build-lambda $ cd sklearn-build-lambda 3. Run the Amazon Linux container and use it to run the script: $ docker run -v $(pwd):/outputs -it amazonlinux:2016.09 /bin/bash /outputs/build.sh 4. Unzip the contents of the resulting venv.zip in the vendor directory.
  • 30. APPENDIX: USING SKLEARN WITH CHALICE Add to app.py: import os import ctypes # Use ctypes to support C data types, required for sklearn and numpy for d, _, files in os.walk('lib'): for f in files: if f.endswith('.a'): continue ctypes.cdll.LoadLibrary(os.path.join(d, f)) import sklearn
  • 31. APPENDIX: USING SKLEARN WITH CHALICE Now we have: - requirements.txt: pip dependencies except sklearn - chalicelib/model.pkl: our pre-trained model - vendor/[venv.zip/*]: our compressed scikit-learn package Everything we put in the chalicelib and vendor directories will be recursively included in the deployment package (source code, json, binaries, etc.). General rule: chalicelib for our own code and vendor for third-parties.
  • 32. APPENDIX: USING SKLEARN WITH CHALICE