See how Ian Whitestone (Data Scientist at Shopify) created Domi – #Toronto Apartment Finder app, using #Serverless Framework #Zappa for #Python on #AWS, #PostGIS, #Slack, and some #Regression Techniques: https://www.youtube.com/watch?v=JE_zEqe7M_8
http://ServerlessToronto.org thanks https://www.linkedin.com/company/trend-micro for catering, https://www.linkedin.com/company/myplanethq for hosting, and https://www.linkedin.com/company/manning-publications-co for book giveaways!
Find an apartment in Toronto with serverless Python
1. Thursday, Feb 27, 2020
1. Intro & Activity Update
2. Community Open Mic
3. Ian Whitestone: “Bootstrapping a data-driven application with Zappa (Serverless Python) to find an apartment in Toronto”
4. Networking
ServerlessToronto.org Meetup Agenda
2. Serverless is not just about the Tech:
Serverless is New Agile & Mindset
- Serverless Dev (gluing other people’s APIs and managed services)
- We're obsessed by creating business value (meaningful MVPs, products) and helping Startups
- We build bridges between Serverless Community (“Dev leg”), and Front-end & Voice-First folks (“UX leg”), and empower UX developers
- Achieve agility NOT by “sprinting” faster (like in Scrum), but by working smarter (by using bigger building blocks and less Ops)
4. Venue Sponsor
As a Certified B Corporation, Myplanet is purpose-driven and
creates benefit for all stakeholders, not just shareholders!
5. Knowledge Sponsor
Get your raffle tickets… and GOOD LUCK!
1. Go to www.manning.com
2. Select eBook or Video title you like
3. Add it to the shopping cart
4. The raffle winner sends me the email address used
5. Manning staff then move the title to your dashboard
6. Bonus Raffle from our friends
Get your raffle tickets… and GOOD LUCK!
2 tickets for “Full-Day on JAMstack Web Technology” paid event.
13. Upcoming 2020 #ServerlessTO Meetups
1. Intro to PySpark – Python Data Analysis at scale in the Cloud –
Jonathan Rioux, Lead Data Scientist at EPAM Systems & author of
PySpark in Action book ** MARCH 19 **
2. Introduction to Google BigQuery – Matt Welke, Software
Developer at GroupBy Inc
3. Solving your Business Problems with Serverless Architectures
– Panel discussion ** BACK BY POPULAR DEMAND **
4. Serverless with Pivotal Cloud Foundry – Adib Saikali, Principal
Platform Architect at VMware
5. Fivetran – Data Pipelines, Reinvented – Replicate your data into
the Cloud Warehouse of your choice
6. Your Own Presentation – PLEASE VOLUNTEER ☺
14. Community Open Mic
Your 10 sec. pitch ☺
- Looking for work?
- Offering work?
About You – because without you, there would be no meetups!
36. handler.py
import requests
import yaml

import main


def my_handler(event=None, context=None):
    """Kick off the desired function

    Parameters
    ----------
    event : dict, optional
        AWS Lambda uses this parameter to pass in event data to the handler
    context : LambdaContext, optional
        AWS Lambda uses this parameter to provide runtime information
        to your handler
    """
    main.do_stuff()  # and things
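The docstring above says the handler kicks off "the desired function". One way that could look, as a minimal sketch: the handler reads a key from the event and dispatches to a matching function. The `task` key, the `TASKS` registry, and the two task functions are hypothetical names used for illustration; they are not from the talk.

```python
def scrape_listings():
    """Hypothetical task; stands in for real work."""
    return "scraped"


def send_alerts():
    """Hypothetical task; stands in for real work."""
    return "alerted"


# Hypothetical registry mapping an event's "task" value to a callable.
TASKS = {"scrape": scrape_listings, "alert": send_alerts}


def my_handler(event=None, context=None):
    """Run the task named in the event; raise if it is unknown."""
    task_name = (event or {}).get("task")
    if task_name not in TASKS:
        raise ValueError(f"Unknown task: {task_name!r}")
    return TASKS[task_name]()


# Local smoke test: call the handler directly with a hand-built event.
print(my_handler({"task": "scrape"}))
```

Because the handler is a plain function, it can be exercised locally like this before anything is deployed.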
44.
# Give it full access to S3
→ aws iam attach-role-policy \
    --role-name lambda_basic_role \
    --policy-arn arn:aws:iam::aws:policy/AmazonS3FullAccess
# And CloudWatch (logs)
→ aws iam attach-role-policy \
    --role-name lambda_basic_role \
    --policy-arn arn:aws:iam::aws:policy/CloudWatchFullAccess
45. Step 3: Create Lambda Function
70. Easily view logs
# Show logs from specific timeframe
→ zappa tail dev --since 1m
# Show logs from specific timeframe and filter
→ zappa tail batch_secondary_us_east_1 --since 1d --filter "ERROR"
71. Invoke raw commands on Lambda for testing (avoid re-deploying)
→ zappa invoke dev "import psycopg2; print('hello')" --raw
Calling invoke for stage dev..
[START] RequestId: e35516da-b71d-4452-9896-e622fe263d1f Version: $LATEST
Instancing..
[DEBUG] 2019-09-22T20:20:09.25Z e622fe263d1f Zappa Event:
{'raw_command': "import psycopg2; print('hello')"}
hello
[END] RequestId: e35516da-b71d-4452-9896-e622fe263d1f
[REPORT] RequestId: e35516da-b71d-4452-9896-e622fe263d1f
Duration: 198.44 ms
Billed Duration: 200 ms
Memory Size: 512 MB
Max Memory Used: 84 MB
Init Duration: 525.29 ms
74. Many more features...
- Execute in response to other AWS events
- Easy rollbacks with zappa rollback prod -n 1
- Easy infra teardown with zappa undeploy prod
- Extensibility through custom callbacks (see example)
- And more... (here)
86. Run fast, powerful spatial queries
SELECT listings.*
FROM listings, user_regions
WHERE
    ST_Contains(user_regions.geom, listings.geom)
    AND bedrooms >= 1
    AND bathrooms >= 1
    AND ...
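To make the `ST_Contains` filter above concrete, here is a minimal pure-Python sketch of the same idea: keep only listings whose point falls inside a user's polygon. It uses a simple ray-casting test and is not how PostGIS implements the function (PostGIS also handles boundaries, projections, and spatial indexes). The coordinates and sample listings are made up for illustration.

```python
def point_in_polygon(point, polygon):
    """Return True if point (x, y) lies inside polygon (a list of (x, y) vertices).

    Ray casting: count how many polygon edges a horizontal ray from the
    point crosses. Points exactly on the boundary get no special handling.
    """
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal line through the point?
        if (y1 > y) != (y2 > y):
            # x-coordinate where the edge crosses that horizontal line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


# Made-up rectangular region as (longitude, latitude) pairs.
region = [(-79.42, 43.63), (-79.36, 43.63), (-79.36, 43.67), (-79.42, 43.67)]

# Made-up listings: one inside, one outside, one inside with too few bedrooms.
listings = [
    {"id": 1, "geom": (-79.39, 43.65), "bedrooms": 2},
    {"id": 2, "geom": (-79.30, 43.70), "bedrooms": 1},
    {"id": 3, "geom": (-79.40, 43.64), "bedrooms": 0},
]

matches = [
    listing["id"]
    for listing in listings
    if point_in_polygon(listing["geom"], region) and listing["bedrooms"] >= 1
]
print(matches)
```

Pushing this filter into the database, as the slide does, avoids pulling every listing into Python just to discard most of them.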
87.
from geoalchemy2 import Geometry
from sqlalchemy import Column, ForeignKey, Integer

# BASE is the project's SQLAlchemy declarative base, defined elsewhere


class Listing(BASE):
    __tablename__ = "listings"

    id = Column(Integer, primary_key=True)
    geom = Column(Geometry(geometry_type="POINT", srid=4326))
    bedrooms = Column(Integer)


class UserRegion(BASE):
    __tablename__ = "user_regions"

    id = Column(Integer, primary_key=True)
    user_id = Column(Integer, ForeignKey("users.id"))
    geom = Column(Geometry(geometry_type="POLYGON", srid=4326))
127. Enter Great Expectations
- Framework for writing tests for data
  - validate dataset against our "expectations"
- Run checks on datasets:
  - at beginning/end of pipelines
  - before feeding into machine learning model
  - periodically, on-schedule
github.com/great-expectations
128. Types of Expectations
expect_column_values_to_not_be_null
expect_column_values_to_match_regex
expect_column_values_to_be_unique
expect_column_values_to_match_strftime_format
expect_table_row_count_to_be_between
expect_column_median_to_be_between
...and many more
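To show what expectations like these check, here is a hand-rolled sketch of three of them as plain Python functions over a list of row dicts. These are stand-ins written for illustration, not the Great Expectations API: the function bodies, the result keys other than `success`, and the sample rows are all assumptions. In this sketch the regex check skips `None` values.

```python
import re


def expect_column_values_to_not_be_null(rows, column):
    """Succeed when no row has None in the given column."""
    unexpected = [row for row in rows if row.get(column) is None]
    return {"success": not unexpected, "unexpected_count": len(unexpected)}


def expect_column_values_to_match_regex(rows, column, regex):
    """Succeed when every non-null value in the column matches the regex."""
    values = [row[column] for row in rows if row.get(column) is not None]
    unexpected = [value for value in values if not re.search(regex, str(value))]
    return {"success": not unexpected, "unexpected_count": len(unexpected)}


def expect_table_row_count_to_be_between(rows, min_value, max_value):
    """Succeed when the number of rows falls in [min_value, max_value]."""
    count = len(rows)
    return {"success": min_value <= count <= max_value, "observed_value": count}


# Made-up sample rows for illustration.
rows = [
    {"id": 1, "postal_code": "M5V 2T6"},
    {"id": 2, "postal_code": None},
    {"id": 3, "postal_code": "90210"},
]

checks = {
    "id_not_null": expect_column_values_to_not_be_null(rows, "id"),
    "postal_not_null": expect_column_values_to_not_be_null(rows, "postal_code"),
    "postal_format": expect_column_values_to_match_regex(
        rows, "postal_code", r"^[A-Z]\d[A-Z] \d[A-Z]\d$"
    ),
    "row_count": expect_table_row_count_to_be_between(rows, 1, 10),
}
for name, result in checks.items():
    print(name, result)
```

Each check returns a small result dict rather than raising, so a pipeline can collect all failures in one pass before deciding what to do.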
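131. run_data_checks.py
from domi.db import DB_ENGINE
from great_expectations.dataset import SqlAlchemyDataset

# Assumption: "listings" fills the {tablename} placeholder for illustration
TABLENAME = "listings"

# Select the rows created yesterday
sql_query = """
    SELECT id
    FROM {tablename}
    WHERE TRUE
        AND DATE_TRUNC('day', created_at) = CURRENT_DATE - INTERVAL '1' DAY
""".format(tablename=TABLENAME)

# Validate those rows against the saved expectation suite
new_sql_dataset = SqlAlchemyDataset(custom_sql=sql_query, engine=DB_ENGINE)
validation_results = new_sql_dataset.validate(expectation_suite="expectations.json")
if validation_results["success"]:
    ...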
132.
133. Example: Model Monitoring with Distributional Expectations
135. Gotcha 1: Shared SESSION object
- Objects instantiated on import are shared across function invocations
  - they only get reset during a cold start
- If one transaction fails, all subsequent transactions in other invocations will start failing
from domi.handlers import process_new_listings
from domi.db import SESSION

# 👆 everything instantiated above here is shared across future function invocations


def lambda_handler(event, context):
    process_new_listings()
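The failure mode can be reproduced without AWS or a database. The sketch below simulates three warm invocations that share one module-level object. `FakeSession` is a toy stand-in invented for this example (not SQLAlchemy), as are the `statement` event key and the `invoke` helper.

```python
class FakeSession:
    """Toy stand-in for a database session; refuses work after a failure."""

    def __init__(self):
        self.failed = False

    def execute(self, statement):
        if self.failed:
            raise RuntimeError("session is in a failed state; rollback required")
        if statement == "BAD":
            self.failed = True
            raise ValueError("statement failed")
        return "ok"

    def rollback(self):
        self.failed = False


# Created once at import time, so every warm invocation reuses it.
SESSION = FakeSession()


def lambda_handler(event, context=None):
    return SESSION.execute(event["statement"])


def invoke(event):
    """Simulate one warm invocation and report the outcome as a string."""
    try:
        return lambda_handler(event)
    except Exception as exc:
        return f"error: {exc}"


# The second invocation fails, and the third fails too even though its
# own statement is fine, because the shared session was never rolled back.
results = [invoke({"statement": s}) for s in ["GOOD", "BAD", "GOOD"]]
print(results)

# Rolling back clears the failed state for later invocations.
SESSION.rollback()
print(invoke({"statement": "GOOD"}))
```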
136. Workaround
# Automatically ensure all transactions are successfully committed,
# or rolled back if not
def commit_session(_raise=True):
    if not SESSION:
        return
    try:
        SESSION.commit()
    except Exception:
        SESSION.rollback()
        if _raise:
            raise


def session_committer(func):
    def wrapper(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        finally:
            commit_session()

    return wrapper


# Use decorator on any function doing database transactions
@session_committer
def process_new_listings():
    ...
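To see the decorator's behaviour without a real database, the sketch below runs it against a toy session that records each call. `FakeSession` and its `fail_next_commit` switch are assumptions made for this example; `functools.wraps` is an optional addition that keeps the wrapped function's name and docstring.

```python
import functools


class FakeSession:
    """Toy stand-in for a database session that records what was called."""

    def __init__(self):
        self.calls = []
        self.fail_next_commit = False

    def commit(self):
        if self.fail_next_commit:
            self.fail_next_commit = False
            self.calls.append("commit failed")
            raise RuntimeError("commit failed")
        self.calls.append("commit")

    def rollback(self):
        self.calls.append("rollback")


SESSION = FakeSession()


def commit_session(_raise=True):
    """Commit the shared session; roll back (and optionally re-raise) on error."""
    if not SESSION:
        return
    try:
        SESSION.commit()
    except Exception:
        SESSION.rollback()
        if _raise:
            raise


def session_committer(func):
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        finally:
            # Runs whether func returned or raised.
            commit_session()

    return wrapper


@session_committer
def process_new_listings():
    return "processed"


# First call: the commit succeeds.
print(process_new_listings())

# Second call: the commit fails, so the session is rolled back and the error re-raised.
SESSION.fail_next_commit = True
try:
    process_new_listings()
except RuntimeError as exc:
    print("raised:", exc)

print(SESSION.calls)
```

The rollback in the failure path is what leaves the shared session usable for the next warm invocation.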
145. Deploying with GitHub Actions
ianwhitestone.work/AWS-Serverless-Deployments-With-Github-Actions
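146.
import statsmodels.api as sm
import statsmodels.formula.api as smf

# Assumption: `data` is the Engel dataset bundled with statsmodels (foodexp, income)
data = sm.datasets.engel.load_pandas().data
mod = smf.quantreg('foodexp ~ income', data)  # uses patsy model formulas
res = mod.fit(q=.5)  # q=.5 fits the conditional median
print(res.summary())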
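Quantile regression fits a chosen quantile by minimizing the quantile ("pinball") loss instead of squared error. The sketch below illustrates that loss on a constant prediction using only the standard library: with q=0.5 the best constant is the sample median, which a single extreme value barely moves. The sample values and helper names are made up for illustration, and this is not how statsmodels fits the model internally.

```python
def pinball_loss(y_true, y_pred, q):
    """Average quantile (pinball) loss of predictions at quantile q."""
    total = 0.0
    for y, p in zip(y_true, y_pred):
        diff = y - p
        # Under-predictions are weighted by q, over-predictions by (1 - q).
        total += q * diff if diff >= 0 else (q - 1) * diff
    return total / len(y_true)


def best_constant(y, q, candidates):
    """Pick the candidate constant prediction with the lowest pinball loss."""
    return min(candidates, key=lambda c: pinball_loss(y, [c] * len(y), q))


# Made-up sample with one extreme value.
y = [1, 2, 3, 4, 100]
candidates = range(0, 101)

median_fit = best_constant(y, 0.5, candidates)
upper_fit = best_constant(y, 0.9, candidates)
mean = sum(y) / len(y)
print(median_fit, upper_fit, mean)
```

Changing `q` moves the fit toward the lower or upper end of the distribution, which is the same knob as `q` in `mod.fit(q=.5)`.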