2. who am I?
Osama Khan
Big Data Engineer @ACLServices
Grad Student @GTComputing
AWS Big Data Specialist+
Vancouver, BC
Java, C# (via J#); Python
Golang, NodeJS; Scala
Previously: Robot Soccer, Credit Rating, AML, O&G
Portfolio, NLP/Governance, Doctor Triage, Energy
Monitoring, Consulting, Private Equity
Recently: Data/ML Pipeline, Tools & Platforms
3. what are we going to talk about?
The goal of this talk is to provide a high-level overview
of the data landscape, introduce AWS, and run
through an exercise of containerizing an ML service.
1) Data Landscape (v2018): Changing ecosystem
and new roles
2) Just Enough AWS: AWS Intro, EC2, et al.
3) Workshop: RESTful ML Service
4) Demos: Athena, SageMaker, QuickSight, ModelDB,
Heroku ML API, Docker ML API
9. what’s changing(-ed)?
1. Cloud (FaaS, serverless data pipelines, ML-as-a-service)
2. Consumer demand for ML features/products/applications
3. Targeted Models (we need to manage 20MM models for 10MM users maybe)
4. Localization (ASEAN facial recognition)
5. Security (Adversarial ML, side-channel attacks)
6. Transparency (Bias is a BUG)
7. Many sophisticated solutions are still toys, while conventional, simpler techniques (regression)
still deliver more business value!
8. Monitoring to ensure deployed models are making high quality predictions
9. Need practices to maintain (update or rebuild) models over time
10. and ….
12. model: monitoring & maintenance
- What models are being deployed? [Model Inventory]
- Are we seeing deviations from expected performance? [Model Output monitoring]
- What are the reasons for performance degradation? [Data monitoring]
- Take action on out-of-the-ordinary situations
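
The checklist above could be sketched in a few lines of Python. This is a minimal illustration, not a tool shown in the talk: `ModelRecord`, `drift_z_score`, `check_model`, the `churn` model, the sample scores, and the threshold of 3 standard errors are all hypothetical. It keeps a small model inventory, compares recent prediction scores against a baseline captured at validation time, and returns an alert when the output drifts. The same comparison could be applied to input features for data monitoring.

```python
from dataclasses import dataclass
from statistics import mean, stdev


@dataclass
class ModelRecord:
    """One entry in a hypothetical model inventory."""
    name: str
    version: str
    baseline_scores: list  # prediction scores captured at validation time


def drift_z_score(baseline, recent):
    """Distance of the recent mean from the baseline mean, in standard errors."""
    standard_error = stdev(baseline) / (len(recent) ** 0.5)
    return abs(mean(recent) - mean(baseline)) / standard_error


def check_model(record, recent_scores, threshold=3.0):
    """Return an alert string when output drifts past the threshold, else None."""
    z = drift_z_score(record.baseline_scores, recent_scores)
    if z > threshold:
        return f"ALERT {record.name}:{record.version} output drift z={z:.1f}"
    return None


# Hypothetical inventory and score samples, for illustration only.
inventory = [
    ModelRecord("churn", "v1", [0.2, 0.3, 0.25, 0.35, 0.3, 0.2, 0.25, 0.3]),
]
stable = [0.25, 0.3, 0.28, 0.27]   # close to the baseline
shifted = [0.7, 0.8, 0.75, 0.9]    # clearly drifted

for record in inventory:
    print(check_model(record, stable))
    print(check_model(record, shifted))
```

In practice the alert would feed the "take action" step, for example paging an owner or triggering a retraining job. A mean-shift test is only one simple choice and will miss changes in the shape of the distribution that leave the mean intact.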