ARCHITECT YOUR SAAS
Oleksandr Mykhalchuk
WHOAMI
YEARS IN THE INDUSTRY: 13
YEARS IN DEVOPS: 7
YEARS AS ARCHITECT: 4
PROJECTS IN PROD: 12+
SAAS: 3 PROD, 2 DESIGN
EXPECTATIONS
MONEY
TIME
TECHNOLOGY IS NOT EVERYTHING
DON’T BE AFRAID TO FAIL
LEVEL SET: 300
[Figure: two level scales marked 100, 200, 300, 400, one labeled “AWS’s Expectations” and one labeled “Reality”]
WHAT TO EXPECT?
• General concepts
• Decisions & Tradeoffs
• Real Project
WHY SAAS?
• Business Needs
• Economies of Scale
• Operability
COMMON SAAS PATTERNS IN AWS
SILO BRIDGE POOL
WHAT IS EVERY SAAS BUILT OF?
APPLICATION
• Tenant Isolation
• Data Partitioning
• Identity & Access
OPERATIONS
• Management & Operations
• Profiling & Optimizing
• Billing & Metering
• Deployment & Integration
CHOOSING TENANT ISOLATION
• Segregation across tenants
• Application scalability across tenants
• Level of tenant-specific customizations
• Cost of Deployment
• Operations and management efforts
• Tenant metering and billing
TENANT ISOLATION IN AWS
AWS Account Layer
  PROS: Complete Isolation ++
  CONS: Economy of scale --; Managing accounts --; Onboarding / Scaling --
VPC Layer
  PROS: Economy of scale +; Billing (Tags) +
  CONS: VPC Limits -/--; Networking (VPN) -
VPC Subnet Layer
  PROS: Networking* +
  CONS: VPC Limits (NACL, CIDR, Routing, SG) ---
Container Layer
  PROS: Containers +; Resource utilization +
  CONS: Containers -; Custom billing -
Application Layer
  PROS: Economy of scale ++; Resource utilization ++; Simplified operations +
  CONS: Solution Architecture Design -; Security compliance +/-
Serverless
  PROS: Isolation +; Resource utilization +++; Operations ++
  CONS: Solution Architecture Design --
ISOLATION DECISIONS
[Diagram: starting points (Existing Enterprise App, Microservices-heavy App, New Product) mapped to isolation layers (Container Layer, Application Layer, Serverless Layer)]
FINDING YOUR TENANT MODEL
[Diagram labels: YOUR SAAS, POOL]
• SECURITY
• BUSINESS
DATA PARTITIONING
• SILO: Separate database per Tenant
• BRIDGE: Single database, Multiple schemas
• POOL: Shared database, Single schema
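The three partitioning models differ mainly in where a tenant's data lives and how a query is scoped to it. The sketch below is only an illustration of that mapping: the database name `saas`, the `db_`/`tenant_` prefixes, and the `tenant_id` column are hypothetical and do not come from the slides.

```python
from dataclasses import dataclass
from enum import Enum


class PartitionModel(Enum):
    SILO = "silo"      # separate database per tenant
    BRIDGE = "bridge"  # single database, one schema per tenant
    POOL = "pool"      # shared database, single schema


@dataclass(frozen=True)
class StorageTarget:
    database: str
    schema: str
    row_filter: dict  # predicate that every query for this tenant must carry


def resolve_storage(model: PartitionModel, tenant_id: str) -> StorageTarget:
    """Map a tenant to its storage location under the given model.

    All names produced here are hypothetical placeholders.
    """
    if model is PartitionModel.SILO:
        # Isolation comes from the database boundary itself.
        return StorageTarget(f"db_{tenant_id}", "public", {})
    if model is PartitionModel.BRIDGE:
        # Isolation comes from the schema boundary inside a shared database.
        return StorageTarget("saas", f"tenant_{tenant_id}", {})
    # POOL: isolation relies on filtering rows by tenant in every query.
    return StorageTarget("saas", "public", {"tenant_id": tenant_id})
```

In the pool case the `row_filter` is the only thing separating tenants, which is why that model shifts the isolation burden onto the application and its tests.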
DATA PARTITIONING TRADEOFFS
Silo Model | Bridge Model | Pool Model
Silo Model
Pros
• Compliance alignment
• No cross-tenant impacts
• Tenant-level tuning
• Tenant-level availability
Cons
• Compromises agility
• Centralized management
• Deployment complexity
• Cost
Pool Model
Pros
• Agility
• Cost optimization
• Centralized management
• Simplified deployment
Cons
• Cross-tenant impacts
• Compliance challenges
• All-or-nothing availability
DATA PARTITIONING STRATEGY
[Diagram labels: POOL, YOUR SAAS]
• SECURITY
• TECHNOLOGY
• BUSINESS
IDENTITY & ACCESS
• Tenant Access
• Tenant Provisions
• Security & Isolation
• Injecting Tenant Context
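One way to read "injecting tenant context" is that the tenant identity is taken from the already-verified identity token once, at the edge of the request, and every data access below that point reads it implicitly instead of trusting a caller-supplied parameter. The sketch below assumes the token claims have been verified elsewhere and contain a `tenant_id` claim; the claim name and the helper functions are hypothetical.

```python
import contextvars
from contextlib import contextmanager

# Holds the tenant of the request currently being processed.
_current_tenant = contextvars.ContextVar("current_tenant")


@contextmanager
def tenant_context(claims: dict):
    """Bind the tenant from verified token claims for the duration of a request."""
    tenant_id = claims.get("tenant_id")  # hypothetical claim name
    if not tenant_id:
        raise PermissionError("token carries no tenant_id claim")
    token = _current_tenant.set(tenant_id)
    try:
        yield tenant_id
    finally:
        _current_tenant.reset(token)


def current_tenant() -> str:
    """Return the bound tenant, refusing to run outside a tenant context."""
    try:
        return _current_tenant.get()
    except LookupError:
        raise PermissionError("no tenant context set") from None


def list_messages(store: list) -> list:
    """Example data access: the tenant filter comes from context, not arguments."""
    tenant_id = current_tenant()
    return [m for m in store if m["tenant_id"] == tenant_id]
```

Because `list_messages` has no tenant parameter, application code cannot accidentally ask for another tenant's rows; it can only fail closed when no context is bound.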
IDENTITY & ACCESS
[Diagram: On-Boarding a Tenant. Components shown: New Tenant On-Boarding, Domain Provisions, SSL Certificate, Identity Broker, Tenant Identity Provider, Tenant Management, Billing, Tenant IAM Policy]
THERE IS NO SILVER BULLET
• Outsource identity management
• Choose identity stores and protocols wisely
• Use identity brokers
• Keep user data to a minimum
• Avoid old or aging protocols (SAML 2.0)
• Automate role and policy provisioning
THE OPERATIONAL PART. BRIEFLY
MANAGEMENT & OPERATIONS
OBSERVATION
• Distributed system metrics
• In-app performance view
TESTING
• Tenant-onboarding
• Cross-tenant impact
• Tenant isolation tests
• Tier Boundary testing
MAINTENANCE
• Maintenance & Troubleshooting of Shared layers
PROFILING & OPTIMIZING
• Tenant Experience
• Tenant Policy
• Data Partitioning
• Load/Cost Optimization
PROFILING & OPTIMIZING
• Data & Metrics are vital
• Look at “busy” tenants first
• Identify general patterns, profiles and trends
• Flexible Data Distribution (Sharding Manager, 2 Layer Sharding)
• Centralized Tenant Policies Management strategy
• Service Granularity helps
• Data Analytics is your best ally
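The slides do not spell out what "Sharding Manager, 2 Layer Sharding" looked like. One possible reading, sketched here purely as an assumption, is a lookup in two layers: an explicit directory that pins "busy" tenants to chosen shards, with a stable hash as the default for everyone else. The class and shard names are hypothetical.

```python
import hashlib


class ShardingManager:
    """Two-layer tenant-to-shard lookup (an assumed design, not the author's)."""

    def __init__(self, shards: list):
        if not shards:
            raise ValueError("at least one shard is required")
        self._shards = list(shards)
        self._pinned = {}  # layer 1: explicit placements for busy tenants

    def pin(self, tenant_id: str, shard: str) -> None:
        """Move a tenant to a specific shard, e.g. after profiling shows a hot spot."""
        self._pinned[tenant_id] = shard

    def shard_for(self, tenant_id: str) -> str:
        # Layer 1: directory lookup wins when present.
        if tenant_id in self._pinned:
            return self._pinned[tenant_id]
        # Layer 2: stable hash, so the same tenant always lands on the same shard.
        digest = hashlib.sha256(tenant_id.encode("utf-8")).digest()
        index = int.from_bytes(digest[:8], "big") % len(self._shards)
        return self._shards[index]
```

Note that plain modulo hashing reshuffles most tenants when the shard list changes; a real system would need a migration plan or a consistent-hashing scheme for that case.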
BILLING & METERING
• Metrics Matter
• Isolation models define Cost-tracking strategy
• Knowing your Cost-per-Tenant early is crucial
• Flexible Tier models attached to Tenant Policies
• Managed Services make it simpler
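To make "Cost-per-Tenant" concrete: in a shared (pool or bridge) deployment the bill is not broken down by tenant, so it has to be derived from metered usage. A minimal sketch, assuming shared cost is split in proportion to a single usage metric and dedicated resources are billed directly; the function name and the allocation rule are illustrative assumptions, not something the slides prescribe.

```python
def cost_per_tenant(shared_cost: float, usage: dict, dedicated: dict = None) -> dict:
    """Split shared infrastructure cost by metered usage, plus dedicated cost.

    shared_cost: monthly cost of shared resources
    usage:       tenant -> metered units (requests, GB, CPU-seconds, ...)
    dedicated:   tenant -> cost of resources used only by that tenant
    """
    dedicated = dedicated or {}
    total_usage = sum(usage.values())
    if total_usage <= 0:
        raise ValueError("total metered usage must be positive")
    tenants = set(usage) | set(dedicated)
    return {
        t: shared_cost * usage.get(t, 0) / total_usage + dedicated.get(t, 0.0)
        for t in tenants
    }
```

For example, `cost_per_tenant(1000.0, {"a": 300, "b": 100}, {"b": 50.0})` attributes three quarters of the shared cost to tenant `a` and one quarter plus the dedicated 50 to tenant `b`.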
DEPLOYMENT & INTEGRATION
Impact of Multi-Region & Hybrid SaaS Deployment Models
• Tenant Onboarding
• Identity routing
• Monitoring & Billing
• Deployment automation & release strategy
• Network Impact
DEPLOYMENT & INTEGRATION
• Public Internet
• Private Link & VPC Endpoints
• Marketplace
• Third-Party Integrations
CHALLENGES THAT MANY OVERLOOK
• Data Migration
• Tenant Onboarding Automation
• Data Evolution Strategy
• Database Hot Spots
• Scaling Data Layer
PROJECT
A niche communication platform for the financial sector that provides secure messaging, bots and integration with other platforms.
• Each customer has its own “Silo” under an AWS Account or VPC
• Lots of OPS thinking instead of DevOps
• No automatic Scalability
• Costs
GOALS
Primary
• Automated Customer Onboarding
• Multi-tenant SaaS Platform
• Cost efficiency
• Availability
• Operation Efficiency
Secondary
• Evolved Microservice Architecture
• Decoupled Releases & Independent Component Deployment
• Focus on Managed Services
• Time-to-market
• 2 Days / 10 Minutes
• 10-1000 Tenants
• 2-5x Cost-per-Tenant
• 99.9-99.95%
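If the "99.9-99.95%" figure is read as the availability range for the platform (an assumption; the slide gives only the numbers), the difference is easier to judge as allowed downtime per year. The helper below is a small worked calculation assuming a 365-day year.

```python
HOURS_PER_YEAR = 365 * 24  # assumes a non-leap year


def allowed_downtime_hours(availability_percent: float) -> float:
    """Yearly downtime budget implied by an availability percentage."""
    if not 0 <= availability_percent <= 100:
        raise ValueError("availability must be between 0 and 100")
    return HOURS_PER_YEAR * (1 - availability_percent / 100)


for target in (99.9, 99.95):
    # Rounded for display; 99.9% allows about 8.76 h/year, 99.95% about 4.38 h/year.
    print(f"{target}% -> {round(allowed_downtime_hours(target), 2)} hours/year")
```

Moving from 99.9% to 99.95% halves the yearly downtime budget.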
CHALLENGES TRANSFORMING EXISTING SILO APP
• Re-architecting efforts vs Value
• Security pushback
• Fear of change
DECISIONS MADE
Microservices
• Container Layer Tenant Isolation with ECS
• SaaS Model BRIDGE w/ shared Service and Persistence Layers
Mongo
• DynamoDB w/ Shared Database (Table), Single Schema
MS SQL
• RDS PostgreSQL w/ Single Database(s), Multiple Schemas
Hadoop + Spark
• EMR
Overall
• Max out usage of Managed Services*
• Legacy (Solr, Cache): ASG + HealthChecks
COST SAVING ON CHANGING DB ENGINE
[Chart: Develop, Migrate, Operation; axis ticks 50, 100, 150, 200 and 1, 2, 3, 4, 5+]
8x db.r4.2xlarge, Reserved, 1 Y, All Upfront
• MS SQL Enterprise: 53200 USD/month
• PostgreSQL: 6300 USD/month
• Monthly savings: 46900 USD/month
NET Gain 3 Years: $730k+
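The figures above can be checked with a few lines of arithmetic. The monthly prices and the net gain come from the slide; the one-off cost is not stated anywhere and is only inferred here, under the simplifying assumption that the savings apply for all 36 months.

```python
MSSQL_MONTHLY_USD = 53_200     # from the slide
POSTGRES_MONTHLY_USD = 6_300   # from the slide
NET_GAIN_3Y_USD = 730_000      # slide says "$730k+", treated as a lower bound
MONTHS = 36

monthly_savings = MSSQL_MONTHLY_USD - POSTGRES_MONTHLY_USD
gross_savings_3y = monthly_savings * MONTHS

# Inference, not a figure from the talk: whatever separates gross savings
# from the stated net gain would be development and migration effort.
implied_one_off_cost = gross_savings_3y - NET_GAIN_3Y_USD

print(f"monthly savings:      {monthly_savings} USD")
print(f"gross savings, 3 y:   {gross_savings_3y} USD")
print(f"implied one-off cost: at most {implied_one_off_cost} USD")
```

The monthly difference matches the 46900 USD/month on the slide. Since savings only start once the migration is done, the real one-off cost was probably lower than this upper bound.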
TENANT ISOLATION IN ECS
"placementConstraints": [
  {
    "type": "memberOf",
    "expression": "attribute:tenant == TenantID"
  }
]
• Shared ECS Instances
• Dedicated Tenant ECS Instances (ECS Instance Attribute)
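A placement constraint like this is usually generated per tenant during onboarding rather than written by hand. The sketch below assumes each dedicated container instance carries a custom attribute named `tenant`, which the ECS cluster query language addresses as `attribute:tenant`; the attribute name, the helper, and the tenant ID format are hypothetical.

```python
import json
import re

# Hypothetical tenant ID format; restricting it keeps the expression well-formed.
_TENANT_ID = re.compile(r"[A-Za-z0-9_-]+")


def tenant_placement_constraints(tenant_id: str, attribute: str = "tenant") -> list:
    """Build a memberOf constraint that pins tasks to one tenant's instances."""
    if not _TENANT_ID.fullmatch(tenant_id):
        raise ValueError(f"unsupported tenant id: {tenant_id!r}")
    return [
        {
            "type": "memberOf",
            "expression": f"attribute:{attribute} == {tenant_id}",
        }
    ]


if __name__ == "__main__":
    # The resulting list has the shape of the placementConstraints field
    # of an ECS task definition.
    print(json.dumps(tenant_placement_constraints("acme"), indent=2))
```

Tenants on shared instances would simply omit the constraint, so the same task definition template can serve both placement options listed above.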
LESSONS LEARNED
• Rotate ECS instances weekly
• Automate Tenant Policy update process
• You should be able to “freeze” a separate microservice/stack version in deployment without affecting the rest
• SignalFx, CloudWatch and ELK are your best friends
• Scaling Persistence Layer with non-cloud-native components is fun
• Complex CloudFormation Stack Updates are even more fun
• Deleting CloudFormation Stacks in the active PROD is the ultimate fun
THINGS I WOULD HAVE DONE DIFFERENTLY NOW
• More Serverless
• More Global
Services shown: Lambda, Aurora Serverless, EKS, DynamoDB Global Tables
LAST WORDS
• Know your SaaS patterns
• Always start with the best model
• Make informed tradeoffs
• Data is your key to success
Q&A
AND YES, WE ARE HIRING!

"Architecting SaaS solutions on AWS", Oleksandr Mykhalchuk, AWS Dev Day Kyiv 2019
