
A detailed look at why mabl Chose Google Cloud Platform (GCP) over AWS.



James Baldassari, a Full Stack Developer from Raytheon, DataXu, and now mabl, discussed why mabl chose GCP with members of the Ministry of Testing - Boston. 

This talk covers:
• What is a cloud service provider? Why use it? 
• What features do cloud services provide? 
• What were mabl's business needs?

Published in: Software


  1. A detailed look at why mabl chose Google Cloud Platform (GCP) over AWS
     James Baldassari, Engineer @mablhq
  2. A little bit of background
     ■ mabl started in early 2017, setting out to build an ML-driven end-to-end testing service
     ■ The early team had to decide on a cloud platform
       ▲ Early engineers had a lot of experience with AWS but nonetheless decided to evaluate both GCP and AWS
     ■ Despite the founders coming from Google, the team made an objective, hands-on evaluation of several key services across AWS and GCP. This presentation is a summary of the comparison.
  3. Platform Capabilities & Requirements
     Initial focus: front-end black-box testing
     ■ Execute tests in a real web browser
     ■ Capture test output: sources, screenshots, timing information, etc.
     ■ Perform analysis on test output, and apply ML techniques
     ■ Surface insights about applications, environments, and tests
     ■ Incorporate user feedback
     ■ React in near real time as tests are executing
     ■ All of the above must scale automatically as the business grows
  4. (Very) High Level Architecture
     ■ Modern SPA UI
     ■ HTTP REST APIs
     ■ Entity Database
     ■ Test Execution
     ■ ML & Analysis
     ■ Analytics Database
     ■ File Persistence
  5. Where should we build this in Q1 2017?
     AWS is no longer the only option available to developers
  6. Rough idea of services to compare

     | Product Category            | AWS                                                 | GCP                                        |
     |-----------------------------|-----------------------------------------------------|--------------------------------------------|
     | File Storage                | S3, Glacier                                         | Cloud Storage                              |
     | Pub/Sub Messaging           | Kinesis, SQS                                        | Pub/Sub                                    |
     | NoSQL Database              | DynamoDB                                            | Datastore, Bigtable                        |
     | Auto-scaling HTTP endpoints | Elastic Beanstalk                                   | App Engine                                 |
     | Container Services          | EC2 Container Service (ECS), EC2 Container Registry | Kubernetes Engine (GKE), Container Registry |
     | UI Asset Hosting            | S3, CloudFront                                      | Firebase                                   |

  7. Continued...

     | Product Category       | AWS                             | GCP                |
     |------------------------|---------------------------------|--------------------|
     | ETL & Analytics        | Spark on Elastic MapReduce (EMR) | Dataflow, Dataproc |
     | Analytics Database     | Redshift                        | BigQuery           |
     | Serverless             | Lambda                          | Cloud Functions    |
     | Machine Learning       | AWS Machine Learning            | ML Engine, Datalab |
     | Monitoring/Logging     | CloudWatch                      | Stackdriver        |
     | Infrastructure tooling | CloudFormation                  | Deployment Manager |
  8. Pub/Sub Messaging: AWS Kinesis vs. GCP Pub/Sub
     The two services are similar except in two areas:
     ■ Behavior of new subscriptions
       ▲ Kinesis: New subscribers can read messages published before the subscription (up to trim horizon)
       ▲ Pub/Sub: New subscribers can only read messages published after the subscription
     ■ Pub/Sub is easier to scale and has a better pricing model
       ▲ Kinesis: pre-allocate capacity (shards), and scale by triggering shard splits and merges
         ● ProvisionedThroughputExceeded error will ruin your day
       ▲ Pub/Sub: pay-by-throughput (requests * bytes) and scales automatically
     Advantage: GCP
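The difference in new-subscription behavior can be illustrated with a small in-memory model. This is a sketch only: `RetainedStream` and `FanOutTopic` are hypothetical toy classes, not the Kinesis or Pub/Sub client APIs, and the message names are invented for the example.

```python
from dataclasses import dataclass, field


@dataclass
class RetainedStream:
    """Toy Kinesis-style stream: published records stay in a retained log."""
    records: list = field(default_factory=list)

    def publish(self, message: str) -> None:
        self.records.append(message)

    def read_from_trim_horizon(self) -> list:
        # A reader that attaches late can still start at the oldest retained record.
        return list(self.records)


@dataclass
class FanOutTopic:
    """Toy Pub/Sub-style topic: each subscription has its own backlog."""
    subscriptions: dict = field(default_factory=dict)

    def subscribe(self, name: str) -> None:
        # The backlog starts empty, so earlier messages are never delivered.
        self.subscriptions[name] = []

    def publish(self, message: str) -> None:
        for backlog in self.subscriptions.values():
            backlog.append(message)

    def pull(self, name: str) -> list:
        # Return the pending messages and clear the backlog.
        backlog, self.subscriptions[name] = self.subscriptions[name], []
        return backlog


stream = RetainedStream()
topic = FanOutTopic()

# Two messages are published before anyone subscribes.
for msg in ("test-started", "screenshot-captured"):
    stream.publish(msg)
    topic.publish(msg)

# A consumer attaches late, then one more message is published.
topic.subscribe("late-analyzer")
stream.publish("test-finished")
topic.publish("test-finished")

late_stream_view = stream.read_from_trim_horizon()
late_topic_view = topic.pull("late-analyzer")
print(late_stream_view)
print(late_topic_view)
```

In this model the late stream reader sees all three messages, while the late topic subscriber sees only the one published after it subscribed.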
  9. NoSQL Database: AWS DynamoDB vs. GCP Datastore
     These services are similar, but Datastore has fixed some of the rough edges of DynamoDB. The biggest difference is:
     ■ Datastore is easier to scale and has a better pricing model
       ▲ DynamoDB requires pre-allocating throughput capacity up front and adjusting as needed
         ● ProvisionedThroughputExceeded error will ruin your day
         ● Note: since we performed our evaluation, DynamoDB has introduced some auto-scaling features
       ▲ Datastore is pay-by-request (plus data storage)
         ● It scales automatically to handle whatever you throw at it
     Advantage: GCP
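The operational difference between pre-allocated capacity and pay-by-request can be sketched with a toy model. Everything here is hypothetical and for illustration only: the two table classes, the per-second capacity counter, and the exception class (named after the error on the slide) are not the DynamoDB or Datastore APIs.

```python
class ProvisionedThroughputExceeded(Exception):
    """Hypothetical stand-in for the throttling error named on the slide."""


class ProvisionedTable:
    """Toy model: a fixed number of write units is pre-allocated per second."""

    def __init__(self, write_units_per_second: int) -> None:
        self.capacity = write_units_per_second
        self.used_this_second = 0

    def write(self, item: dict) -> None:
        if self.used_this_second >= self.capacity:
            raise ProvisionedThroughputExceeded("raise capacity or back off")
        self.used_this_second += 1


class PayPerRequestTable:
    """Toy model: every write is accepted and simply counted for billing."""

    def __init__(self) -> None:
        self.billed_requests = 0

    def write(self, item: dict) -> None:
        self.billed_requests += 1


def write_burst(table, count: int) -> int:
    """Attempt `count` writes within one second; return how many were accepted."""
    accepted = 0
    for i in range(count):
        try:
            table.write({"id": i})
            accepted += 1
        except ProvisionedThroughputExceeded:
            pass  # a real client would retry with backoff or raise capacity
    return accepted


provisioned = ProvisionedTable(write_units_per_second=5)
on_demand = PayPerRequestTable()

provisioned_accepted = write_burst(provisioned, 20)
on_demand_accepted = write_burst(on_demand, 20)
print(provisioned_accepted, on_demand_accepted, on_demand.billed_requests)
```

With a burst of 20 writes against 5 provisioned units, the provisioned table rejects the excess, which is the situation that forces capacity planning. The pay-by-request table accepts the whole burst and bills for it.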
  10. Serverless: AWS Lambda vs. GCP Cloud Functions
      ■ AWS Lambda
        ▲ Mature service (GA)
        ▲ Supports Node.js, Java, Python, C#
      ■ GCP Cloud Functions
        ▲ Still in beta
        ▲ Only supports Node.js
      Advantage: AWS
  11. Container Registry/Service: AWS ECS vs. GCP GKE
      ■ Google Kubernetes Engine (GKE)
        ▲ GCP-managed vanilla Kubernetes (open source) cluster
        ▲ Easy to migrate to any cloud platform where a Kubernetes cluster can run
      ■ EC2 Container Service (ECS)
        ▲ Docker-compatible container service leveraging AWS proprietary orchestration
        ▲ Note: AWS has very recently added a Kubernetes option
      Advantage: GCP
  12. Analytics: AWS EMR + Spark vs. GCP Dataflow
      ■ EMR is a hosted Hadoop cluster
      ■ Dataflow is a managed service based on Apache Beam
        ▲ Dataflow makes it simple to deploy pipelines and update them in place
        ▲ No need to manage a Hadoop cluster
      ■ Beam has a unified batch and streaming API
      ■ Beam pipelines can run in Flink, Dataflow, or even Spark
      Advantage: GCP
  13. Machine Learning: AWS ML vs. GCP ML Engine
      ■ AWS ML
        ▲ Limited ML service designed for one type of model: logistic regression
      ■ GCP ML Engine
        ▲ General ML solution which can be used with any type of ML model
        ▲ GCP also has Datalab, a version of Jupyter that integrates with GCP services
      ■ Overall, GCP's ML capabilities feel more mature and flexible than those of AWS
      Advantage: GCP
  14. Infrastructure and deployment automation
      ■ AWS
        ▲ Infrastructure: AWS has a mature product in CloudFormation for build/deployment of infrastructure and also supports third-party tools like Terraform
        ▲ Code: Use a combination of CloudFormation, CodeDeploy, AWS CLI, and third-party frameworks such as Ansible
      ■ GCP
        ▲ Infrastructure: GCP has Cloud Deployment Manager, which supports a fairly limited set of GCP services
        ▲ Code: Most software is deployed with GCP's CLI tools
      Advantage: AWS
  15. Testing Support
      ■ Testing locally, without having to run your code in the cloud, makes testing faster and easier
      ■ AWS has very limited support for local testing: only DynamoDB has an emulator you can run locally for unit tests
      ■ GCP has several services with local emulator support, including Datastore, App Engine, Dataflow, and even Cloud Functions
      Advantage: GCP
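As a sketch of how local emulator support is typically wired into tests: the Datastore emulator advertises its address through the `DATASTORE_EMULATOR_HOST` environment variable, which Google's client libraries check on their own. The helper below only illustrates that selection logic. The function name, the returned tuple, and the production endpoint constant are assumptions made for this example, not part of any library.

```python
import os

# Environment variable set when the local Datastore emulator is in use.
DATASTORE_EMULATOR_ENV = "DATASTORE_EMULATOR_HOST"

# Assumed production endpoint, included only to make the sketch concrete.
PRODUCTION_ENDPOINT = "datastore.googleapis.com:443"


def resolve_datastore_endpoint(environ=None):
    """Return (endpoint, use_tls) for the current environment.

    Hypothetical helper: prefers the local emulator when its variable is set,
    so unit tests never touch the real cloud service.
    """
    env = os.environ if environ is None else environ
    emulator_host = env.get(DATASTORE_EMULATOR_ENV, "").strip()
    if emulator_host:
        # Local emulators speak plain HTTP/gRPC without TLS.
        return emulator_host, False
    return PRODUCTION_ENDPOINT, True


# Unit-test style usage with explicit environments instead of the real one.
local = resolve_datastore_endpoint({DATASTORE_EMULATOR_ENV: "localhost:8081"})
cloud = resolve_datastore_endpoint({})
print(local)
print(cloud)
```

Passing the environment in as a dictionary keeps the helper easy to test without mutating `os.environ`.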
  16. The final tally
      ■ Other factors:
        ▲ Community support/answers: AWS
        ▲ Start-up credits: both platforms offer generous incentives
        ▲ Cost: difficult to compare between clouds due to different pricing models
          ● AWS has a free tier for most of its products
          ● GCP has a free trial period
      ■ That being said, we ultimately decided to go with GCP for 3 primary reasons...
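For reference, the head-to-head verdicts from the eight comparison slides can be tallied in a few lines. The category labels are shortened from the slide titles, and the tally leaves out the "other factors" listed on this slide.

```python
from collections import Counter

# Verdicts as stated on the "Advantage:" line of each comparison slide.
comparisons = {
    "Pub/Sub Messaging": "GCP",
    "NoSQL Database": "GCP",
    "Serverless": "AWS",
    "Container Registry/Service": "GCP",
    "Analytics": "GCP",
    "Machine Learning": "GCP",
    "Infrastructure and deployment automation": "AWS",
    "Testing Support": "GCP",
}

tally = Counter(comparisons.values())
for platform, wins in tally.most_common():
    print(f"{platform}: {wins}")
```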
  17. Deciding Factors: Hands-off Managed Services
      ■ Many of GCP's services require less hand-holding and monitoring than the AWS equivalents:
        ▲ Pub/Sub
        ▲ Datastore
        ▲ Dataflow
      ■ This allows us to focus more on Dev than Ops
  18. Deciding Factors: Portability vs. Vendor Lock-in
      ■ While vendor lock-in is inevitable to some extent with all cloud providers, more of GCP's services were built around portable interfaces/frameworks
      ■ We felt that we could get started with frameworks like Kubernetes and Beam and then fairly easily switch to AWS if necessary
  19. Deciding Factors: Machine Learning
      ■ Google's ML capabilities are more flexible and mature than the comparable AWS services, and machine learning is core to what we were planning to build
  20. Final Thoughts
      ■ The choice between AWS and GCP was a difficult one
        ▲ We don't think AWS is a bad platform
        ▲ We definitely could have built mabl there
      ■ For our needs we felt that GCP would give us a slight edge
        ▲ Less operational overhead
        ▲ Easier to scale
        ▲ Greater flexibility: ML and portability of key services
      ■ Reflections after ~8 months on GCP
        ▲ We've definitely had a few issues, mostly with beta services
          ● Unplanned downtime
          ● Bugs
        ▲ Overall we're still happy with GCP, and their support has been responsive
  21. Questions?