8. 5th step - Make sure no one understands the pricing
Example: (AWS) Glacier data retrievals are priced based on the peak hourly
retrieval capacity used within a calendar month
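To see why peak-based pricing is so hard to reason about, here is a minimal sketch of the idea stated on the slide: the billable quantity is the busiest hour of the month, not the total retrieved. The function names and the `rate_per_peak_gb` parameter are hypothetical, and the real price list has more terms (free allowances, for instance) that are ignored here.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def peak_hourly_retrieval_gb(retrievals):
    """Return the largest amount retrieved in any single clock hour.

    retrievals: iterable of (datetime, gigabytes) pairs, all assumed to
    fall within the same calendar month.
    """
    per_hour = defaultdict(float)
    for ts, gb in retrievals:
        # Bucket every retrieval into the clock hour it started in.
        hour = ts.replace(minute=0, second=0, microsecond=0)
        per_hour[hour] += gb
    return max(per_hour.values(), default=0.0)


def monthly_retrieval_charge(retrievals, rate_per_peak_gb):
    """Charge for the month, driven only by the peak hour (hypothetical rate)."""
    return peak_hourly_retrieval_gb(retrievals) * rate_per_peak_gb


# Two workloads that both retrieve 100 GB in total.
start = datetime(2015, 1, 1)
burst = [(datetime(2015, 1, 1, 3, 10), 60.0), (datetime(2015, 1, 1, 3, 50), 40.0)]
spread = [(start + timedelta(hours=i), 1.0) for i in range(100)]

burst_charge = monthly_retrieval_charge(burst, rate_per_peak_gb=2.0)
spread_charge = monthly_retrieval_charge(spread, rate_per_peak_gb=2.0)
```

Under these assumptions the same 100 GB costs far more when pulled in one hour than when trickled out over a hundred, which is exactly the kind of detail a naive user misses.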
15. Some of our hacks:
- Built our own ELB
- Built our own SQS
- Hacked the way Amazon wants us to use EMR
- Rely heavily on Spot instances
16. Thank you
Why yes, my dear, indeed, we are hiring!
boaz.menuhin@gmail.com
Editor's Notes
For many startups in general, and for big data companies in particular, it makes more sense to use cloud providers.
Mostly because of elasticity and because time is spent on business logic and algorithms instead of on maintaining hardware.
The problem is - this is what processing a lot of data looks like when running in the cloud
The reason is that every calculation we make is parallelized on thousands of cores in order to finish on time.
Naive usage of cloud providers would have cost us 2 million dollars every year.
In order to survive we had to hack the pricing model many times. We had to think like a cloud service provider.
And the first thing one needs to understand is how cloud providers work.
So let's say you want to become a cloud service provider
The first step is just to offer the basics, which are computing and storage.
Notice, storage is just an application on top of computing
Don’t let your users play with others. They don’t need no fancy external DNS service. You can provide them with anything they need.
Provide them with DNS service, load balancer, CDN.
Notice: DNS, CDN and load balancer are just applications built on top of your computing service
Take a Google instance, put MySQL on it, wrap it nicely, and you’ve got yourself Google’s Cloud SQL
Take an EC2 instance, put Redis on it, wrap it nicely, and you’ve got yourself Amazon’s ElastiCache
As noted - these are just applications on top of your computing service.
Fully utilize idle resources.
You may call it spot/volatile instances, containers, serverless web, whatever. Just find a way to sell your idle resources.
Notice: this is just selling your spare computing capacity - when they talk about “economies of scale”, this is one essential part of it.
It may be written in a confusing way, but the way cloud providers charge is actually rather simple
The x axis is the usage
The y axis is how much you pay
So it makes sense to pay more the more you use, but the marginal cost for heavy users is relatively low. And for a good reason.
See the red line? The red line is where it pays to build your own service on top of the computing service
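One way to read the red line is as a break-even point. The sketch below assumes a deliberately simple model that is not taken from the slides: the managed service charges a flat price per unit of usage, while a self-built replacement has a fixed cost (building and maintaining it) plus the raw compute cost per unit. All function names and numbers are hypothetical.

```python
def managed_cost(usage, unit_price):
    """Cost of buying the provider's managed service for `usage` units."""
    return usage * unit_price


def self_built_cost(usage, fixed_cost, compute_unit_price):
    """Cost of running your own service on top of raw compute."""
    return fixed_cost + usage * compute_unit_price


def break_even_usage(unit_price, fixed_cost, compute_unit_price):
    """Usage above which building your own service is cheaper.

    Returns None when the managed service is never more expensive per
    unit than raw compute, so building your own never pays off.
    """
    if unit_price <= compute_unit_price:
        return None
    return fixed_cost / (unit_price - compute_unit_price)


# Hypothetical numbers: 5 per unit managed, 1 per unit on raw compute,
# 10000 per period to build and maintain the replacement.
red_line = break_even_usage(unit_price=5, fixed_cost=10000, compute_unit_price=1)
```

Below `red_line` the managed service is the cheaper option; above it, the fixed cost of your own service is paid back by the lower per-unit price. A real pricing curve has volume discounts, so the actual crossing point would move, but the shape of the argument is the same.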
discount
Should we give up on cloud computing service providers?
Should we build and maintain our own data centers?
Should we go to a deserted island and set a bonfire with stones? Yes - but that’s not the point!
We should hack the pricing model.
Hacking the pricing model is