ScyllaDB CEO Dor Laor lays out the ten million dollar engineering problem for distributed systems, and how only Scylla is architected to address the issue at the heart of Big Data ROI. He then introduces ScyllaDB's Glauber Costa and Packet's James Malachowski to reveal a new level of performance for a persistent NoSQL datastore. Dor concludes his talk with a bold proposition about how Scylla is uniquely positioned to help companies easily create and scale the software they need to achieve their vision.
15. Scylla and Packet:
Reaching new heights
Glauber Costa, VP Field Engineering @ Scylla
James Malachowski, Solutions Architect @ Packet
16. An IoT application
temperature readings
1,000,000 sensors, representing homes in an area
365 days (1 year storage requirement)
1 reading per minute
Total amount of data points: 526 billion
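The 526 billion figure follows directly from the three numbers on this slide. A quick check in Python (the constant names are ours, not from the talk):

```python
# Workload parameters as stated on the slide.
SENSORS = 1_000_000
DAYS = 365
READINGS_PER_DAY = 24 * 60  # one reading per minute

total_points = SENSORS * DAYS * READINGS_PER_DAY
print(f"{total_points:,} points (~{total_points / 1e9:.0f} billion)")
```

The exact product is 525.6 billion, which the slide rounds to 526 billion.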
17. Data Model
CREATE TABLE readings (
    sensor_id int,
    date date,
    time time,
    temperature float,
    PRIMARY KEY ((sensor_id, date), time)
);
What kind of queries can we reasonably support?
■ SELECT * FROM readings WHERE sensor_id = ? AND date = ?;
■ SELECT * FROM readings WHERE sensor_id = ? AND date = ? AND time > ?;
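Why these two queries are the natural ones: with (sensor_id, date) as the partition key and time as the clustering column, each sensor-day is one partition whose rows are kept in time order. The toy in-memory model below is only an illustration of that layout, not how Scylla stores data; the function names insert, select_day and select_after are hypothetical.

```python
from bisect import bisect_right
from collections import defaultdict
from datetime import date, time

# Toy model: partition key (sensor_id, date) -> rows sorted by clustering key (time).
partitions = defaultdict(list)

def insert(sensor_id, day, t, temperature):
    rows = partitions[(sensor_id, day)]
    rows.append((t, temperature))
    rows.sort()  # keep the partition ordered by time

def select_day(sensor_id, day):
    """WHERE sensor_id = ? AND date = ? : a single partition lookup."""
    return list(partitions.get((sensor_id, day), []))

def select_after(sensor_id, day, after):
    """... AND time > ? : a contiguous slice of the sorted partition."""
    rows = partitions.get((sensor_id, day), [])
    start = bisect_right(rows, (after, float("inf")))
    return rows[start:]

d = date(2019, 11, 5)
insert(42, d, time(0, 1), 20.5)
insert(42, d, time(0, 0), 20.0)
insert(42, d, time(0, 2), 21.0)
print(select_after(42, d, time(0, 1)))
```

Both queries name the full partition key, so each touches exactly one partition. A query without sensor_id or date would have to visit every partition.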
18. Analytics over the entire data?
How long would it take at normal speeds?
200,000 points/second: 730 hours (30 days)
1 million points/second: 146 hours (almost a week)
We need more if analytics are a part of the pipeline
That means we need Scylla
We need a good application
And we need hardware
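The scan times on this slide come from dividing the full year of readings by the read rate. A small sketch of that arithmetic (scan_hours is a helper name we made up):

```python
# One year of per-minute readings from one million sensors.
TOTAL_POINTS = 1_000_000 * 365 * 24 * 60

def scan_hours(points_per_second):
    """Hours needed to read every point once at a constant rate."""
    return TOTAL_POINTS / points_per_second / 3600

for rate in (200_000, 1_000_000):
    hours = scan_hours(rate)
    print(f"{rate:>9,} points/s -> {hours:.0f} hours ({hours / 24:.1f} days)")
```

This assumes the rate is sustained for the whole scan, which is the best case.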
19. Why climb Mount Everest?
Because it’s there.
George Leigh Mallory
What kind of performance are we after?
20. Analytics Application
SELECT count(temperature) AS totalRows,
       sum(temperature) AS sumTemperature,
       min(temperature) AS minTemperature,
       max(temperature) AS maxTemperature
FROM readings WHERE sensor_id = ? AND date = ?;
21. Application
(Example) Total amount of data to scan: 1.44 billion points/day
[Diagram: a Coordinator, three Workers (loader machines), and the ScyllaDB cluster]
Set time frame, compute average, min, max of all sensors
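One day of data is 1,000,000 sensors × 1,440 readings, which is the 1.44 billion points quoted on this slide. The per-partition query returns a count, sum, min and max, and those partial results can be combined without re-reading any rows. The sketch below shows one way to do that merge. It is our assumption about the combining step, not the actual application from the talk, and Partial, merge and summarize are hypothetical names.

```python
from dataclasses import dataclass

@dataclass
class Partial:
    """Aggregate for one partition, mirroring the columns of the analytics query."""
    count: int
    total: float
    minimum: float
    maximum: float

def merge(a, b):
    """Combine two partial aggregates into one."""
    return Partial(
        a.count + b.count,
        a.total + b.total,
        min(a.minimum, b.minimum),
        max(a.maximum, b.maximum),
    )

def summarize(partials):
    """Fold all partials and return (average, min, max)."""
    acc = Partial(0, 0.0, float("inf"), float("-inf"))
    for p in partials:
        acc = merge(acc, p)
    average = acc.total / acc.count if acc.count else None
    return average, acc.minimum, acc.maximum

# Two made-up partition results, as if reported by two workers.
partials = [Partial(2, 41.0, 20.0, 21.0), Partial(3, 57.0, 18.0, 20.0)]
print(summarize(partials))
```

The average is computed from the merged sum and count rather than by averaging averages, so partitions with different row counts are weighted correctly.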
22. Packet is a global bare metal cloud that is built for Enterprises,
and loved by Developers.
TLDR: We sell servers to millennials.
33. We can efficiently process
over 1.2 billion points
per second
(we’ll process whatever you need, too!)
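To put 1.2 billion points per second in context, the sketch below applies that rate to the IoT workload from the earlier slides. It assumes the rate is sustained over the whole scan, which the slide does not state, so treat the results as a lower bound on elapsed time.

```python
# Workload from the IoT slides: one reading per minute from one million sensors.
POINTS_PER_DAY = 1_000_000 * 24 * 60
POINTS_PER_YEAR = POINTS_PER_DAY * 365
RATE = 1_200_000_000  # points per second, from this slide

day_seconds = POINTS_PER_DAY / RATE
year_seconds = POINTS_PER_YEAR / RATE
print(f"one day of data:  {day_seconds:.1f} s")
print(f"one year of data: {year_seconds:.0f} s ({year_seconds / 60:.1f} min)")
```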
34. Bare metal + Scylla are
like Peanut Butter & Jelly.
Packet & ScyllaDB together provide a scalable, low-latency database
deployed on a global bare metal cloud!
42. Make it EASY for software companies
■ Scylla Cloud
43. Make it EASY for software companies
■ Scylla Cloud
■ Project Alternator
Scylla gains DynamoDB compatible API
44. Make it EASY for software companies
■ Scylla Cloud
■ Project Alternator
Scylla gains DynamoDB compatible API
120 Kops/sec is pretty intensive: in DynamoDB provisioned pricing, it would cost $85 per hour.
VMs (EC2, on-demand pricing) for running Scylla cost $7.5 per hour.
A one-year reservation is cheaper for both.
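The two hourly prices on this slide can be compared directly. The yearly projection below simply multiplies the on-demand hourly rates by the hours in a year. That is our extrapolation, and since the slide notes that reservations lower both figures, read it as an upper bound on each.

```python
# Hourly prices quoted on the slide for a 120 Kops/sec workload.
DYNAMODB_PER_HOUR = 85.0     # DynamoDB provisioned pricing
SCYLLA_EC2_PER_HOUR = 7.5    # EC2 on-demand VMs running Scylla

HOURS_PER_YEAR = 24 * 365

ratio = DYNAMODB_PER_HOUR / SCYLLA_EC2_PER_HOUR
yearly_gap = (DYNAMODB_PER_HOUR - SCYLLA_EC2_PER_HOUR) * HOURS_PER_YEAR
print(f"DynamoDB costs {ratio:.1f}x as much per hour")
print(f"on-demand difference over a year: ${yearly_gap:,.0f}")
```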
45. Make it EASY for software companies
Scylla Cloud
Workload prioritization
Scylla Manager
Backup
Alternator
46. Make it EASY for software companies
LWT
Scylla Cloud
Workload prioritization
Scylla Manager
Backup
Alternator
47. Make it EASY for software companies
LWT
Scylla Cloud
Workload prioritization
Scylla Manager
Backup
Alternator
CDC
48. Make it EASY for software companies
UDF/Transformations
LWT
Scylla Cloud
Workload prioritization
Scylla Manager
Backup
Alternator
CDC
50. Scylla & ScyllaDB
■ Major Round of Funding
■ 2x Engineers
■ 2x Customers
■ $ you spend goes back to you
51. Func6 Customer story
■ Required by a new customer
■ Although the code was ready, it wasn’t yet tested
■ We had to double down to release it
52. Func6 Customer story
Greg, account executive
Eyal coordinated the requirement
Shlomi, assigned people
Luis, tested the OSS version
Pekka built + back ported to enterprise
Calle fixed bugs
Gleb reviewed code
Kamil reviewed & tested
Michal designed, fixed, built Scylla Manager
Amnon adjusted the monitor
Amos tested
Roy tested
Bentsi tested
Julia tested
53. “All Companies are Software Companies”
“Software is Eating the World”
“All Companies are Software”
54. Scylla is a damn good
Choice for *your* Software
56. Where are we going
Backup - point in time, auto-pre-upgrade
Tablets
k8s/manager
57. Release Schedule
4th Quarter 2019: Oct Nov Dec
1st Quarter 2020: Jan Feb Mar
2nd Quarter 2020: Apr May Jun
3rd Quarter 2020: Jul Aug Sep
Scylla Enterprise 2019.1.3
Scylla Manager 2.0
Scylla 3.2
Scylla Enterprise 2019.1.4
Scylla Enterprise 2020.1
Scylla 4.0
Scylla Manager 2.1
Scylla Manager 3.0
Editor's Notes
Who knows what the C10K problem is? You’re old!
The C10k problem is the problem of optimising network sockets to handle a large number of clients at the same time.
The term was coined in 1999 by Dan Kegel.
https://en.wikipedia.org/wiki/C10k_problem
If you knew what C10K is, you’re old! Who uses FTP these days? (It’s not a question.)
Originally NGINX was developed to solve this! Nice to see developers redesign software for better performance rather than add another layer of indirection.
C10K isn’t a real problem today; as you may know, with Scylla we’re bound by the number of ports.
In distributed computing, performance EQUALS cost; it’s a cost optimization. Segment defined the $10M engineering problem (I call it D10M).
It affected the company’s margin and business results: when they approached VCs, the latter complained about their margins.
Note that 2 years ago their problem was a $1M problem.
Segment is a good company and applied lots of good procedures.
Put that in the context of your own scale and speed.
https://segment.com/blog/the-10m-engineering-problem
https://segment.com/blog/the-million-dollar-eng-problem/
How do you get to D10M? Easy: all it takes is engineers and a cloud account. Engineers + cloud == fire.
It spreads as easily as fire in California.
Without efficient infrastructure, you’ll be translating your technical debt into financial debt.
Similarly, in the database world,
once upon a time, there was a single MySQL server.
Scale on from there
Keep on scaling - move to NoSQL
Hello scalable architecture
And it grows
And scale with ease
And here you are at the D10M problem
We wanted to test our limits and see what can be achieved with Scylla; 1 million ops is not news anymore.
We empower companies to make infrastructure their competitive advantage with automated bare metal that can be deployed anywhere.
Marc Andreessen penned his famous “Why Software Is Eating the World”, followed by
Microsoft’s Satya Nadella.
You can see this becoming a reality with Kylie Jenner’s 7-person company, fulfilled by Shopify
https://www.shopify.com/kylie
Recently Confluent’s Jay Kreps made another observation that all companies are software.
Indeed businesses today are very different.
In the old days, in order to take out a loan, you’d physically go to the bank, fill in a form, and then
the manual process would kick in. The credit officer reviews the request, passes it on to the risk officer... A couple of days later, you’d get your approval.
Today, businesses like Next Insurance, SoFi and Lemonade automate everything and
can provide you an answer within minutes; some need to provide an answer in milliseconds.
There are no officers, only services and microservices.
These services run 24/7 and move lots of data around in a myriad of microservices and databases.
It’s all complex and data passes through the pipes far too many times.
We at Scylla wish to make it easy for software companies to create better software
Scylla Cloud DBaaS is a very good start. We launched it a year ago and it’s been in GA since March.
Compatibility is important as well. 2 months ago we presented Project Alternator.
The initial release was open source.
A Scylla alternative for the widely used DynamoDB at 10x the speed.
TODAY we announce the availability of Alternator on our cloud as beta.
Uber-app like experience - one click to fantastic performance
A simple cluster of 3 medium nodes can do 120k ops. Similar DynamoDB provisioned
capacity costs $85/hour, while the on-demand VMs cost $7.5/hour.
We’ve made these improvements, from the cloud, through workload prioritization and Scylla Manager automatic backups
But what next?
Today we announce the availability of Lightweight Transactions, a long-anticipated feature in Scylla.
Tomorrow Kostja will present LWT. We look at LWT not as a regular feature but
as a key building block, one that will allow us to improve Scylla itself (tablets, schema mgmt)
and also open the door to a variety of new use cases for you all.
Next is Change Data Capture. We’ve given it a good amount of thought and design,
and I’m super pleased to announce the availability of CDC in OSS.
Once again, we like to help build software companies and simplify your stack.
CDC will allow you to think about tables and streams in an interchangeable way.
Last but not least, user defined functions are here.
UDF opens the gate to much more. Think about ETL within the Scylla servers
themselves, with the Scylla parallelism. We’re not there yet but this is where we’re going.
Think about filtering your stream of changes on the server and applying actions to it.
Not only has Scylla the database made progress; so has Scylla the company.
We closed a major round of funding.
We hold developer all-hands meetings in between the Scylla Summits.
We’re bigger and busier than ever but still put all of the emphasis on support.
For example, one of our customers needed a functionality which was coded but not yet hardened.
The contract was signed so we had to double down on it.
Bob, my good VP of marketing, pointed out that I’m quoting other people and asked whether
I’d like to coin something myself. Not sure what’s left for me; I can’t shrink these sentences even
more. “Software is software”? Doesn’t make sense.
Instead, I can make a much less hyperbolic claim