Google Cloud Spanner
Meetup 26-2-2017
Vadim Solovey //CTO (vadim@doit-intl.com)
DoIT International confidential │ Do not distribute
About us
Vadim Solovey
CTO
SQL?
● Strong consistency
● Standard query language
● ACID transactions
or NoSQL?
● Horizontally scalable
● Highly available
Confidential + Proprietary
“NewSQL is a class of modern relational database management systems that seek to
provide the same scalable performance of NoSQL systems for online transaction
processing (OLTP) read-write workloads while still maintaining the ACID guarantees of
a traditional database system. [...] Example systems in this category are Google
Spanner …”
Source: https://en.wikipedia.org/wiki/NewSQL
CAP Theorem
It is impossible for a distributed computer system to simultaneously provide more
than two out of three of the following guarantees: Consistency, Availability, Partition
Tolerance.
“Cloud Spanner is not just software. It is the union of
software, hardware — in the form of atomic clocks in
Google’s data centers — and an incredibly robust network
connecting their data centers together. So it’s not just
writing code. It’s a lot of investment and a lot of operational
expertise that Google excels at.”
Nick Heudecker
Research Director, Gartner
Cloud Spanner 101
With Cloud Spanner you enjoy all the
traditional benefits of a SQL database:
● ACID transactions
● High availability through synchronous replication
● Schemas (with changes without downtime)
● SQL queries
● Scales horizontally
● Managed by the Google SRE team
Best of Both Relational & NoSQL
             | Cloud Spanner | Traditional Relational | Traditional NoSQL
Schema       | ✓ Yes         | ✓ Yes                  | X No
SQL          | ✓ Yes         | ✓ Yes                  | X No
Consistency  | ✓ Strong      | ✓ Strong               | X Eventual
Availability | ✓ High        | X Failover             | ✓ High
Scalability  | ✓ Horizontal  | X Vertical             | ✓ Horizontal
Replication  | ✓ Automatic   | ↻ Configurable         | ↻ Configurable
Cloud Database Portfolio
In Memory
● Cloud Memorystore (EAP). Good for: web/mobile apps, gaming. Such as: game state, user sessions.
Relational
● Cloud SQL. Good for: web frameworks. Such as: CMS, eCommerce.
● Cloud Spanner (Beta). Good for: RDBMS+scale, HA, HTAP. Such as: user metadata, Ad/Fin/MarTech.
Non-Relational
● Cloud Datastore. Good for: hierarchical, mobile, web. Such as: user profiles, game state.
● Cloud Bigtable. Good for: heavy read + write, events. Such as: AdTech, Financial, IoT.
Object
● Cloud Storage. Good for: binary or object data. Such as: images, media serving, backups.
Warehouse
● BigQuery. Good for: enterprise data warehouse. Such as: analytics, dashboards.
Pricing
No ops or I/O to provision
Storage auto-scales, no storage provisioning required
Nodes
● $0.90 / hour / node (includes 3 replicas)
Storage
● SSD: $0.30 / GB / month (includes replication)
Network
● Standard rates for cross-region and Internet egress
● Free: ingress, and egress within region
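The node and storage rates above can be turned into a rough monthly estimate. The sketch below is only an illustration: the 730-hour month and the function name `monthly_cost` are assumptions, network egress is ignored, and the rates are the ones quoted on this slide, not current list prices.

```python
# Rough monthly cost estimate from the per-node and per-GB rates on the slide.
# Assumption: a month is 730 hours; network egress is not included.
HOURS_PER_MONTH = 730

NODE_PRICE_PER_HOUR = 0.90          # USD per node per hour (from the slide)
STORAGE_PRICE_PER_GB_MONTH = 0.30   # USD per GB per month, SSD (from the slide)


def monthly_cost(nodes, storage_gb, hours=HOURS_PER_MONTH):
    """Return the estimated monthly cost in USD for nodes plus storage."""
    if nodes < 1:
        raise ValueError("an instance needs at least one node")
    if storage_gb < 0:
        raise ValueError("storage_gb must not be negative")
    node_cost = nodes * NODE_PRICE_PER_HOUR * hours
    storage_cost = storage_gb * STORAGE_PRICE_PER_GB_MONTH
    return round(node_cost + storage_cost, 2)


if __name__ == "__main__":
    # Example: a 3-node instance holding 50 GB of data.
    print(monthly_cost(nodes=3, storage_gb=50))
```

Because storage is billed separately from nodes, the node count dominates the bill for small datasets.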
Other solutions on the market
                      | Cloud Spanner | Oracle | AWS Aurora | AWS DynamoDB | Azure DocumentDB | MongoDB | Cassandra
Type                  | Scale out relational | RDBMS | RDBMS | Key-value | Document | Document | Wide-column
Schema                | Yes | Yes | Yes | No | No | No | No
SQL                   | Native | Native | Native | No | Limited | No | CQL
Consistency (Default) | Strong (global) | Strong (datacenter) | Strong (within AZ) | Tunable | Tunable | Eventual | Tunable
Availability          | 99.99% * (multi-region: 5 9s) | User configured | 99.99% | Unspecified | 99.99% | Unspecified | Unspecified
Data-layer Encryption | Yes | Yes | Within Region | Client-side | No | Not by default | Datastax
Scalability           | Horizontal | Within DC | Vertical | Horizontal | Horizontal | Horizontal | Horizontal
Replication           | Regional (multi-region: 2017) | Datacenter | Regional | Multi-region | Multi-region | User configured | User configured
Managed Service       | Yes | Yes | Yes | Yes | Yes | Atlas, 3rd Party Cloud | 3rd party
TCO Comparisons (50GB storage)
Product                               | Pricing model  | Read-heavy workload | Mixed workload
Cloud Spanner (regional replication)  | Resource-based | $2,094              | $2,094
Cloud SQL (HA)                        | Resource-based | $2,226              | $2,226
Cloud Bigtable (unreplicated)         | Resource-based | $1,021              | $1,021
AWS Aurora                            | 3Y RI Pricing  | $973                | $973
AWS Aurora                            | On-Demand      | $1,744              | $1,744
AWS DynamoDB                          | per-op         | $2,400              | $4,398
Azure DocumentDB                      | per-op         | $1,887              | $5,333
Interaction
gRPC and RESTful client libraries are available for:
● Java
● Python
● Go
● Node.js
● Ruby (upcoming)
● PHP (upcoming)
A JDBC driver is also available, with limited support for legacy applications.
Google Cloud CLI (work with instances and databases, and run queries)
Data Types & Data Definition Language
Data Types Available:
● BOOL, INT64, FLOAT64, STRING( length ), BYTES( length ), DATE, TIMESTAMP
● ARRAY of scalar types (no access to individual members, read or write the entire array)
Use Cloud Spanner's Data Definition Language (DDL) to work with databases, tables and indexes
● CREATE
● ALTER
● DROP
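As a small illustration of the data types and DDL statements listed above, the sketch below assembles CREATE, ALTER and DROP statements as plain strings. The `Singers` table, its columns and the helper `create_table_ddl` are hypothetical names chosen for this example; the statements would still need to be submitted through the console, the CLI or a client library.

```python
# Sketch: assemble Cloud Spanner-style DDL statements as strings.
# The table and column names are hypothetical; nothing is sent to a database.

def create_table_ddl(table, columns, primary_key):
    """Build a CREATE TABLE statement.

    columns: list of (name, type, not_null) tuples.
    primary_key: list of column names forming the primary key.
    """
    lines = []
    for name, col_type, not_null in columns:
        suffix = " NOT NULL" if not_null else ""
        lines.append(f"  {name} {col_type}{suffix}")
    body = ",\n".join(lines)
    keys = ", ".join(primary_key)
    return f"CREATE TABLE {table} (\n{body}\n) PRIMARY KEY ({keys})"


SINGERS_COLUMNS = [
    ("SingerId", "INT64", True),
    ("FullName", "STRING(1024)", False),
    ("Photo", "BYTES(MAX)", False),
    ("Nicknames", "ARRAY<STRING(MAX)>", False),  # arrays are read or written whole
    ("SignedAt", "TIMESTAMP", False),
]

CREATE_SINGERS = create_table_ddl("Singers", SINGERS_COLUMNS, ["SingerId"])
ALTER_SINGERS = "ALTER TABLE Singers ADD COLUMN BirthDate DATE"
DROP_SINGERS = "DROP TABLE Singers"

if __name__ == "__main__":
    for statement in (CREATE_SINGERS, ALTER_SINGERS, DROP_SINGERS):
        print(statement)
        print()
```

Note that the primary key is declared after the column list, outside the parentheses, rather than as a column constraint.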
Expressions, Functions, and Operators
● Casting, e.g. CAST(x=1 AS STRING)
● Aggregations, e.g. COUNT, MIN, MAX, AVG, BIT*, SUM
● Mathematical, e.g. SQRT(X)
● String, e.g. LENGTH(value) or SUBSTR(value, position[, length])
● Array, e.g. ARRAY_LENGTH(array_expression)
● Date/Time, e.g. DATE_DIFF(date_expression, date_expression, date_part)
● Conditional, e.g. WHEN, CASE, IF, COALESCE
Best Practices & Performance
● Each node can provide up to 10K QPS of reads or 2K QPS of 1KB writes, and 2 TiB of storage
● A minimum of 3 nodes is recommended for production environments (the technical minimum is one node)
● Carefully choose a primary key (to avoid hotspots)
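The per-node limits and the primary key advice above can be sketched in a few lines. Both helpers are illustrative assumptions, not part of any Spanner API: `nodes_needed` assumes reads and writes draw on the same per-node capacity, and `spread_key` shows one common way to avoid hotspots with sequential keys, namely prefixing the key with a hash-derived shard number.

```python
import hashlib
import math

# Per-node limits quoted on the slide.
READ_QPS_PER_NODE = 10_000
WRITE_QPS_PER_NODE = 2_000   # for 1KB writes
STORAGE_TIB_PER_NODE = 2


def nodes_needed(read_qps, write_qps, storage_tib):
    """Estimate a node count from peak load and data size.

    Assumption: reads and writes share a node's capacity, so their
    fractional loads are added; storage is an independent limit.
    """
    throughput_load = read_qps / READ_QPS_PER_NODE + write_qps / WRITE_QPS_PER_NODE
    storage_load = storage_tib / STORAGE_TIB_PER_NODE
    return max(1, math.ceil(max(throughput_load, storage_load)))


def spread_key(natural_key, shards=16):
    """Prefix a key with a stable hash-derived shard number.

    Sequential keys (timestamps, counters) then land in different
    key ranges instead of all hitting the end of the key space.
    """
    digest = hashlib.sha256(natural_key.encode("utf-8")).digest()
    shard = digest[0] % shards
    return f"{shard:02d}_{natural_key}"


if __name__ == "__main__":
    print(nodes_needed(read_qps=25_000, write_qps=3_000, storage_tib=1))
    print(spread_key("2017-02-26T10:00:00Z"))
```

The estimate is a capacity floor only; the slide's recommendation of at least 3 nodes for production still applies on top of it. The hash prefix trades away efficient range scans over the natural key, so it fits point lookups better than ordered reads.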
Product Roadmap for 2017
● Multi-Regional replication
● Dataflow | Pub/Sub | BigQuery integrations
● Local mock server
● JSON support (repeated and nested fields)
● Writes in SQL
Spanner Resources
● Documentation: cloud.google.com/spanner/docs
● Podcast with Deepti Srivastava: goo.gl/tlGAx4
● Optimizing Schema Design for Cloud Spanner: https://goo.gl/4KG1hZ
● Spanner, TrueTime and the CAP Theorem: https://research.google.com/pubs/pub45855.html
● Whitepaper Explained, by Wilson Hsieh: youtube.com/watch?v=NthK17nbpYs
● The Life of Cloud Spanner Reads & Writes: https://goo.gl/vXK1r4
● Life of a Cloud Spanner Query: https://goo.gl/mCyTvW
DEMO
