This document provides an overview of Google Cloud Fundamentals. It introduces Andrew Liaskovski as the teacher and covers various Google Cloud topics including migration, security, DevOps, big data, and disaster recovery services. It also discusses CloudZone's full service package including consulting, managed services, and professional services. The rest of the document focuses on specific Google Cloud products and services such as Compute Engine, App Engine, Container Engine, Cloud Storage, Cloud SQL, networking, big data, and machine learning.
2. About CloudZone
End-to-end
Cloud Solutions
• Migration
• Security
• DevOps
• Big Data
• DR & more
Full Service
Package
• Consulting
• Managed Services
• Professional
Services
Years of
Experience
Largest
partner in
Israel
Our Goal:
To ensure our
customers adopt
the most
advanced
technologies at
a minimal cost
3. What We Do…
FinOps DevOps Well Architected
• Architecture
schematics
• TCO
• SOW
Full Service Package
• Consulting
• Managed Services
• Professional Services
Continuous
Integration and
Deployment
5. Customer Success Unit (FinOps)
Design for Maximal
Cost Reduction
Leverage the Power
Of the Cloud
Find and Eliminate
Waste
Implement
Governance Policies
Helping you save thousands on your monthly bill!
• Billing Analytics
• RI Utilization
• Data Analytics
• Automate Optimization
• Asset Management
• Consumption Management
Cloud Resource
Management Tool
6. Hybrid Cloud Solutions
• Architecture & deployment
of Private Hybrid clouds
based on VMware vRA or
OpenStack
• Configuration Management:
Chef, Puppet, Ansible & Salt
• Public Cloud DevOps &
Automation
• OpenShift, Cloud Foundry
based, Kubernetes, DCOS
and More.
Cloud Containers DevOps
12. Compute
App Engine
Compute
Engine
Container
Engine
Container
Registry
Cloud
Functions
Networking
Cloud DNS
Virtual Private
Cloud
Cloud Load
Balancing
Cloud CDN
Cloud
Interconnect
Big Data
BigQuery
Cloud
Dataflow
Cloud
Dataproc
Cloud
Datalab
Cloud
Pub/Sub
Genomics
Storage and Databases
Cloud
Bigtable
Cloud
Storage
Cloud
Datastore
Cloud SQL
Cloud
Spanner
Identity & Security
Cloud IAM
Cloud Resource
Manager
Cloud Security
Scanner
BeyondCorp
Data Loss
Prevention
Identity-Aware
Proxy
Security Key
Enforcement
Persistent
Disk
Machine Learning
Cloud Machine
Learning
Cloud Vision
API
Cloud Speech
API
Cloud Natural
Language API
Cloud
Translation
API
Cloud
Jobs API
Networking
Key
Management Service
Cloud
Router
VPN
Firewall
External IP
More than 60 Google Cloud Platform services
13. Management Tools
Stackdriver Monitoring Logging
Error
Reporting
Trace
Debugger
Cloud
Deployment
Manager
Cloud
Endpoints
Cloud
Console
Developer Tools
Cloud SDK
Cloud
Deployment
Manager
Cloud Source
Repositories
Cloud
Tools for
Android Studio
Cloud Tools
for IntelliJ
Cloud
Tools for
PowerShell
Cloud
Tools for
Visual Studio
Google Plug-in
for Eclipse
Cloud Test
Lab
Cloud Shell
Cloud Mobile App
Cloud Billing
API
Cloud APIs
More than 60 Google Cloud Platform services
15. The mobile developer
Batch and burst compute
The launch of Cloud Dataflow
Developer Productivity tools
Web serving workloads
Connecting you to the cloud
Kubernetes and containers
Super scalable storage
Focus on Innovation
23. Google is a leader in Open Source
287,024
Commits by Googlers
to Open Source Projects on
GitHub in 2016
15,000+ Projects Contributed
to in 2016
2,500 Projects That Have Had
10+ Events from Googlers
24. On & Off Growing Fast
• On & off workloads
(e.g. batch job)
• Over provisioned
capacity is wasted
Cloud Computing Patterns
Unpredictable
Bursting
Predictable
Bursting
• Successful services need
to grow/scale
• Keeping up with growth
is a big IT challenge
• Services with micro
seasonality trends
• Peaks due to
periodic increased
demand result in
wasted capacity
[Charts: compute over time (t) for each pattern; one chart marks an inactivity period]
• Unexpected/unplanned
peak in demand
• Sudden spike
impacts performance
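To make the "over provisioned capacity is wasted" point concrete, here is a minimal sketch that measures how much of a fixed, peak-sized allocation sits idle. The demand numbers and the function name are made up for illustration.

```python
def wasted_fraction(demand, provisioned):
    """Fraction of provisioned capacity that sits idle over the period.

    demand: units of compute actually needed in each time slot.
    provisioned: fixed units of compute paid for in every time slot.
    """
    if provisioned < max(demand):
        raise ValueError("provisioned capacity does not cover peak demand")
    total_paid = provisioned * len(demand)
    total_used = sum(demand)
    return 1 - total_used / total_paid


# Hypothetical "on & off" batch workload: busy in 2 slots out of 8.
on_off = [0, 0, 10, 10, 0, 0, 0, 0]
print(wasted_fraction(on_off, provisioned=10))  # 0.75
```

Capacity that follows demand instead of the peak would bring this fraction toward zero, which is the argument the four patterns are making.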
25. Everything You Need To Build And Scale
Compute
From virtual machines with
proven price/performance
advantages to a fully managed
app development platform.
Compute Engine
App Engine
Container Engine
Container Registry
Cloud Functions
Storage and Databases
Scalable, resilient, high
performance object storage
and databases for your
applications.
Cloud Storage
Cloud Bigtable
Cloud Datastore
Cloud SQL
Networking
State-of-the-art software-
defined networking products
on Google’s private fiber
network.
Cloud Virtual Network
Cloud Load Balancing
Cloud CDN
Cloud Interconnect
Cloud DNS
Management Tools
Monitoring, logging, diagnostics,
and more, all in an easy-to-use web
management console or mobile app.
Stackdriver Overview
Monitoring
Logging
Error Reporting
Debugger
Deployment Manager & More
Big Data
Fully managed data warehousing,
batch and stream processing,
data exploration, Hadoop/Spark,
and reliable messaging.
BigQuery
Cloud Dataflow
Cloud Dataproc
Cloud Datalab
Cloud Pub/Sub
Genomics
Machine Learning
Fast, scalable, easy to use ML
services. Use our pre-trained
models or train custom models on
your data.
Cloud Machine Learning
Platform
Vision API
Speech API
Translate API
Developer Tools
Develop and deploy your
applications using our command-
line interface and other developer
tools.
Cloud SDK
Deployment Manager
Cloud Source Repositories
Cloud Endpoints
Cloud Tools for Android
Studio
Cloud Tools for IntelliJ
Google Plugin for Eclipse
Cloud Test Lab
Identity & Security
Control access and visibility to
resources running on a platform
protected by Google’s security
model.
Cloud IAM
Cloud Resource Manager
Cloud Security Scanner
Cloud Platform Security
Overview
27. Products & services Compute
Compute Engine
Run large-scale
workloads on
virtual machines
hosted on Google's
infrastructure
App Engine
A platform for
building scalable
web apps and
mobile backends
Container Engine
Run Docker containers
on Google's
infrastructure,
powered by
Kubernetes
Container
Registry
Fast, private
Docker image
storage on Google
Cloud Platform
Cloud Functions ALPHA
A serverless platform
for building event-based
microservices triggered
by events in GCP
28. High-performance Virtual Machines
Consistently performant, scalable, highly secure & reliable.
(Really) Pay for what you use
We bill in minute-level increments so you don’t pay for unused computing time, and
automatically apply sustained use discounts.
Fast, Easy Provisioning
Quickly deploy large clusters of virtual machines with intuitive tools.
Compliance & Security
All data written to disk in Compute Engine is encrypted on the fly and then
transmitted and stored in encrypted form.
Batch
Run short duration, heavy compute jobs. The more flexibility in timing and location
you give us the better the pricing!
$
29. Why use Google Compute Engine?
You have an infrastructure-centric view of the
world.
● You need complete control over the virtual-machine
infrastructure.
● You need to make kernel-level changes, such as
providing your own network or graphic drivers, to
squeeze out the last drop of performance.
● You need to run a software package that can’t easily
be containerized or you have existing VM images to
move to the cloud.
Virtual machines
with industry-leading
price/performance
Compute Engine
30. Declarative management
Declare your containers’ requirements, such as the amount of CPU/memory to reserve and
keepalive policy, in a simple JSON config file. Container Engine will schedule your containers
as declared.
Better ops
Your container cluster is equipped with capabilities, such as logging, container health
checking, and autoscaling, to make application management easier.
Docker support
Container Engine supports the common Docker container format. And with Google Container
Registry, Cloud Platform makes it easy to store and access your private Docker images.
Managed container cluster
Spin up a managed container cluster of virtual machines, ready for deployment. The VM
nodes that comprise your cluster are fully managed, ensuring they are healthy and updated
with critical patches.
Container
Engine
Cloud flexibility
With Red Hat, Microsoft, IBM, Mirantis OpenStack, and VMware -- and the list keeps growing
-- working to integrate Kubernetes into their platforms, you’ll be able to move workloads, or
take advantage of multiple cloud providers, more easily.
Container Engine
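The "declarative management" idea on this slide can be sketched as follows. Since Container Engine is powered by Kubernetes, the sketch builds a Kubernetes-style Pod manifest and prints it as JSON. The pod name, image path, and resource amounts are placeholders, and mapping the slide's "keepalive policy" to the `restartPolicy` field is an assumption.

```python
import json

# A minimal Pod manifest: what to run and what resources to reserve.
pod = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "web"},  # hypothetical name
    "spec": {
        "restartPolicy": "Always",  # restart the container if it exits
        "containers": [
            {
                "name": "web",
                "image": "gcr.io/my-project/web:1.0",  # hypothetical image
                "resources": {
                    "requests": {"cpu": "500m", "memory": "256Mi"}
                },
            }
        ],
    },
}

manifest = json.dumps(pod, indent=2)
print(manifest)
```

The file only states the desired outcome; placing the container on a node with enough free CPU and memory is left to the scheduler.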
31. *Source: COCOMO Model
4,000+ Projects Based on Kubernetes
442
Years of
Effort* 15,000 Contributors 20k+ GitHub Stars
Kubernetes
32. You have a container-centric view of the world.
● Deploying or maintaining a fleet of VMs has been
a challenge and you’ve determined that
containers are the solution.
● You’ve containerized your workload and need a
system on which to run and manage it.
● You never want to touch a server or
infrastructure.
● You don’t have dependencies on kernel changes
or on a specific (non-Linux) operating system.
Why use Google Container Engine?
Cluster manager and
orchestration engine
built on Google’s
container experience
Container Engine
33. Powerful built-in services
Managed services, such as Task Queues, Memcache and the Users API, let you build any
application.
Deploy at Google scale
You can scale up to 7 billion requests per day and automatically scale down when traffic subsides.
Focus on your code
Let Google worry about database administration, server configuration, sharding & load balancing.
Popular languages & frameworks
Write applications in some of the most popular programming languages, use existing frameworks
and integrate with other familiar technologies.
Familiar development tools
Use the tools you know, including Eclipse, IntelliJ, Maven, Git, Jenkins, PyCharm & more.
Multiple storage options
Choose the storage option you need: a traditional MySQL database using Cloud SQL, a schemaless
NoSQL datastore, or object storage using Cloud Storage.
App Engine
34. You have an app-centric view of the world.
● You want to focus on writing code and never
touch a server, cluster, or infrastructure.
● Building quickly and time to market are highly
valued.
● You want to sleep at night and not worry about a
pager going off or 5XX web errors.
● You expect your app to have high availability
without a complex architecture.
Why use Google App Engine?
A flexible, zero ops
platform for building
highly available apps
App Engine
36. Products & services Storage and Databases
Cloud Storage
Powerful, simple and cost
effective object storage
service with global edge-
caching
Cloud Bigtable BETA
Cloud Bigtable is a fast,
fully managed, massively
scalable NoSQL database
service
Cloud Datastore
A managed, NoSQL,
schemaless database for
storing non-relational
data
Cloud SQL
Store and manage data
using a fully-managed,
relational MySQL
database
37. Durable Reduced Availability Storage
Durable Reduced Availability (DRA) Storage has lower cost and lower availability than
Standard Storage but is designed to have the same durability and performance as Standard
Storage.
Standard Storage
Standard Storage provides the highest level of durability, availability and performance of all
Google Cloud Storage services. It’s specifically designed for use cases requiring low latency
and frequent data access, such as website content distribution and video streaming. Standard
storage is all about performance.
Simple Access
Google created three simple product options to help you improve the performance of your
applications while keeping your costs low. These three product options use the same API,
providing you with a simple and consistent method of access.
Cloud Storage
Nearline Storage
Nearline Storage is a low-cost, highly durable storage service for data archiving, online
backup, and disaster recovery. Unlike other cloud or on-premises archive and backup
offerings, you don’t have to wait hours or days to retrieve or access your data.
38. Google Cloud Storage classes in detail
Storage class: Multi-Regional | Regional | Nearline | Coldline
Use cases: Content storage and delivery, business continuity | Store data for big data analytics and general compute | Long-tail content, infrequently accessed data, backup | Archival, disaster recovery, rarely accessed content
Price / GB-mo: 2.6c | 2.0c (-23%) | 1.0c | 0.7c
Ops Fee (1k A / 10k B): 0.5c / 0.4c (-50-60%) | 0.5c / 0.4c (-50-60%) | 1c / 1c | 1c / 5c
Retrieval Fee: N/A | N/A | 1c/GB | 5c/GB
Geo Redundancy: 2+ Regions 100 mi+ | Within a Region | Within a Region | Within a Region
Designed for Durability: 11 9s (all classes)
Availability SLA: 99.95% | 99.9% | 99% | 99%
First byte latency: Instant (all classes)
Min Storage Duration: N/A | N/A | 30 days | 90 days
Storage Class: Object level storage class and lifecycle (Beta)
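As a rough illustration of how the storage and retrieval prices on this slide interact, the sketch below estimates a monthly bill per class. The prices are copied from the slide; the workload figures are made up, and operation fees, network charges, and minimum storage durations are deliberately ignored.

```python
# Prices in US cents per GB, copied from the slide.
STORAGE_CLASSES = {
    "multi_regional": {"storage_per_gb": 2.6, "retrieval_per_gb": 0.0},
    "regional": {"storage_per_gb": 2.0, "retrieval_per_gb": 0.0},
    "nearline": {"storage_per_gb": 1.0, "retrieval_per_gb": 1.0},
    "coldline": {"storage_per_gb": 0.7, "retrieval_per_gb": 5.0},
}


def monthly_cost_cents(storage_class, stored_gb, retrieved_gb):
    """Storage plus retrieval cost for one month, ignoring operation fees."""
    prices = STORAGE_CLASSES[storage_class]
    return (stored_gb * prices["storage_per_gb"]
            + retrieved_gb * prices["retrieval_per_gb"])


# Hypothetical workload: 1,000 GB stored, 100 GB read back per month.
for name in STORAGE_CLASSES:
    print(name, round(monthly_cost_cents(name, 1000, 100), 2))
```

With these made-up volumes Coldline comes out at about 1,200 cents against 1,100 for Nearline, because the higher retrieval fee outweighs the lower storage price. The cheapest class per GB is only the cheapest overall when the data is rarely read.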
39. Standard PD
(Persistent Disk)
SSD PD
(Solid State Drive Persistent Disk)
Local SSD
(Solid State Drive)
Block Storage
Throughput
High IOPS
Low Latency
Low IOPS
High Latency
Streaming IO
Boot Volumes
Bulk Storage
High Perf Scratch
Hadoop
SQL and NoSQL Databases
File Servers
Security
All storage is encrypted over the wire
and at rest
Integrity
Data is stored redundantly, we
checksum all data and take incremental
snapshots (PD only)
Consistency
Performance is consistently high and
pricing does not change from month
to month
Simplicity
Simple pricing, pay for space only.
Simple configuration: no need to create
multiple volumes or manage RAID
arrays.
40.
• Fully managed
• Ease of Use
• Highly Reliable
• Flexible Charging
• Security, Availability, Durability
• EU and US Data Centers
• Easy Migration & Data Portability
• Control
Google Cloud SQL
41.
• Accessible Anywhere
• Secure Sharing
• Same High Replication Datastore Used
By App Engine Apps Today
• Equally Fast Queries For Any Sized Dataset
• Data is Replicated Across Several Data
Centers
• Use From Any Application or Language
• Serving 4.5 Trillion Requests Per Month
Google Cloud Datastore
Key Features
• Auto-scale
• Schemaless Access
• SQL-like Capabilities
• Authentication That Just Works
• Fast and Easy Provisioning
• RESTful Endpoints
• ACID Transactions
• Local Development Tools
• Built-in Redundancy
43. Products & services Networking
Cloud Virtual
Network
Managed networking
functionality for your
Google Cloud
Platform resources
Cloud Load
Balancing
High performance,
scalable load
balancing on
Google Cloud
Platform
Cloud
Interconnect
Connect your
infrastructure to
Google's network
edge with
enterprise-grade
interconnect
Cloud DNS (Domain
Name System)
Reliable, resilient, low-
latency DNS serving
from Google’s
worldwide network
Cloud CDN
(Content Delivery
Network)
Low-latency, low-cost
content delivery using
Google's global
network
44. FASTER (US, JP, TW)
2016
Unity (US, JP)
2010
SJC (JP, HK, SG)
2013
Edge points of presence >100
Monet (US, BR) 2017
Network sea cable investments
PLCN Unity (HK, LA)
2018
Indigo (SG, ID, AU)
2019
Tannat (BR, UY, AR) 2017
Junior (Rio, Santos)
2017
Google global cache edge nodes (>800)
Google Global Network
Network
46. Carrier Interconnect
Enterprise-grade
connection through the largest
partner network of service
providers
VPN
Secure multi-Gbps
connection over
VPN tunnels
Direct Peering
Private enterprise-grade
connection between you and
Google for your hybrid cloud
workloads
Connect your place to our place. Your way.
48. Products & services Big Data
BigQuery
A fast, economical and fully
managed data warehouse for
large-scale data analytics
Cloud Dataflow
Cloud Dataflow is a real-time data
processing service for batch and
stream data processing
Cloud Dataproc
Cloud Dataproc is a managed
Spark and Hadoop service that
is fast, easy to use, and low
cost
Cloud Pub/Sub
Connect your services with reliable,
many-to-many, asynchronous messaging
hosted on Google's infrastructure
Cloud Datalab BETA
An easy to use interactive tool
for large-scale data exploration,
analysis and visualization
Genomics
Power your science with
Google Genomics
50. Confidential & Proprietary, Google Cloud Platform
What is BigQuery?
• BigQuery is Google’s cloud-based,
enterprise data warehouse.
• BigQuery answers queries of very large
databases, quickly.
• BigQuery scales to petabytes, but is cost-
effective for any organization.
• BigQuery is fully managed. Customers
benefit from almost NoOps.
• BigQuery supports the industry standards,
such as SQL, ODBC, JDBC.
51. Some BigQuery Stats
Largest query (rows): 10.5 Trillion
Largest query (data size): 2.1 petabytes
Largest storage customer: 62 petabytes
Peak ingestion rate: 4.5 million rows/sec
53. Products & services Management Tools
Stackdriver Overview BETA
Monitoring, logging, and diagnostics
for applications on Google Cloud
Platform and Amazon Web Services
Monitoring BETA
Monitoring for applications running
on Google Cloud Platform and
Amazon Web Services
Logging BETA
Logging for applications running
on Google Cloud Platform and
Amazon Web Services
Error Reporting
Identify and understand
your application errors
Trace
Find performance
bottlenecks in production
Debugger BETA
Investigate your code’s
behavior in production
55. Endpoint checks which measure
the uptime of your
Internet-facing resources.
At-a-glance view of the health of your application
56. Groups: Intelligent aggregations
of resources, such as App
Engine modules, clusters of
Compute Engine VMs and
collections of Cloud SQL
instances.
At-a-glance view of the health of your application
61. Use aggregate
thresholds to set
boundaries
across clusters.
Keep an eye on variance by
alerting on a change in
standard deviation for a
metric across a cluster. This
will provide warning when
a cluster is not running
within its normal operating
boundaries.
Alerting
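The standard-deviation alert described on this slide can be sketched as below. This is only an illustration of the idea, not how Stackdriver implements it; the function name, the ratio threshold, and the sample CPU figures are all assumptions.

```python
from statistics import pstdev


def variance_alert(baseline, current, max_ratio=2.0):
    """Return True when the spread of a metric across the cluster has grown.

    baseline: per-instance metric values from a normal period.
    current: per-instance metric values now.
    max_ratio: hypothetical threshold on current/baseline standard deviation.
    """
    base_sd = pstdev(baseline)
    cur_sd = pstdev(current)
    if base_sd == 0:
        return cur_sd > 0
    return cur_sd / base_sd > max_ratio


# Hypothetical CPU utilization (%) for a five-node cluster.
normal = [50, 52, 48, 51, 49]
skewed = [50, 52, 48, 95, 10]
print(variance_alert(normal, normal))  # False
print(variance_alert(normal, skewed))  # True
```

Note that the cluster average is almost the same in both samples (50 and 51), so an alert on the mean alone would miss the node that is overloaded and the node that is nearly idle.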
70. In addition to the standard
metrics accessible through
Google Cloud Monitoring, you
can monitor thousands of your
own time series by sending
data to the Cloud Monitoring
API.
72. Configure endpoint checks to test functionality and notify you when your web sites, applications, or APIs
become unavailable to end users.
Endpoints
74. Advanced configuration
options for Host headers,
communication port
and specific text strings
in the content.
Leverage saved
credentials for sites
requiring authentication.
Endpoints
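To show what an endpoint check with a Host header, a port, and an expected text string amounts to, here is a small local sketch. It is not the Stackdriver implementation; the function names are hypothetical, the port is simply part of the URL, and saved credentials are left out.

```python
import urllib.request


def fetch(url, host_header=None, timeout=5.0):
    """Fetch a URL and return (status_code, body_text)."""
    request = urllib.request.Request(url)
    if host_header:
        request.add_header("Host", host_header)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status, response.read().decode("utf-8", "replace")


def endpoint_is_up(url, expected_text=None, host_header=None, fetcher=fetch):
    """Return True if the endpoint answers 200 and contains expected_text."""
    try:
        status, body = fetcher(url, host_header)
    except OSError:  # covers URLError, HTTPError, and timeouts
        return False
    if status != 200:
        return False
    return expected_text is None or expected_text in body
```

A call such as `endpoint_is_up("http://example.com:8080/health", expected_text="OK")` (a placeholder URL) would then report the site as down on a connection error, a non-200 status, or a page that no longer contains the expected string.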
To understand GCP, it’s helpful to go back to the beginning of Google. And that means starting with our mission. We set out with the pretty audacious goal of organizing the world’s information. Google created 7 cloud products each serving over 1 billion users.
3 benefits
When we talk about agility, what we really want as an outcome is to reduce the time it takes for customers to get their solutions to market, and to be able to respond rapidly to change.
When we say speed, we’re talking about both the running of back end services and infrastructure at a high level of performance, and also in the user experience. No one likes a slow user experience!
Minimize maintenance costs, and increase the opportunity to invest.
And Google Cloud is designed to help you deal with those challenges by leveraging our technology, infrastructure, operations, and expertise.
I have to mention that it would take me days to go through every product and service Google Cloud Platform offers you. That’s because there are more than 60 services that make up GCP and the number keeps growing. There are services for everything from core compute to storage and networking. From big data to identity and security.
GCP also has services for IT pros and developers - to help them build the next great app and manage that app in production.
Unique dev tools and OPS tools
innovation items:
Big Data, and the launch of Cloud Dataflow
Kubernetes, Containers, Hybrid
Developer Tools and Debugging/Trace/Monitoring
Super Scalable Storage options from SSD to Object, and everything in between
Transparent Maintenance, industry leading local SSD
The most powerful SDN on the planet!!
I’m going to start with security since that’s the foundation of any cloud.
GCP is highly secured.
GCP uses encryption by default, both in transit and at rest.
It starts by requiring all connections to GCP be encrypted. From there, GCP encrypts your data automatically.
Let me move on to Intelligence. Intelligence - how GCP can help you get better understanding of your business
There are many ways GCP can help companies focus on their core competencies by providing systems that intelligently automate much of the tedious work of managing infrastructure. And that has huge implications for every kind of system,
but I’m going to focus on a few APIs that will demonstrate this
If you’ve ever used Google Photos and typed or spoke “birthday party” and had it find the pictures of your child’s birthday party, then you have a sense of what Cloud Vision does.
Cloud Vision lets you look inside an image and annotate the image with information such as:
How many faces are present? Are they happy or sad?
What commercial logos are present in the image?
What objects or scenes are present? In this case, it will tell you with 95% certainty that that is a leopard.
What famous landmarks are present and it will also give you the latitude and longitude of that location.
What text is present and what language the text is in
Whether or not the image has explicit content
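Each question in the list above corresponds to a feature type in the Cloud Vision API's `images:annotate` request. The sketch below only builds the JSON request body; sending it (with credentials) is left out, and the helper name, the `max_results` default, and the placeholder image bytes are assumptions for illustration.

```python
import base64
import json

# Feature types matching the questions listed above, in the same order.
FEATURES = [
    "FACE_DETECTION",
    "LOGO_DETECTION",
    "LABEL_DETECTION",
    "LANDMARK_DETECTION",
    "TEXT_DETECTION",
    "SAFE_SEARCH_DETECTION",
]


def build_annotate_request(image_bytes, features=FEATURES, max_results=5):
    """Build the JSON body for a Vision API images:annotate call."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {
        "requests": [
            {
                "image": {"content": encoded},
                "features": [
                    {"type": name, "maxResults": max_results}
                    for name in features
                ],
            }
        ]
    }


body = build_annotate_request(b"placeholder image bytes")
print(json.dumps(body, indent=2))
```

Requesting only the features you need keeps the response small; for the leopard example, `LABEL_DETECTION` alone would be enough.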
<optional demo>
If you’ve ever said “Ok Google” to your phone or Google Home, you know how good our speech recognition is. Cloud Speech lets you embed that in your applications. It recognizes over 80 languages and variants, everything from Afrikaans and Arabic to Zulu and Icelandic. And it supports both batch uploads of audio as well as real-time streaming and it’s highly accurate in noisy environments.
GCP aims to be customer-friendly and to help you focus on your business instead of a difficult vendor relationship.
GCP gives you the opportunity to customize the size of VMs, per-minute billing, and automatic discounts.
GCP is designed to be user-friendly; you can start using it in minutes.
Google Cloud gives any new user a $300 credit to try GCP.
Finally, I’ll talk about openness because openness is a cornerstone of any successful platform.
Here are 6 projects where Google has been an active contributor of code for over a decade.
Whether it’s an OS like Linux, or developer tools like Python, LLVM, and Git.
There is no vendor lock-in, and GCP is a leader in open source.
explain each and give example
Let’s start with compute
Compute on Cloud Platform -- which you should think about as the heart of your application, where the code runs -- consists of
The longer you leave a VM running during the month, the larger the discount GCP automatically applies, up to a 30% discount when the VM runs full time.
For most customers, this averages out to around a 24% discount from list prices
And you can’t screw it up because it’s automatic. No checkboxes to check, no contracts to sign. It applies automatically.
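The way a discount can grow with usage and reach 30% for a full month can be sketched with a tiered rate. The tier schedule below is an assumption, not something stated in this talk: each successive quarter of the month is billed at a lower fraction of the base rate, chosen so that a full month yields the 30% figure quoted above.

```python
# Assumed tier schedule: each quarter of the month is billed at a lower
# fraction of the base rate than the one before it.
TIER_RATES = [1.0, 0.8, 0.6, 0.4]


def sustained_use_discount(fraction_of_month):
    """Effective discount for a VM that runs the given fraction of a month."""
    if not 0 < fraction_of_month <= 1:
        raise ValueError("fraction_of_month must be in (0, 1]")
    billed = 0.0
    remaining = fraction_of_month
    for rate in TIER_RATES:
        used = min(remaining, 0.25)  # usage that falls inside this tier
        billed += used * rate
        remaining -= used
    return 1 - billed / fraction_of_month


for fraction in (0.25, 0.5, 0.75, 1.0):
    print(fraction, round(sustained_use_discount(fraction), 3))
```

Under this assumed schedule, a VM that runs half the month gets a 10% effective discount and one that runs the whole month gets 30%, with nothing to sign up for in either case.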
GCP gives you custom machine types. You can determine how many cores you need and how much RAM, without paying for more than what you need.
On average, customers save around 19% with custom machine types.
Right-sizing recommendations appear directly in the GCP console, enabling you to easily resize machines to fit your actual needs.
Finally, most cloud providers charge per hour. But GCP lets you pay only for the minutes you use. So if you run a 20 minute batch job, you’ll pay for 20 minutes not a full 60.
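The 20-minute example works out as follows. The hourly rate is a made-up number, and the sketch ignores any minimum charge a provider might apply.

```python
import math


def billed_cost(runtime_minutes, hourly_rate, increment_minutes):
    """Cost when usage is rounded up to a multiple of increment_minutes."""
    increments = math.ceil(runtime_minutes / increment_minutes)
    billed_minutes = increments * increment_minutes
    return billed_minutes * hourly_rate / 60


rate = 0.60  # hypothetical price per VM-hour, in dollars
print(billed_cost(20, rate, increment_minutes=60))  # billed per hour
print(billed_cost(20, rate, increment_minutes=1))   # billed per minute
```

With hourly increments the 20-minute job is charged for all 60 minutes; with per-minute increments it is charged for 20, a third of the cost.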
Preemptible instances
Preemptible VMs
Up to 80% cheaper than regular VMs.
Very easy to use -- just flip one switch.
Many of our biggest customers run huge clusters (10k+ cores) with great success and savings.
They’re great for batch processing scenarios
Google Compute Engine can preempt (i.e. shut down or take away) the VM with 30 seconds of notice.
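Because a preemptible VM can disappear on short notice, batch workers are usually written so they can stop cleanly and resume. The sketch below shows one way to do that; it assumes the worker process receives SIGTERM when the VM starts shutting down, and every name in it is hypothetical.

```python
import signal


class PreemptionFlag:
    """Set when the process is asked to terminate (for example, SIGTERM)."""

    def __init__(self):
        self.stop_requested = False

    def request_stop(self, signum=None, frame=None):
        self.stop_requested = True


def run_batch(items, process, save_checkpoint, flag, start=0):
    """Process items in order, checkpointing if a stop is requested.

    Returns the index of the next unprocessed item.
    """
    for index in range(start, len(items)):
        if flag.stop_requested:
            save_checkpoint(index)
            return index
        process(items[index])
    return len(items)


if __name__ == "__main__":
    flag = PreemptionFlag()
    # Assumption: the worker is sent SIGTERM when the VM begins shutting down.
    signal.signal(signal.SIGTERM, flag.request_stop)
    next_index = run_batch(["a", "b", "c"], print, print, flag)
```

Each unit of work should be short compared with the 30-second notice, since the flag is only checked between items. On restart, the saved index is passed back in as `start`.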
A container image is a lightweight, stand-alone, executable package of a piece of software that includes everything needed to run it: code, runtime, system tools, system libraries, settings.
Containers isolate software from its surroundings, for example differences between development and staging environments, and help reduce conflicts between teams running different software on the same infrastructure.
Anyone working in the world of containers has probably heard of Kubernetes. K8s is a framework for deploying and managing containers in production and it’s based on more than 10 years of experience we have within Google. And we’ve seen a tremendous uptake of K8s across the industry. Key partners like Red Hat, IBM, Microsoft, VMware and others are contributing. And it’s growing every day.
Come to the K8s meetup on Wednesday to hear the details.
But we understand that customers need choice, so we offer a continuum of compute, from virtualized hardware in compute engine, through containers and up to the top of the stack with App Engine which allows you to build and run PaaS apps.
The reality is that there are many different types of applications in various states of existing and new architectures, so we offer this breadth so customers can adopt in the way that makes best sense for them.
Let’s talk about storage
Google Cloud Platform delivers various storage service offerings which remove much of the burden of building and managing a storage infrastructure. Like our other cloud services, cloud storage will free you to focus on doing what you do best and differentiating at the application or service layer.
Our storage offerings include SQL, NoSQL, blob, and block storage depending on what you are trying to do, and it's easy to mix and match.
If you want a disk you can mount Persistent Disk as a block store that can be used by Compute Engine
Cloud Datastore provides a nearly infinitely scalable, schemaless solution (NoSQL)
Cloud SQL gives you fully managed MySQL so you have relational DB and a more traditional approach to queries.
For unstructured data such as files, Cloud Storage is our object store and provides affordable, durable, globally available access.
Nearline is a new service option of ‘Cloud Storage’ which is our cloud object store.
….
Talk about price cuts
I’m sorry this is kind of an eye-chart. You can find out more at Google Cloud Platform blog.
Google created three simple product options to help you improve the performance of your applications while keeping your costs low. These three product options use the same API, providing you with a simple and consistent method of access.
Standard Storage offers our highest performance, always available, storage. Simple HTTP-based API accessible from applications written in any modern programming language.
Durable Reduced Availability storage provides a lower-cost option for data that doesn't require immediate and guaranteed access to storage. Cost savings are made by reducing replicas. Same durability as Standard Storage.
Nearline: Google Cloud Storage Nearline is a new class of storage service for businesses to easily backup and store limitless amounts of data at a very low cost and access it at any time, in a matter of seconds instead of hours.
DRA
Low-Cost: Comparable price to AWS S3 RRS (Reduced-Redundancy) but with Standard-level durability (11-nines) while RRS is only 99.99% durable.
Trade-off is Availability -- 99% monthly SLA vs. 99.9% for Standard Storage
Equivalent durability and performance as Standard
Ideal for:
“cooler” data -- Data not accessed very often, so no big deal if it takes a couple minutes
Batch data -- E.g. compute jobs that can wait a bit if the data is unavailable
Typical Data Lifecycle:
Start “hot” - popular data that must be highly available
Gets “cool” as it ages - less popular - availability can be relaxed, but durability still critical
So, how do you decide which storage option is right for your app?
Boot volumes and Bulk Storage are great cases for Standard PD because they have low IOPS, low throughput, and basically just want to occupy the cheapest reliable space available.
Streaming IO is a use case that works for Standard PD as well. While the other two options have a much lower cost per IOPS, the cost per throughput is generally better on Standard PD. Reading large sequential blocks is what disks are good at.
SQL databases (like MySQL and Postgres), and NoSQL databases (like Cassandra, Mongo, and Redis) tend to be transactional and IOPS heavy. Smaller instances can run on Standard PD, but important production databases should usually be on some form of SSD - either SSD PD or Local SSD depending on how extreme the IOPS needs.
File servers (like NFS, Gluster, and Ceph), can be more streaming or more transactional depending on what the clients are doing. SSD PD is generally a good choice here.
High performance scratch disk where keeping the traffic local is the right architecture should go on Local SSD
For Hadoop deployments, where IO needs exceed what the Google Cloud Storage Connector for Hadoop can provide, you should use Local SSD underneath a Hadoop optimized filesystem
Google Cloud SQL delivers relational databases in Google's Cloud. It is a fully managed, highly available relational database. Create, configure, and use MySQL databases that live in Google's Cloud. We manage replication, encryption, patch management, and backups, so you focus on your applications and services.
Easy to use, flexible configuration means you can create, configure, manage, and monitor your database instances with just a couple of clicks in our console or using a command line tool. Manage your instances using any applications and tools you already use with MySQL.
Google Cloud SQL delivers exceptional security and integration with Google Cloud. All data is encrypted when stored and in flight on Google's network. Connections to instances are accepted only from authorized IP addresses, and can be secured with SSL, and MySQL user grants can control access at the database, table, or even column level. All data is replicated multiple times in multiple locations for great durability and availability. Databases can be created in the United States, European Union, or Asia.
Cloud SQL is tightly integrated with Google App Engine, Compute Engine, Cloud Storage, and other Google services, so you can work across multiple products easily, get more value from your data, move your data into and out of the cloud, and get better performance.
Google Cloud Datastore is a fully managed NoSQL Data Storage Service
Google Cloud Datastore is a fully managed, schemaless database for storing non-relational data. Cloud Datastore automatically scales with your users and supports ACID transactions, high availability of reads and writes, strong consistency for reads and ancestor queries, and eventual consistency for all other queries. Cloud Datastore is built on Bigtable and offers a persistent storage tier for App Engine data.
Each Cloud Datastore instance is fully managed by Google so there is no planned downtime, replication across multiple datacenters, automatic scaling as your traffic increases, and monitoring by Google Engineers.
Cloud Datastore is accessible via HTTP using a JSON or Protocol Buffers API, running on top of the Google APIs infrastructure. Cloud Datastore offers Protocol Buffer client libraries for Java and Python as well as support for the Google APIs client libraries. In addition, Cloud Datastore offers a web-based interface for managing your Cloud Datastore instances, and a development server to support local development.
Unlike traditional relational databases, the Datastore uses a distributed architecture to automatically manage scaling to very large data sets. While the Datastore interface has many of the same features as traditional databases, it differs from them in the way it describes relationships between data objects. Entities of the same kind can have different properties, and different entities can have properties with the same name but different value types.
These unique characteristics imply a different way of designing and managing data to take advantage of the ability to scale automatically. In particular, the Datastore differs from a traditional relational database in the following important ways:
The Datastore is designed to scale, allowing applications to maintain high performance as they receive more traffic:
Datastore writes scale by automatically distributing data as necessary.
Datastore reads scale because the only queries supported are those whose performance scales with the size of the result set (as opposed to the data set). This means that a query whose result set contains 100 entities performs the same whether it searches over a hundred entities or a million. This property is the key reason some types of queries are not supported.
Because all queries are served by pre-built indexes, the types of queries that can be executed are more restrictive than those allowed on a relational database with SQL. In particular, the following are not supported:
Join operations
Inequality filtering on multiple properties
Filtering of data based on results of a subquery
Unlike traditional relational databases, the Datastore doesn't require entities of the same kind to have a consistent property set (although you can choose to enforce such a requirement in your own application code).
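To illustrate why index-served queries scale with the size of the result set, here is a toy in-memory index. It is not how Datastore is implemented; the class, the entity keys, and the properties are invented for the example. A query is a binary search followed by a contiguous slice, so the work after the lookup depends only on how many entries match.

```python
from bisect import bisect_left, bisect_right


class PropertyIndex:
    """Sorted index over one property: parallel lists of values and keys."""

    def __init__(self):
        self._values = []
        self._keys = []

    def add(self, value, key):
        # Insert while keeping both lists ordered by value.
        position = bisect_right(self._values, value)
        self._values.insert(position, value)
        self._keys.insert(position, key)

    def query_equal(self, value):
        """Return keys whose property equals value via one range scan."""
        lo = bisect_left(self._values, value)
        hi = bisect_right(self._values, value)
        return self._keys[lo:hi]


# Entities of the same kind do not need the same set of properties.
entities = {
    "task1": {"owner": "ann", "done": False},
    "task2": {"owner": "bob"},
    "task3": {"owner": "ann", "priority": 4},
}

owner_index = PropertyIndex()
for key, props in entities.items():
    if "owner" in props:
        owner_index.add(props["owner"], key)

print(owner_index.query_equal("ann"))  # ['task1', 'task3']
```

A join or a subquery filter cannot be answered by a single slice of one index like this, which gives some intuition for why those operations are on the unsupported list.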
Let’s start with networking
Edge Point of Presence
So how exactly does Google do all this? Well, let me start at a critical piece that many people overlook.
Google Cloud doesn’t run in one datacenter or even on one continent. It’s a collection of regions across the world. So securing a system like this means more than just securing one datacenter - it means securing the connections between them.
Edge Point of Presence!
We’ve been investing in our own private network connections across the globe for more than a decade. So when you use Cloud Storage to globally replicate data, that data isn’t transmitted over the public internet. We even go so far as to build and operate our own submarine cables connecting the US to Europe and Asia. And our newest cable connects South East Asia with Australia. All this not only makes our network higher quality and more reliable, it makes it more secure.
We also extend this to points of presence and edge cache locations around the world. So when your users connect to your app running in GCP, they are able to connect to a location closest to them, often in the same city. And from there, that traffic is delivered to the right region over Google’s network. Similarly, when you serve traffic, we are able to carry that traffic to the location nearest the user instead of just dumping it to an unknown internet provider.
Carrier Interconnect provides enterprise-grade connections to Google through service provider partners
Direct peering allows you to connect directly to Google for high levels of traffic
And of course secure VPN connections over the internet.
Now, let’s talk a bit about big data
But this time, we aren’t just producing research papers. We’re giving customers direct access to the software systems we built for ourselves. So instead of waiting up to a decade for these innovations to work their way to the public, you can use them today.
<brief description of a couple of the various services>
And to give you a sense of what’s possible with these tools - here are some fun stats for BigQuery.