Navigating Your Data Landscape With Siddharth Desai and Elena Cuevas | Current 2022
The way organizations use data to create unique customer experiences is being completely reimagined. Every time a customer clicks, types, or swipes, data is in constant motion, spanning systems, environments, and applications. This in turn requires businesses to manage complex integrations, synchronizations, and processing of data spread across cloud and on-prem environments.
To accelerate time to value, data needs to be seamlessly ingested, integrated and/or replicated to a cloud environment to take advantage of its analytical, BI and AI use cases. Google Cloud delivers a simplified approach for all your Data Movement needs through a highly differentiated product portfolio.
In this session, learn how organizations can unlock data value using best-in-class, cloud-native products from Google Cloud and its partners, such as Confluent.
2. Elena Cuevas
Manager, Cloud Partner Solutions Engineering
Confluent
Siddharth Desai
Partner Engineer, Data Engineering & Analytics ISVs
Google
3. 2/3 of Data Produced is NEVER Analyzed
Closing the Data Value Gap
181 ZB exp. in 3 Years
10X in last 8 Years
Data Value
68% of companies are unable to realize tangible & measurable Value from Data.
4. The innovation leader in applying Data and AI to real-world situations
Search
Search ranking
Speech recognition
Self Driving Car
20B miles driven
Translate
Text, graphic and
speech translations
Smarter & Cleaner
Infrastructure
2X more efficient
Photos
Photos search
AlphaGo
First AI to beat a world
Go champion (2016)
Gmail
Smart reply
Spam classification
YouTube
Video
recommendations
Better thumbnails
5. Deliver serverless analytics, not infrastructure
Build for enterprise at any scale
Embed ML and drive an end-to-end lifecycle
Empower analytics across the entire data lifecycle
Enable the best OSS technologies
Google Cloud is significantly simplifying big data analytics
6. A comprehensive data analytics platform
Data ingestion at any scale
Reliable streaming data pipeline
Advanced analytics
Data warehousing and data lake
Cloud Pub/Sub
Data Transfer Service
Cloud IoT Core
Storage Transfer Service
Cloud Dataflow
Cloud Dataproc
Cloud Dataprep
Apache Beam
BigQuery
Cloud Storage
Cloud AI Services
Google Data Studio
TensorFlow
Sheets
Looker
Confluent
7. Real-time insights from streaming data
Google BigQuery
Google Cloud’s enterprise data warehouse for analytics
Gigabyte to petabyte scale storage and SQL queries
Encrypted, durable, and highly available
Fully managed and serverless for maximum agility and scale
Built-in ML for out-of-the-box predictive insights
High-speed, in-memory BI Engine for faster reporting and analysis
Analyze and visualize geospatial data
8. BigQuery: architecture
Serverless. Decoupled storage and compute for maximum flexibility.
SQL:2011 Compliant
REST API
Web UI, CLI
Client libraries in 7 languages
Streaming ingest
Free bulk loading
Identity management
Distributed Memory Shuffle Tier
BigQuery
Google Cloud
Security
Petabit network
Hardware infrastructure
Collect, Process, Activate
Store, Analyze, Empower
Replicated, Distributed Storage (99.9999999999%)
Highly Available Cluster Compute (Dremel)
Confluent
9. Limitless data scale
“…BigQuery continued to scale in storage, compute, concurrency, ingest and reliability as we added more and more users, traffic, and data.”
Nikhil Mishra, Sr. Director of Engineering, Verizon Media
275 EB analyzed across BigQuery in December 2021
110+ TB data analyzed per second
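As a rough sanity check on how the two figures on this slide relate, the monthly total can be converted into an average per-second rate. This is only a back-of-the-envelope sketch: it assumes decimal units (1 EB = 10^18 bytes, 1 TB = 10^12 bytes) and a 31-day December, and the slide does not say whether the 110+ TB figure is an average or a peak.

```python
# Back-of-the-envelope: average scan rate implied by 275 EB analyzed in one month.
# Assumptions (not stated on the slide): decimal units and a 31-day month.
EB = 10**18  # bytes in one exabyte (decimal)
TB = 10**12  # bytes in one terabyte (decimal)


def average_tb_per_second(total_eb: float, days: int) -> float:
    """Convert a monthly total in EB into an average rate in TB per second."""
    seconds = days * 24 * 60 * 60
    return total_eb * EB / seconds / TB


december_avg = average_tb_per_second(275, 31)
print(f"{december_avg:.1f} TB/s on average")
```

Under these assumptions the monthly average comes out at roughly 100 TB per second, the same order of magnitude as the per-second figure quoted on the slide.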
10. Looker Data Platform
In-database architecture | Semantic modelling layer | API-first & Cloud native
Modern BI & Analytics: Serve up real-time, relevant reports and dashboards that act as starting points for more in-depth analysis
Integrated Insights: Infuse relevant information into the tools and products people already use
Data-driven Workflow: Super-charge operational workflows with complete, near-real time data
Custom Applications: Build purpose-specific tools to deliver data in an experience tailored to the job
11. Tee up real-time insights to your business
Build the foundation for ML and AI
Simplify data operations at scale
Build your own models: your data + your model
Train our state-of-the-art models: your data + our model
Call our perception APIs: our data + our model
Comprehensive, real-time insights
Governance & standards maintenance
Integration with existing tools
Customer
Network
Cost & Operations
Sales & Marketing
Operations
Data sources
Ads, GA 360, EDW / HDFS, Salesforce, GMP, Flat Files, API, SAP, YouTube, Social Media
Data types
Data Lifecycle on the Google Cloud
Looker
Intelligently scale analytics
BigQuery
Integrated ML, Multi-cloud
Query acceleration
Geospatial
Vertex AI
Dataflow
Unified batch and streaming
Real-time templates
Confluent (Tech Partner)
12. Data Solutions on GCP
Marketing Analytics: Combine marketing data with customer-provided data on BigQuery with Looker to gain insights and improve sales & marketing effectiveness.
Streaming Analytics: Provide a simple, fast, and operationally sustainable solution to stream data into BigQuery and derive analytics for a variety of workloads and industry use cases.
EDW Modernization: Modernize and migrate your existing data warehouse from Teradata, Netezza, Exadata, Oracle, SQL, Redshift, Snowflake or Synapse environments to BigQuery.
BI Modernization with Looker: Modernization and consolidation of customers' legacy BI environment with an enterprise BI platform.
Cortex/SAP DW Modernization: Modernization of complex, high-volume data needs essential for Google Cloud Cortex Framework.
Premier data integration partner across solutions
13. Consumer VPC
Producer VPC
Producer VPC
Producer VPC
Private Service Connect support for Confluent Cloud
Cloud SQL
Confluent
Borg
BigQuery
1st Party Google Services
3rd Party Services
● A private IP address in the consumer’s VPC for every managed service instance.
● Network-to-service (many-to-one) connectivity
● Service-oriented security model
● Fewer shared constraints and no IP coordination required
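The "no IP coordination" point can be illustrated with a small model. The sketch below is not Google Cloud API code: the subnet, service names, and the `allocate_endpoints` helper are all hypothetical, and it ignores details such as reserved addresses. It only shows the idea that each managed service is reached through an address taken from the consumer's own range, so the producers' internal ranges never have to be reconciled with the consumer's.

```python
import ipaddress

# Hypothetical consumer subnet and service names, for illustration only.
CONSUMER_SUBNET = "10.10.0.0/24"
SERVICES = ["confluent-cloud", "cloud-sql", "third-party-service"]


def allocate_endpoints(consumer_subnet: str, services: list[str]) -> dict[str, str]:
    """Give each service one private endpoint IP from the consumer's own subnet."""
    hosts = iter(ipaddress.ip_network(consumer_subnet).hosts())
    return {service: str(next(hosts)) for service in services}


endpoints = allocate_endpoints(CONSUMER_SUBNET, SERVICES)

# Producer-side ranges may overlap with each other or with the consumer;
# the consumer only ever routes to its own endpoint addresses.
for service, ip in endpoints.items():
    print(f"{service}: reachable at {ip} inside the consumer VPC")
```

Because every endpoint lives in the consumer's address space, adding another producer service means allocating one more local address rather than negotiating non-overlapping ranges, as network peering would require.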
15. Today’s customers expect data in motion
Initiate an action
Instant confirmation
Source: https://www.gigaspaces.com/blog/amazon-found-every-100ms-of-latency-cost-them-1-in-sales
~100 ms in latency can cost you…
… 20% of digital traffic
… $400M in revenue
16. A new paradigm is required for Data in Motion
Real-time & Historical Data
A sale
A shipment
A trade
A customer interaction
Real-Time Stream Processing
Rich, front-end customer experiences
Real-time, software-driven business operations
17. Kafka is the de facto standard for Data in Motion
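70+% of Fortune 500 Companies Using Apache Kafka
Industries shown: Finance & Banking, Insurance, Telecom, Travel & Retail, Transportation, Energy & Utilities, Entertainment, Technology
[Figure: per-industry adoption counts; the pairing of counts to industries was lost in extraction. Values shown: 10 out of 10, 8 out of 8, 8 out of 10, 9 out of 10, 10 out of 10, 10 out of 10, 10 out of 10, 8 out of 10]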
18. Operationalizing Kafka on your own is difficult
“We are in the business of selling and renting clothes. We are not in the business of managing an event streaming platform… If we had to manage everything ourselves, I would’ve had to hire at least 10 more people to keep the systems up and running.”
● Architecture planning
● Cluster sizing
● Cluster provisioning
● Broker settings
● Zookeeper management
● Partition placement & data durability
● Source/sink connectors development & maintenance
● Monitoring & reporting tools setup
● Software patches and upgrades
● Security controls and integrations
● Failover design & planning
● Mirroring & geo-replication
● Streaming data governance
● Load rebalancing & monitoring
● Expansion planning & execution
● Utilization optimization & visibility
● Cluster migrations
● Infrastructure & performance upgrades / enhancements
[Figure: adoption curve plotting Value against Investment & Time, with five stages numbered 1 to 5]
Stage labels: Experimentation / Early Interest; Central Nervous System; Mission critical, disparate LOBs; Identify a Project; Mission-critical, connected LOBs
Key challenges:
Operational burden & resources: Manage and scale platform to support ever-growing demand
Security & governance: Ensure streaming data is as safe & secure as data-at-rest as Kafka usage scales
Real-time connectivity & processing: Leverage valuable legacy data to power modern, cloud-based apps & experiences
Global availability: Maintain high availability across environments with minimal downtime
20. Cloud-Native
Apache Kafka®, fully managed and re-architected to harness the power of the cloud
Serverless
● Elastic scaling up & down from 0 to GBps
● Auto capacity mgmt, load balancing, and upgrades
High Availability
● 99.99% SLA
● Multi-region / AZ availability across cloud providers
● Patches deployed in Confluent Cloud before Apache Kafka
Infinite Storage
● Store data cost-effectively at any scale without growing compute
DevOps Automation
● API-driven and/or point-and-click ops
● Service portability & consistency across cloud providers and on-prem
Network Flexibility
● Public, VPC, and Private Link
● Self-managed option for air-gapped environments
Elastic: Instantly scale to meet any demand
Seamlessly provision and deploy fully managed, elastically scaling clusters with infinite storage that expand & shrink to cost-effectively support all streaming use cases
Reliable: Power all your streaming apps & analytics with resilience
Maintain high availability of your clusters and data streams with our 99.99% uptime SLA, multi-zone / multi-region clusters, and no-touch Kafka patches & upgrades
Agile: Focus on innovation, not infrastructure
Fully automate management of serverless clusters through code via Terraform integration and REST APIs, paying only for what you use when you use it
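To put the 99.99% uptime figure in concrete terms, the permitted downtime can be worked out directly. This is a minimal arithmetic sketch: the 30-day and 365-day windows are assumptions, and the slides do not say how the SLA is actually measured or which events it excludes.

```python
def downtime_budget_minutes(sla_percent: float, days: float) -> float:
    """Maximum downtime, in minutes, that still meets the SLA over the given window."""
    total_minutes = days * 24 * 60
    return (1 - sla_percent / 100) * total_minutes


# Assumed measurement windows; the real SLA terms may differ.
monthly = downtime_budget_minutes(99.99, 30)
yearly = downtime_budget_minutes(99.99, 365)
print(f"30 days:  {monthly:.2f} minutes")
print(f"365 days: {yearly:.2f} minutes")
```

Under these assumptions, 99.99% leaves a little over four minutes of downtime in a 30-day month, or under an hour across a full year.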
21. Complete
Go above & beyond Kafka with all the essential tools for a complete data streaming platform
Connectors: Connect to and from any app & system
Integrate with all the most popular sources and sinks using our portfolio of 120+ pre-built connectors
Stream Processing: Quickly build and deploy streaming apps & pipelines
Enrich your data streams with stream processing powered by ksqlDB, a declarative approach using lightweight SQL syntax to abstract away low-level ops
Security & Governance: Secure, discover, and organize your data streams
Build trust and put your data streams to work with enterprise-grade security and the only stream governance suite for data in motion
Connectors, Security, Data Governance, Stream Processing, Monitoring, Global Resilience
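To make the stream processing idea concrete, here is a minimal sketch of one common operation, a tumbling-window count per key. It is plain Python rather than ksqlDB: a declarative engine would express the same aggregation in a single SQL-like statement and run it continuously, whereas this version just walks an in-memory list. The `Event` type, field names, and sample data are hypothetical.

```python
from collections import Counter
from typing import Iterable, NamedTuple


class Event(NamedTuple):
    ts_ms: int  # event time in milliseconds
    key: str    # e.g. a customer or product identifier


def tumbling_counts(events: Iterable[Event], window_ms: int) -> dict[tuple[int, str], int]:
    """Count events per (window start, key) using fixed, non-overlapping windows."""
    counts: Counter = Counter()
    for event in events:
        # Align the timestamp down to the start of its window.
        window_start = event.ts_ms - event.ts_ms % window_ms
        counts[(window_start, event.key)] += 1
    return dict(counts)


# Hypothetical sample stream: three events for "a" and one for "b".
events = [Event(1_000, "a"), Event(4_000, "a"), Event(6_000, "a"), Event(7_000, "b")]
result = tumbling_counts(events, window_ms=5_000)
for (window_start, key), count in sorted(result.items()):
    print(f"window starting at {window_start} ms, key={key}: {count}")
```

The low-level work here (window alignment, state per key) is the kind of detail a declarative stream processing layer is meant to hide.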
22. Everywhere
Connect your data in real time with a platform that spans from on-prem to cloud and across clouds
Run Anywhere: Deploy across any environment
Provision Confluent as a fully managed service on Google Cloud across multiple regions w/ Confluent Cloud, or on-premises w/ Confluent Platform
Unified: Unify data across hybrid and multi-clouds
Provide consistent, self-service access to real-time data across all your environments with Cluster Linking and globally connected clusters that perfectly mirror data
Consistent: Learn one platform for all environments
Remove the burden of learning new tools for each environment with a consistent experience spanning across cloud, on-prem, and hybrid / multi-cloud
24. Confluent is a bridge to Google Cloud
Google Cloud
On-Prem
Oracle, HDFS, Teradata, Netezza, Mainframe
Dataflow, BigQuery, Cloud Storage, Data Studio, Cloud Functions, AI Platform, Bigtable, Dataproc, Cloud Run
KSQL
App
Confluent Platform, Confluent Cluster Linking
OSS Kafka, Confluent Cluster Linking
Oracle, HDFS, Teradata, Netezza, Mainframe
App
Cloud Apps
3rd party data
Additional CSPs
Confluent Cluster Linking
25. Google + Confluent Solutions
Real Time Analytics
ML/AI, Customer 360
Hybrid / Multi-Cloud
Multi-cloud Architectures, Mainframe Modernization
Application Modernization
DW/DB Modernization, Micro-services
OSS to Managed Kafka
26. Cloud Networking Scenarios
Access from
● On premise
○ Colocation
○ Data center
Access across
● Regions in same cloud
Access for
● Corporate office
● Home office
● 3rd party provider
● Field gateway
Access between
● Multiple clouds
Access over
● Public or Private connectivity
27. Private Service Connect on Confluent Cloud
No coordination of IP address space and routing
Private endpoints simplify previously complex network topology
Unidirectional service consumption
Traffic flows securely over GCP backbone network
Confluent Recommended
Recommended and preferred private connectivity pattern to Confluent Cloud
28. Why use Private Service Connect on Google Cloud?
Connection reuse and zone selection
One-time setup for multiple Kafka clusters within a Confluent Cloud network with zone selection for high-availability.
Simplified network topology
Increased deployment agility and scalability compared to other private connectivity options
Focus on application development
Reduce time spent managing resources and debugging connectivity with private endpoints and self-service provisioning
Omar, the Operator
Amy, the Architect
David, the Developer
29. Challenge: Replace a batch-oriented, database-centric legacy data platform that was slowing the business and becoming increasingly difficult to scale
Solution: Work with Confluent and Google to set data in motion with a cloud-based, real-time streaming architecture
Results:
● Accelerated business transformation by more than a year
● Cut administrative costs
● Leveraged more internal expertise
● Eliminated hour-long latencies
“Confluent and Google Cloud enabled us to address our large database footprint and retire our legacy data platform, which was in many ways our Achilles’ heel. After moving to real-time streaming on a cloud-based modern architecture, we can now deliver new features and capabilities to our customers and know that they won’t be slowed by an outdated architecture.”
— Jason Mattioda, Rodan + Fields
32. Next Steps
Be sure to visit the Google Cloud or Confluent booths here at Current to learn more about how you can transform your data landscape
Talk to your Confluent and Google contacts about how Private Service Connect can help you secure your workloads on GCP
Register and join us at Google Next 2022 (next week!)
Try Confluent Cloud for free via the GCP Marketplace