Spanner is Google's globally distributed database. It replicates data synchronously across data centers to provide strong consistency. TrueTime quantifies the worst-case clock error between data centers, which lets Spanner give users a consistent view of data. Tables are divided into shards called "splits", and each split is replicated across multiple zones for high availability. The deck closes with the CAP theorem: Spanner claims to be both consistent and available, and the summary classifies it as a CA/CP system.
3. Background
Annual revenue of Google from 2005 to 2018 (in billion U.S. dollars)
Year    Revenue (billion U.S. dollars)
2005    6.1
2010    29.3
2015    74.54
2018    136.22
Sources: Google; Statista 2019
5. What is Spanner?
• Globally distributed: fully managed database service with global scale
• Semi-relational database: traditional relational semantics (schemas, ACID transactions, SQL)
• Synchronously replicated: automatic, synchronous replication within and across regions for availability
6. Architecture Overview
[Diagram: a regional instance spanning Zone 1, Zone 2 and Zone 3. The diagram is labeled with Compute and Storage, and each zone shows databases DB 1 to DB n.]
Source: Robert K., Spanner - a fully managed horizontally scalable relational database...
7. Architecture Overview
[Diagram: within a zone, an instance holds databases DB 1 to DB n. The databases contain Table 1 to Table n, and the tables are divided into Split 1 to Split n.]
Source: Robert K., Spanner - a fully managed horizontally scalable relational database...
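The table-to-split relationship on this slide can be sketched as a lookup from a row key to the split that owns it. This is only an illustration, assuming each split covers a contiguous range of keys; the boundary values and the names `SPLIT_BOUNDARIES` and `split_for_key` are hypothetical and do not come from the deck.

```python
from bisect import bisect_right

# Hypothetical split boundaries (assumption, not from the slides).
# Split 1 owns keys below "g", Split 2 owns ["g", "n"),
# Split 3 owns ["n", "t"), and Split 4 owns everything from "t" upward.
SPLIT_BOUNDARIES = ["g", "n", "t"]


def split_for_key(key: str, boundaries=SPLIT_BOUNDARIES) -> int:
    """Return the 1-based number of the split whose key range contains `key`."""
    # bisect_right counts how many boundaries are <= key, which is exactly
    # the number of splits that lie entirely before this key.
    return bisect_right(boundaries, key) + 1
```

With these boundaries, a key equal to a boundary belongs to the split that starts at that boundary.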
8. Architecture Overview
[Diagram: a regional instance with three zones, labeled with Compute and Storage. Each zone holds a replica of Split 1, Split 2 and Split 3. The replicas of Split 1 across the zones form the Paxos group for Split 1.]
* TrueTime is used for leader leases: there is only one leader for a split at any given time.
Source: Robert K., Spanner - a fully managed horizontally scalable relational database...
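The two ideas on this slide, a Paxos group per split and a single leader enforced by leases, can be sketched as follows. This is a simplified illustration and not Spanner's implementation: `majority`, `Lease` and `may_become_leader` are hypothetical names, and the lease check assumes the caller supplies the lower bound of a TrueTime-style interval.

```python
from dataclasses import dataclass
from typing import Optional


def majority(n_replicas: int) -> int:
    """Smallest number of replicas that forms a majority of the group."""
    return n_replicas // 2 + 1


@dataclass
class Lease:
    holder: str        # replica currently acting as leader (hypothetical id)
    expires_at: float  # time in seconds at which the lease ends


def may_become_leader(candidate: str, lease: Optional[Lease], votes: int,
                      n_replicas: int, earliest_now: float) -> bool:
    """Return True if `candidate` may lead the split's group.

    `earliest_now` is the lower bound of a TrueTime-style interval. If even
    the earliest possible current time is past the old lease, that lease has
    certainly expired, so two leaders never overlap.
    """
    if votes < majority(n_replicas):
        return False  # not enough replicas agreed
    if lease is None or lease.holder == candidate:
        return True   # no competing lease to wait for
    return earliest_now > lease.expires_at
```

Comparing against the lower bound rather than a single clock reading is the conservative choice: the candidate waits a little longer, but never takes over while the old lease could still be valid.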
9. Architecture Overview – TrueTime
• Invented during the creation of Spanner
• Quantifies the worst possible error (drift) between clocks in all datacenters around the world, relative to a global clock
• TrueTime.now() returns an interval [t1, t2], with t2 = t1 + 2ε
• t1 is guaranteed to be lower than the value of the global clock at the instant when now() finishes executing
• t2 is guaranteed to be higher than the value of the global clock at the instant when now() starts executing
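The interval returned by TrueTime.now() can be illustrated with a small sketch. It assumes a single local clock and a fixed ε of 2 ms; `tt_now`, `tt_after`, `commit_wait` and `EPSILON` are hypothetical names, not the real TrueTime API. The commit-wait step is not on the slide; it follows the description in Corbett et al. 2012 (listed in the references), where a transaction waits until its commit timestamp is certainly in the past.

```python
import time
from dataclasses import dataclass

EPSILON = 0.002  # assumed uncertainty bound in seconds (hypothetical value)


@dataclass(frozen=True)
class TTInterval:
    earliest: float  # t1: lower bound on the global clock
    latest: float    # t2: upper bound on the global clock, t2 = t1 + 2 * epsilon


def tt_now(clock=time.time, epsilon=EPSILON) -> TTInterval:
    """Return an interval assumed to contain the true global time."""
    t = clock()
    return TTInterval(earliest=t - epsilon, latest=t + epsilon)


def tt_after(ts: float, clock=time.time, epsilon=EPSILON) -> bool:
    """True only if `ts` has certainly passed on every clock."""
    return tt_now(clock, epsilon).earliest > ts


def commit_wait(commit_ts: float, clock=time.time, sleep=time.sleep,
                epsilon=EPSILON) -> None:
    """Block until `commit_ts` is certainly in the past."""
    while not tt_after(commit_ts, clock, epsilon):
        sleep(epsilon / 4)
```

If a transaction picks `tt_now().latest` as its commit timestamp, `commit_wait` returns after roughly 2ε, which is why a small ε matters for write latency.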
10. Architecture Overview – TrueTime
[Diagram: three computer nodes, each shown with an atomic master and a GPS master.]
Sync every 30 sec; synchronization within 50 µs; the guaranteed interval ε is around 2 ms.
11. Life of query: Consistent Read
[Diagram: three zones, each holding replicas of Split 1, Split 2 and Split 3. The replica roles are labeled slave, leader, slave.]
1. Request
2. Ok to read
3. No / Yes
4. Wait for data / Response
Source: Robert K., Spanner - a fully managed horizontally scalable relational database...
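One possible reading of the four labels on this slide: the replica that receives the request asks the leader whether it may answer, and if it is behind, it waits for the missing data before responding. The sketch below follows that reading under simplifying assumptions (integer timestamps, an in-memory dictionary per replica); `Replica`, `strong_read` and `catch_up` are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional


@dataclass
class Replica:
    """Toy replica of one split (hypothetical, in-memory only)."""
    applied_ts: int = 0                      # timestamp of the last applied write
    data: Dict[str, str] = field(default_factory=dict)

    def apply(self, ts: int, key: str, value: str) -> None:
        self.data[key] = value
        self.applied_ts = ts


def strong_read(replica: Replica, leader: Replica, key: str,
                catch_up: Callable[[Replica], None]) -> Optional[str]:
    """Serve a consistent read from `replica`."""
    # Steps 2 and 3: ask the leader which timestamp the replica must have
    # applied before it is allowed to answer.
    required_ts = leader.applied_ts
    # Step 4: wait for data until the replica has caught up, then respond.
    while replica.applied_ts < required_ts:
        catch_up(replica)
    return replica.data.get(key)
```

The `catch_up` callback stands in for replication traffic from the leader; it must apply at least one missing write per call, otherwise the loop does not terminate.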
12. Life of query: Stale Read
[Diagram: three zones, each holding replicas of Split 1, Split 2 and Split 3. The replica roles are labeled slave, leader, slave.]
1. Request (max 15s old)
2. Am I up-to-date
4. Wait for data / Response
Source: Robert K., Spanner - a fully managed horizontally scalable relational database...
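The "Am I up-to-date" check for a stale read can be sketched as a comparison between the replica's last applied timestamp and the staleness bound in the request. This is an illustration only; `choose_read_path` and its return values are hypothetical, and the 15 second default simply mirrors the example on the slide.

```python
def choose_read_path(replica_applied_ts: float, now: float,
                     max_staleness: float = 15.0) -> str:
    """Decide how a replica handles a read that tolerates stale data.

    All arguments are in seconds. The replica may answer from its own copy
    if that copy is no older than `max_staleness`.
    """
    if replica_applied_ts >= now - max_staleness:
        return "serve_locally"   # fresh enough, no need to contact the leader
    return "wait_for_data"       # too old, wait until newer data arrives
```

The trade-off is that a larger staleness bound lets more reads be answered by the nearest replica without waiting.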
13. Life of query: Read
[Diagram: three zones, each holding replicas of Split 1, Split 2 and Split 3. The replica roles are labeled slave, leader, slave.]
1. txn.Query()
2. Acquire locks
3. Query result
4. txn.BufferWrite
5. Write (shown twice in the diagram)
6. Ack (shown twice in the diagram)
7. Release locks
Source: Robert K., Spanner - a fully managed horizontally scalable relational database...
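The seven steps on this slide can be sketched as a toy transaction against one leader and its other replicas. This is a heavily simplified illustration, assuming one coarse lock per split and in-memory replicas; `Replica` and `Transaction` are hypothetical classes, not the Spanner client API.

```python
import threading


class Replica:
    """Toy in-memory replica of one split (hypothetical)."""

    def __init__(self):
        self.data = {}

    def write(self, writes):
        self.data.update(writes)
        return True  # step 6: acknowledgement


class Transaction:
    def __init__(self, leader, followers, lock):
        self.leader = leader
        self.followers = followers
        self.lock = lock          # one coarse lock per split (assumption)
        self.buffer = {}
        self.holding = False

    def _acquire(self):
        if not self.holding:
            self.lock.acquire()   # step 2: acquire locks
            self.holding = True

    def query(self, key):
        self._acquire()           # step 1: txn.Query()
        return self.leader.data.get(key)  # step 3: query result

    def buffer_write(self, key, value):
        self.buffer[key] = value  # step 4: writes are buffered until commit

    def commit(self):
        self._acquire()
        try:
            # Step 5: send the buffered writes to the other replicas.
            acks = sum(1 for f in self.followers if f.write(self.buffer))
            total = len(self.followers) + 1
            # Step 6: the leader counts itself and needs a majority of acks.
            if acks + 1 <= total // 2:
                raise RuntimeError("no majority of replicas acknowledged")
            self.leader.write(self.buffer)
        finally:
            self.lock.release()   # step 7: release locks
            self.holding = False
```

Holding the lock from the first query until commit is what keeps another transaction from changing the value between the read and the write.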
15. Spanner claims to be consistent and available
It is impossible for a distributed computer system to simultaneously provide more than two out of three of the following guarantees: Consistency, Availability, Partition Tolerance. Pick two!
• Consistency: always see the same data as others at the same point in time
• Availability: always able to read and write
• Partition tolerance: works even in the case of a network partition
[Diagram: the three pairwise combinations CA, CP and AP.]
16. Summary
• What Google Spanner is, the idea behind it and what it offers as a database
• The architecture behind Google Spanner
• Read / Write transactions
• Spanner: a CA/CP system
17. References
[1] David F. Bacon et al. 2017. Spanner: Becoming a SQL System. In Proceedings of the 2017 ACM International Conference on Management of Data (SIGMOD '17). ACM, New York, NY, USA, 331-343. DOI: https://doi.org/10.1145/3035918.3056103
[2] James C. Corbett et al. 2012. Spanner: Google's globally-distributed database. In Proceedings of the 10th USENIX conference on Operating Systems Design and Implementation (OSDI'12). USENIX Association, Berkeley, CA, USA, 251-264.
[3] Jeff Shute et al. 2013. F1: a distributed SQL database that scales. Proc. VLDB Endow. 6, 11 (August 2013), 1068-1079. DOI: http://dx.doi.org/10.14778/2536222.2536232
[4] Eric B. 2017. Spanner, TrueTime & The CAP Theorem. VP, Infrastructure, Google. February 14, 2017: https://ai.google/research/pubs/pub45855
[5] Robert K. 2017. Spanner - a fully managed horizontally scalable relational database. DEVOXX Poland (June 2017) [Video], https://www.youtube.com/watch?v=IFbydfGV2lQ