26.
Things of things </li></ul><li>Have to map between the two </li></ul>
27.
Data Mapping Woes (2) <ul><li>Data in systems evolves over time, which means changes to the schema.
28.
Upgrade/rollback scripts have to operate on the whole database – could be millions of rows.
29.
Doing phased rollouts is hard … the application needs to do work </li></ul>
30.
Alternative! <ul><li>Let the application do it
31.
Use convenient language features </li><ul><li>PHP serialize/unserialize </li></ul><li>… or use standards for mixed platforms </li><ul><li>JSON very popular and well supported
35.
Scalability and Availability <ul><li>Scalability </li><ul><li>How many requests you can process </li></ul><li>Availability </li><ul><li>How your service degrades as things break </li></ul><li>RDBMS solutions - replication and sharding </li></ul>
36.
Scaling RDBMS - Replication <ul><li>Master-Slave replication is easiest
37.
Every change on the master happens on the slave.
38.
Slaves are read-only, so replication does not scale INSERT, UPDATE, or DELETE queries.
39.
Application responsible for distributing queries to correct server. </li></ul>
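A minimal sketch of what "application responsible for distributing queries" could look like, assuming a single master and a list of read-only slaves. The `QueryRouter` class, its method name, and the server names are hypothetical and only illustrate the idea: writes go to the master, reads rotate across the slaves.

```python
import itertools


class QueryRouter:
    """Hypothetical router: writes go to the master, reads rotate over slaves."""

    WRITE_VERBS = {"INSERT", "UPDATE", "DELETE"}

    def __init__(self, master, slaves):
        self.master = master
        # Fall back to the master when no slaves are configured.
        self._read_cycle = itertools.cycle(slaves or [master])

    def server_for(self, sql):
        # Look only at the first keyword of the statement.
        verb = sql.lstrip().split(None, 1)[0].upper()
        if verb in self.WRITE_VERBS:
            return self.master
        return next(self._read_cycle)


router = QueryRouter("master", ["slave1", "slave2"])
write_target = router.server_for("UPDATE users SET name = 'a' WHERE id = 1")
read_target = router.server_for("SELECT * FROM users WHERE id = 1")
```

Adding slaves spreads the reads further, but every write still lands on the one master, which is why this setup does not scale INSERT, UPDATE, or DELETE.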
40.
Scaling RDBMS - Replication <ul><li>Multi-master ring replication </li><ul><li>Can update any master
44.
Replication <ul><li>Replication is usually asynchronous for performance – you don't want to wait for the slowest slave on each update.
45.
Replication takes time – there is time lag between the first and last server to see an update.
46.
You may not read your own writes – you are no longer getting aCid properties. </li></ul>
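The lag described above can be illustrated with a toy in-memory model. This is a sketch under stated assumptions, not how any particular database implements replication: `Master`, `Slave`, and `catch_up` are hypothetical names, and the "network" is just a list of pending changes.

```python
class Master:
    def __init__(self):
        self.data = {}
        self.log = []  # ordered list of changes for slaves to replay

    def write(self, key, value):
        self.data[key] = value
        # Asynchronous: record the change and return without waiting for slaves.
        self.log.append((key, value))


class Slave:
    def __init__(self):
        self.data = {}
        self.applied = 0  # number of log entries already replayed

    def read(self, key):
        return self.data.get(key)

    def catch_up(self, master):
        # Replay every change this slave has not seen yet.
        for key, value in master.log[self.applied:]:
            self.data[key] = value
        self.applied = len(master.log)


master, slave = Master(), Slave()
master.write("name", "alice")
stale = slave.read("name")   # the slave has not replayed the change yet
slave.catch_up(master)
fresh = slave.read("name")   # now the write is visible
```

A client that writes to the master and immediately reads from the slave sees the old state until `catch_up` runs, which is the "may not read your writes" problem.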
47.
Scaling RDBMS – Sharding <ul><li>Do application level splitting of data </li><ul><li>Split large table into N smaller tables
48.
Use Id modulo N to find the right table </li></ul><li>Tables could be spread across multiple database servers </li><ul><li>But the application needs to know where to query </li></ul></ul>
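The "Id modulo N" lookup can be sketched as follows. The table naming scheme (`users_<n>`), the server names, and the way tables are assigned to servers are assumptions made for illustration only.

```python
N_SHARDS = 4
SERVERS = ["db0.example", "db1.example"]  # hypothetical server names


def shard_for(user_id, n_shards=N_SHARDS):
    """Return the (server, table) pair that holds the row for user_id."""
    shard = user_id % n_shards
    table = f"users_{shard}"
    # Assumed placement: tables are spread round-robin across the servers.
    server = SERVERS[shard % len(SERVERS)]
    return server, table


server, table = shard_for(10)
```

The application has to run this lookup before every query, and changing N means moving rows between tables.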
49.
Availability <ul><li>If you want availability you need multiple servers – maybe even multiple sites.
50.
In the real world you get network partitions </li><ul><li>Just because you can't see your other data center doesn't mean users can't. </li></ul><li>What should you do if you can't see the other data center? </li></ul>
51.
Availability <ul><li>Degrade one site to read-only </li><ul><li>Defeats availability </li></ul><li>If you allow both sites to operate </li><ul><li>There's a chance two users could modify the same data.
52.
The application needs to know how to resolve it </li></ul></ul>
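The slides do not say how the application should resolve such conflicts. One common and simple policy is "last write wins", sketched below purely as an illustration. It assumes each site stamps its writes with a timestamp that is comparable across sites, which is itself a strong assumption; the `Version` type and `resolve` function are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Version:
    value: str
    timestamp: float  # assumed to be comparable across sites
    site: str


def resolve(a, b):
    """Last write wins; the site name breaks ties so both sites agree."""
    return max(a, b, key=lambda v: (v.timestamp, v.site))


east = Version("alice@old.example", 100.0, "east")
west = Version("alice@new.example", 105.0, "west")
winner = resolve(east, west)
```

This policy silently discards the losing update, so it only suits data where that is acceptable; other data may need merging or a manual decision.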
53.
The bottom line... <ul><li>Building systems that are </li><ul><li>...Scalable...
60.
App needs to handle inconsistency </li></ul><li>Work for operational staff </li><ul><li>Fixing replication topologies and synchronizing servers is fiddly work. </li></ul></ul>
61.
Last decade's bleeding edge is here <ul><li>Organizations with big problems started experimenting with alternatives
62.
Developed internal systems during the mid 2000s </li><ul><li>Distributed by design
63.
Different data models </li></ul><li>Published details in 2006/2007 </li></ul>