Webinar: Operations for Your Application Series - Session 6 - Deploying the Application

Make 2014 the year you learn something new. Join our 8-part webinar series and discover how easy it is to develop applications with MongoDB. The sessions, led by our Solutions Architects, will teach you the basics from A to Z and share best practices and tips for getting started with confidence. The sessions will be held entirely in Italian.

By this point we will have built the application. Now we need to put it into production. We will walk through the various architectures for high availability and horizontal scalability.

The series includes the following sessions:

10 June 2014: Operations for Your Application Series - Session 7 - Backup and Disaster Recovery:
This webinar covers the various backup and restore options. Learn what to do in the event of a failure and how to back up and recover the data in your applications.

17 June 2014: Operations for Your Application Series - Session 8 - Monitoring and Performance Tuning:
The final webinar in the series discusses which metrics matter and how to manage and monitor your application to improve performance.

Massimo Brignoli: About the speaker

Massimo is 44 years old and lives in Milan. He has worked in IT for 23 years at transport companies, web companies, and database companies. In 1998 he joined a small startup as a developer, helping it become the most important Italian web portal, which was sold 3 years later for 700 million dollars. He then joined MySQL in pre-sales, travelling around the world and helping telecom companies adopt MySQL Cluster. In 2012 he joined SkySQL as a product manager, overseeing the integration with MariaDB, and later decided to join MongoDB to take on new professional challenges. He is currently a Senior Solutions Architect.

  • Initialize -> Election
    Primary elected; data replicates from the primary to the secondaries
  • Primary down/network failure
    Automatic election of a new primary if a majority exists
  • New primary elected
    Replication established from the new primary
  • Down node comes up
    Rejoins the set
    Recovery, then secondary
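
The "if a majority exists" condition in the failover note can be illustrated with a small sketch. The function name `canElectPrimary` and the member counts are hypothetical, and the sketch assumes every member of the set holds exactly one vote; it only shows the strict-majority arithmetic.

```javascript
// Hypothetical helper: can the members that still see each other elect a primary?
// Assumes every member of the replica set has one vote.
function canElectPrimary(totalMembers, reachableMembers) {
  const majority = Math.floor(totalMembers / 2) + 1; // strict majority of the set
  return reachableMembers >= majority;
}

// A three-member set survives the loss of one member, but not of two.
console.log(canElectPrimary(3, 2)); // true
console.log(canElectPrimary(3, 1)); // false
```

Under the same assumption, a four-member set tolerates no more failures than a three-member set, which is one reason odd member counts are common.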
  • Consistency
    Write concerns
    Read preferences
  • Not really fire and forget.

    This return arrow confirms that the network successfully transferred the packet(s) of data.

    This confirms that the TCP ACK response was received.
  • The mongos does not have to load the whole result set into memory since each shard sorts locally. The mongos can just getMore from the shards as needed and incrementally return the results to the client.
  • _id can be unique across shards if it is used as the shard key.
    Uniqueness of any attribute can be guaranteed only if that attribute is used as the shard key with unique set to true.
  • Webinar: Operations for Your Application Series - Session 6 - Deploying the Application

    1. Solution Architect, MongoDB Massimo Brignoli #MongoDBBasics ‘Build an Application’ Webinar Series: Deploying your application in production
    2. Agenda • Replica Sets Lifecycle • Developing with Replica Sets • Scaling your database
    3. Q&A • Virtual Genius Bar – Use chat to post questions – EMEA Solution Architecture / Support Team are on hand – Make use of them during the sessions!
    4. Recap • Introduction to MongoDB • Schema design • Interacting with the database • Indexing • Analytics – Map Reduce – Aggregation Framework
    5. Deployment Considerations
    6. Working Set Exceeds Physical Memory
    7. Why Replication? • How many have faced node failures? • How many have been woken up from sleep to do a failover? • How many have experienced issues due to network latency? • Different uses for data – Normal processing – Simple analytics
    8. Replica Set Lifecycle
    9. Replica Set – Creation
    10. Replica Set – Initialize
    11. Replica Set – Failure
    12. Replica Set – Failover
    13. Replica Set – Recovery
    14. Replica Set – Recovered
    15. Developing with Replica Sets
    16. Strong Consistency
    17. Delayed Consistency
    18. Write Concern • Network acknowledgement • Wait for error • Wait for journal sync • Wait for replication
    19. Unacknowledged
    20. MongoDB Acknowledged (wait for error)
    21. Wait for Journal Sync
    22. Wait for Replication
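
The four write concern levels on these slides can be written down as write concern documents. This is a sketch, not the presenter's code: the option names `w` and `j` are MongoDB's, while the map keys and the `insertOptions` helper are made up for illustration, and `"majority"` is just one possible choice for "wait for replication".

```javascript
// Write concern documents for the four levels named on the slides.
// The keys of this map are hypothetical labels, not MongoDB identifiers.
const writeConcerns = {
  unacknowledged: { w: 0 },          // network acknowledgement only
  acknowledged:   { w: 1 },          // primary confirms the write (wait for error)
  journaled:      { w: 1, j: true }, // primary has written the operation to its journal
  replicated:     { w: "majority" }, // a majority of members have the write
};

// Hypothetical helper that builds the options argument for a shell call such as
// db.blogs.insert(doc, insertOptions("journaled")).
function insertOptions(level) {
  if (!Object.prototype.hasOwnProperty.call(writeConcerns, level)) {
    throw new Error("unknown write concern level: " + level);
  }
  return { writeConcern: writeConcerns[level] };
}

console.log(JSON.stringify(insertOptions("journaled")));
```

Stronger levels trade latency for durability, so the level is usually chosen per operation rather than once for the whole application.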
    23. Tagging • Control where data is written to, and read from • Each member can have one or more tags – tags: {dc: "ny"} – tags: {dc: "ny", subnet: "192.168", rack: "row3rk7"} • Replica set defines rules for write concerns • Rules can change without changing app code
    24. Tagging Example
        {
          _id : "mySet",
          members : [
            {_id : 0, host : "A", tags : {"dc": "ny"}},
            {_id : 1, host : "B", tags : {"dc": "ny"}},
            {_id : 2, host : "C", tags : {"dc": "sf"}},
            {_id : 3, host : "D", tags : {"dc": "sf"}},
            {_id : 4, host : "E", tags : {"dc": "cloud"}}],
          settings : {
            getLastErrorModes : {
              allDCs : {"dc" : 3},
              someDCs : {"dc" : 2}}
          }
        }
        > db.blogs.insert({...})
        > db.runCommand({getLastError : 1, w : "someDCs"})
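
To make the rules in the Tagging Example concrete, the sketch below evaluates them in plain JavaScript. The members and rules are copied from the slide; the `ruleSatisfied` function is hypothetical and only models the idea that a rule such as `{"dc": 2}` is met once the acknowledging members cover two distinct values of the `dc` tag.

```javascript
// Members and rules copied from the Tagging Example slide.
const members = [
  { _id: 0, host: "A", tags: { dc: "ny" } },
  { _id: 1, host: "B", tags: { dc: "ny" } },
  { _id: 2, host: "C", tags: { dc: "sf" } },
  { _id: 3, host: "D", tags: { dc: "sf" } },
  { _id: 4, host: "E", tags: { dc: "cloud" } },
];
const getLastErrorModes = { allDCs: { dc: 3 }, someDCs: { dc: 2 } };

// Hypothetical checker: a rule {tag: n} is satisfied once the members that
// acknowledged the write cover at least n distinct values of that tag.
function ruleSatisfied(rule, ackedHosts) {
  return Object.entries(rule).every(([tag, needed]) => {
    const values = new Set(
      members
        .filter((m) => ackedHosts.includes(m.host) && tag in m.tags)
        .map((m) => m.tags[tag])
    );
    return values.size >= needed;
  });
}

console.log(ruleSatisfied(getLastErrorModes.someDCs, ["A", "B"])); // false: both are in ny
console.log(ruleSatisfied(getLastErrorModes.someDCs, ["A", "C"])); // true: ny and sf
```

Because the rule counts distinct tag values rather than members, two acknowledgements from the same data center never satisfy `someDCs`.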
    25. Wait for Replication (Tagging)
    26. Read Preference Modes • 5 modes – primary (only) - Default – primaryPreferred – secondary – secondaryPreferred – nearest • When more than one node is possible, the closest node is used for reads (all modes but primary)
    27. Tagged Read Preference • Custom read preferences • Control where you read from by (node) tags – E.g. { "disk": "ssd", "use": "reporting" } • Use in conjunction with standard read preferences – Except primary
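
How a mode and a tag set narrow down the candidate members can be sketched as follows. This is a simplified model, not driver code: the `replicaSet` data, the `state` field, and both functions are hypothetical, and the final "closest node" choice among the candidates is left out because it depends on measured latency.

```javascript
// Hypothetical view of a replica set; only reachable members are listed.
const replicaSet = [
  { host: "A", state: "PRIMARY", tags: { disk: "ssd", use: "production" } },
  { host: "B", state: "SECONDARY", tags: { disk: "ssd", use: "reporting" } },
  { host: "C", state: "SECONDARY", tags: { disk: "spinning", use: "production" } },
];

// A member matches when it carries every tag in the requested tag set.
function matchesTags(member, tagSet) {
  return Object.entries(tagSet).every(([k, v]) => member.tags[k] === v);
}

// Returns the members a read may be sent to for a given mode and tag set.
function eligibleMembers(members, mode, tagSet = {}) {
  const primary = members.filter((m) => m.state === "PRIMARY");
  const secondaries = members.filter(
    (m) => m.state === "SECONDARY" && matchesTags(m, tagSet)
  );
  switch (mode) {
    case "primary":
      if (Object.keys(tagSet).length > 0) {
        throw new Error("tags cannot be combined with mode primary");
      }
      return primary;
    case "primaryPreferred":
      return primary.length > 0 ? primary : secondaries;
    case "secondary":
      return secondaries;
    case "secondaryPreferred":
      return secondaries.length > 0 ? secondaries : primary;
    case "nearest":
      return members.filter((m) => matchesTags(m, tagSet));
    default:
      throw new Error("unknown read preference mode: " + mode);
  }
}

const reporting = eligibleMembers(replicaSet, "secondary", { disk: "ssd", use: "reporting" });
console.log(reporting.map((m) => m.host));
```

With the sample data, a reporting read in `secondary` mode can only go to host B, which is the kind of isolation the analytics use case on the next slide relies on.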
    28. Our application • SAFE writes acceptable for our use case • Potential to use secondary reads for comments, but probably not needed • Use tagged reads for analytics
    29. Scaling
    30. Working Set Exceeds Physical Memory
    31. When to consider Sharding? • When a specific resource becomes a bottleneck on a machine or replica set • RAM • Disk IO • Storage • Concurrency
    32. Vertical Scalability (Scale Up)
    33. Horizontal Scalability (Scale Out)
    34. Partitioning • User defines shard key • Shard key defines range of data • Key space is like points on a line • Range is a segment of that line
    35. Data Distribution • Initially 1 chunk • Default max chunk size: 64 MB • MongoDB automatically splits & migrates chunks when max reached
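
The "range is a segment of the line" picture and the 64 MB split rule can be modelled in a few lines. This is an in-memory illustration under stated assumptions, not how the server implements splitting: the chunk shape, numeric shard key values, per-document sizes, and the median split point are all invented for the sketch.

```javascript
const MAX_CHUNK_MB = 64; // default max chunk size from the slide

// Hypothetical chunk: a key range [min, max) plus the documents it holds,
// each with a numeric shard key value and a size in MB.
function splitIfTooLarge(chunk) {
  const totalMB = chunk.docs.reduce((sum, d) => sum + d.sizeMB, 0);
  if (totalMB <= MAX_CHUNK_MB) return [chunk];

  const sorted = [...chunk.docs].sort((a, b) => a.key - b.key);
  const splitKey = sorted[Math.floor(sorted.length / 2)].key; // median key
  const lower = sorted.filter((d) => d.key < splitKey);
  const upper = sorted.filter((d) => d.key >= splitKey);
  // If nothing sorts below the median, this sketch gives up on splitting.
  if (lower.length === 0) return [chunk];

  return [
    { min: chunk.min, max: splitKey, docs: lower },
    { min: splitKey, max: chunk.max, docs: upper },
  ];
}

// Initially one chunk covers the whole key space.
const initial = {
  min: -Infinity,
  max: Infinity,
  docs: [10, 20, 30, 40].map((key) => ({ key, sizeMB: 20 })),
};
console.log(splitIfTooLarge(initial).map((c) => [c.min, c.max]));
```

The give-up branch shows why key granularity matters: when too many documents share one key value, there is no point on the line at which the segment can be cut.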
    36. Architecture
    37. What is a Shard? • Shard is a node of the cluster • Shard can be a single mongod or a replica set
    38. Meta Data Storage • Config Server – Stores cluster chunk ranges and locations – Can have only 1 or 3 (production must have 3) – Not a replica set
    39. Routing and Managing Data • Mongos – Acts as a router / balancer – No local data (persists to config database) – Can have 1 or many
    40. Sharding infrastructure
    41. Cluster Request Routing • Targeted Queries • Scatter Gather Queries • Scatter Gather Queries with Sort
    42. Cluster Request Routing: Targeted Query
    43. Routable request received
    44. Request routed to appropriate shard
    45. Shard returns results
    46. Mongos returns results to client
    47. Cluster Request Routing: Non-Targeted Query
    48. Non-Targeted Request Received
    49. Request sent to all shards
    50. Shards return results to mongos
    51. Mongos returns results to client
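
The difference between a targeted and a non-targeted request can be sketched as a lookup against a chunk table. The table, the shard names, the `userId` field used in the usage lines, and the `targetShards` function are hypothetical, and only an equality match on a numeric shard key is treated as routable here.

```javascript
// Hypothetical chunk table, of the kind the config servers describe:
// each chunk covers the shard key range [min, max) and lives on one shard.
const chunks = [
  { min: -Infinity, max: 100, shard: "shard0" },
  { min: 100, max: 200, shard: "shard1" },
  { min: 200, max: Infinity, shard: "shard2" },
];

// Returns the shards a query must be sent to.
function targetShards(query, shardKey) {
  const value = query[shardKey];
  if (!Number.isFinite(value)) {
    // No usable shard key value in the query: scatter to every shard.
    return [...new Set(chunks.map((c) => c.shard))];
  }
  // Shard key present: route to the single chunk whose range contains it.
  const chunk = chunks.find((c) => value >= c.min && value < c.max);
  return [chunk.shard];
}

console.log(targetShards({ userId: 150 }, "userId"));     // targeted: one shard
console.log(targetShards({ title: "Hello" }, "userId"));  // non-targeted: all shards
```

This is the practical reason the shard key slides ask for a key that occurs in most queries: every query without it pays for a round trip to all shards.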
    52. Cluster Request Routing: Non-Targeted Query with Sort
    53. Non-Targeted request with sort received
    54. Request sent to all shards
    55. Query and sort performed locally
    56. Shards return results to mongos
    57. Mongos merges sorted results
    58. Mongos returns results to client
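
The merge step in this sequence is cheap because every shard has already sorted its own results. A minimal sketch of such a merge is shown below; `mergeSorted` and the sample data are hypothetical, and the sketch works on complete in-memory arrays, whereas the speaker notes describe mongos pulling batches from the shards incrementally.

```javascript
// Merge per-shard result arrays that are each already sorted ascending by `field`.
function mergeSorted(shardResults, field) {
  const positions = shardResults.map(() => 0); // one read position per shard
  const merged = [];
  for (;;) {
    let best = -1; // index of the shard holding the smallest unread value
    for (let i = 0; i < shardResults.length; i++) {
      if (positions[i] >= shardResults[i].length) continue; // shard exhausted
      if (
        best === -1 ||
        shardResults[i][positions[i]][field] <
          shardResults[best][positions[best]][field]
      ) {
        best = i;
      }
    }
    if (best === -1) return merged; // every shard is exhausted
    merged.push(shardResults[best][positions[best]++]);
  }
}

const fromShards = [
  [{ date: 1 }, { date: 4 }],
  [{ date: 2 }, { date: 3 }],
  [],
];
console.log(mergeSorted(fromShards, "date").map((d) => d.date));
```

Only the head of each shard's result stream is compared at any time, so the router never needs to hold or re-sort the full result set.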
    59. Shard Key
    60. Shard Key • Shard key is immutable • Shard key values are immutable • Shard key must be indexed • Shard key limited to 512 bytes in size • Shard key used to route queries – Choose a field commonly used in queries • Only shard key can be unique across shards – `_id` field is only unique within individual shard
    61. A suitable shard key for our app… • Occurs in most queries • Routes to each shard • Is granular enough to not exceed 64 MB chunks • Any candidates? – Author? – Date? – _id? – Title? – Author & Date?
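
One way to compare the candidates on granularity is to count how documents group under each key. The sketch below is an assumption-laden illustration: the `keyProfile` helper and the sample posts are invented, and it says nothing about the other two criteria (query coverage and routing to each shard).

```javascript
// Hypothetical helper: how are documents spread over the values of a candidate key?
function keyProfile(docs, fields) {
  const counts = new Map();
  for (const doc of docs) {
    const key = JSON.stringify(fields.map((f) => doc[f])); // compound key value
    counts.set(key, (counts.get(key) || 0) + 1);
  }
  return {
    distinctValues: counts.size,                 // more values, finer ranges
    largestGroup: Math.max(...counts.values()),  // documents stuck on one value
  };
}

// Invented sample data for the blog application.
const posts = [
  { author: "ann", date: "2014-05-01", title: "Intro" },
  { author: "ann", date: "2014-05-02", title: "Schema" },
  { author: "ann", date: "2014-05-03", title: "Indexes" },
  { author: "bob", date: "2014-05-01", title: "Sharding" },
];

console.log(keyProfile(posts, ["author"]));         // few values, one large group
console.log(keyProfile(posts, ["author", "date"])); // one document per value
```

In this sample, `author` alone leaves three documents on a single key value, while the compound `author` and `date` key separates every document, which is the granularity the 64 MB criterion asks for.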
    62. Summary
    63. Things to remember • Size appropriately for your working set • Shard when you need to, not before • Pick a shard key wisely
    64. Next Session – 10th June • Backup and Disaster Recovery • Backup and restore options
    65. Thank you
