A Fotopedia presentation given at MongoDay 2012 in Paris, at the Xebia office.
Talk by Pierre Baillet and Mathieu Poumeyrol.
French Article about the presentation:
http://www.touilleur-express.fr/2012/02/06/mongodb-retour-sur-experience-chez-fotopedia/
Video to come.
MongoDB vs MySQL. A DevOps point of view
1. MongoDB vs MySQL
A DevOps point of view.
Pierre Baillet <oct@fotopedia.com> @octplane
Mathieu Poumeyrol <kali@fotopedia.com>
2. Summary
«The question is, which is to be master» Humpty Dumpty
Who we are
Context and constraints
High availability and day-to-day operations
Scalability
3. Who we are
Fotopedia (not photopedia, not fotolia)
Created in 2006, Paris-based
Around 20 people, some of them former Apple employees
Pictures for humanity, a cross between Flickr and Wikipedia
4. What we do
Website http://www.fotopedia.com (100% free website, become a member and show us your best photos)
2011 Crunchies award for Best Tablet Application
5. Some statistics
150 million photo views
1 MySQL database
4 MongoDB ‘clusters’ (spread on 4 servers)
Around 500GB of structured data
6. Context and Constraints
«The night can only get a thousand times worse» Romeo & Juliet
24/7 Website and web-services
Continuous deployment
Current Infrastructure
What we expect from our NoSQL DBMS and our
compromise
7. 24/7
Several million users around the world, with between 300 and several thousand connected at any one time.
They use either the website or one of the 7 available iOS applications. Business critical.
When the website is down at the application level, everything starts to fail gradually
We cannot stop the website completely. Ever.
9. Continuous deployment
Git-based development flow with several active branches
Development branch deployed every Wednesday
An average of 3 minor hot-fixes every workday
Agile: any developer can push their hot-fixes to production, at any time
We cannot easily schedule migrations. They should be as transparent as possible.
11. Software Stack
RoR Website
Several open-source components in the web stack: HAProxy, Nginx, Unicorn, mongo-resque, ...
Some well-known NoSQL tools:
MySQL, used to manage what was the core of our data
MongoDB, in production since September 2009, now managing more than 70% of our data
12. Monitoring tools
Munin, Nagios
Custom log feeder built around MongoDB (cf. SlideShare presentation: "mongodb as a log collector")
MongoDB is also used to store slow transactions,
exceptions and profiling traces for later inspection
13. Hosting Platform: 100% AWS
Instances are not highly reliable
but they are both abundant and disposable
Disk is abundant and disposable too
We use AWS RDS for MySQL hosting. Cheap and easy to set up, but the failover process (DNS-based) is very shaky.
We cannot rely too much on the hardware
14. What we expect from our NoSQL DBMS and our
compromise
No downtime:
High availability
No migration cost
Easy to deploy, redeploy, replicate, reconfigure
Quietly losing a few seconds of writes is preferable to:
weekly, minutes-long maintenance periods
minutes-long unscheduled downtime and manual failover in case of hardware failure
15. High Availability,
Day to Day operations
«At the bottom of the cellar, they say there is no such thing as a foolish trade» Le Poinçonneur des Lilas
Development environment
Operations cycle
Fit for the DevOps
17. Dev’ Data Locality
In MongoDB, a collection will typically replace 2 or 3 SQL tables
This physical proximity (locality) enables faster, simpler and more complete data retrieval from the application's point of view. Fewer requests, more data.
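To make the locality point concrete, here is a minimal Python sketch. The photo, tag and comment fields are hypothetical and are not Fotopedia's actual schema; plain lists and dicts stand in for SQL tables and for a MongoDB collection. It only contrasts rebuilding one object from three tables with reading one embedded document.

```python
# Hypothetical relational layout: three tables, joined on photo_id.
photos = [{"id": 1, "title": "Eiffel Tower"}]
tags = [{"photo_id": 1, "tag": "paris"}, {"photo_id": 1, "tag": "tower"}]
comments = [{"photo_id": 1, "author": "kali", "text": "Nice shot"}]


def load_photo_relational(photo_id):
    """Rebuild one photo from three tables: three lookups."""
    photo = next(p for p in photos if p["id"] == photo_id)
    return {
        "_id": photo["id"],
        "title": photo["title"],
        "tags": [t["tag"] for t in tags if t["photo_id"] == photo_id],
        "comments": [
            {"author": c["author"], "text": c["text"]}
            for c in comments
            if c["photo_id"] == photo_id
        ],
    }


# Document layout: the same data stored as one document in one collection.
photo_collection = {
    1: {
        "_id": 1,
        "title": "Eiffel Tower",
        "tags": ["paris", "tower"],
        "comments": [{"author": "kali", "text": "Nice shot"}],
    }
}


def load_photo_document(photo_id):
    """One lookup returns the photo together with its tags and comments."""
    return photo_collection[photo_id]
```

Both functions return the same structure; the document version gets there with a single read.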
18. Dev’ Data Migration: ALTER
ALTER TABLE is nightmarish
It leads to various strategies that abuse the model:
reuse of fields
flag fields (binary encoded), blob fields (JSON/XML encoded), ...
MongoDB solution: free-form, extensible data storage.
19. Defensive strategy
Application code aware of possible inconsistencies:
gracefully failing view layer
self-healing data access layer
routine data checking and fixing batch
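A self-healing data access layer could look roughly like the sketch below. It is an assumption about the general shape of such a layer, not Fotopedia's code: the `DEFAULTS` table, the field names and the dict used as a store are all hypothetical.

```python
import copy

# Hypothetical defaults for fields that were added to the model over time.
DEFAULTS = {"tags": [], "license": "unknown", "views": 0}


def heal(document):
    """Return (healed_document, changed); missing fields get their default."""
    healed = copy.deepcopy(document)
    changed = False
    for field, default in DEFAULTS.items():
        if field not in healed:
            healed[field] = copy.deepcopy(default)
            changed = True
    return healed, changed


def read_photo(store, photo_id):
    """Read a document, repair it if needed and write the repaired version back."""
    healed, changed = heal(store[photo_id])
    if changed:
        store[photo_id] = healed  # self-healing: persist the fix on read
    return healed
```

Old documents are upgraded lazily, the first time they are read, so no stop-the-world migration is needed; the routine checking batch can reuse `heal` on documents that are never read.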
20. Dev’ Data Migration: INDICES
Index creation takes a table-wide lock in MySQL. It renders part of the cluster unavailable
MongoDB solution: background index creation, which slows access a tiny bit but does not lock!
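As a sketch of what requesting a background build looks like from application code: the helper below assumes `collection` is a pymongo `Collection` (or any object exposing the same `create_index` method) and passes the `background` option. The helper name is hypothetical, and newer MongoDB server versions may ignore this option.

```python
def ensure_background_index(collection, field):
    """Ask for an ascending index on `field`, built in the background.

    `collection` is assumed to be a pymongo Collection or a compatible object.
    Returns whatever `create_index` returns (the index name with pymongo).
    """
    return collection.create_index([(field, 1)], background=True)
```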
21. Dev’ Backup/Restore
MongoDB's ability to dump a database or a single collection empowers developers
Part of the production dataset can simply be restored on a development box
Back up MongoDB collection by collection to S3, recover on the dev platform in a matter of minutes
22. Ops Cycle
MongoDB, small is beautiful
Cornerstone: the Replica Set
High availability
Backup and data import/export
Hardware migration
23. Ops, MongoDB, small and beautiful
Young software, relatively compact (around 150,000 lines of C++ code)
Builds out of the box on modern distributions
Distribution packages built by 10gen
Drivers for the most popular languages are also provided and maintained by 10gen staff (although quality varies)
24. Ops, Replica Set
A set of machines sharing the same data
Only one Primary, several Secondaries
All writes go to the Primary and are then replicated to the Secondaries.
Reads can be routed to the Primary or to a Secondary, at the application's choice
Combined with AWS, Replica Sets are very powerful. MongoDB loves the Cloud!
25. Ops, Master/Slave reloaded
Client libraries are replica set aware
Connect to any node(s): the configuration and current layout are discovered
Database semantics are preserved
Incredibly easy to set up
Priority between nodes can be changed dynamically
It is possible to prevent a node from ever becoming master (slow-disk server used as a «hot backup»)
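The «hot backup» node can be expressed in the replica set configuration document by giving it priority 0. The sketch below builds such a document as a Python dict; the set name and host names are hypothetical, and the helper functions are illustrations rather than part of any driver.

```python
def replica_set_config(name, hosts, hot_backup=None):
    """Build a replica set configuration document.

    `hot_backup`, if given, is added with priority 0: it replicates the data
    but can never be elected Primary.
    """
    members = [
        {"_id": i, "host": host, "priority": 1}
        for i, host in enumerate(hosts)
    ]
    if hot_backup is not None:
        members.append({"_id": len(members), "host": hot_backup, "priority": 0})
    return {"_id": name, "members": members}


def electable(config):
    """Hosts that may become Primary (priority greater than 0)."""
    return [m["host"] for m in config["members"] if m["priority"] > 0]
```

For example, `replica_set_config("rs0", ["db1:27017", "db2:27017"], hot_backup="slow1:27017")` yields three members, of which only `db1` and `db2` are electable.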
26. Ops, High Availability Strengths
A Primary step-down can be triggered. It leads to the election of a new Primary.
A new Primary is picked when the Primary becomes unreachable
Clients will transparently connect to the new Primary
The MongoDB Arbiter ensures that split brain will not happen
Config servers hold the sharding information: 1 or 3 config servers, with an internal failover mechanism
27. Ops, High Availability compromise
Switch over will take 20 to 25 seconds
Some queries in the interval may crash
Some writes may reach a split primary
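One way for application code to live with this window is to retry operations that fail while no Primary is reachable. The sketch below is an assumption, not the talk's implementation: `TransientDatabaseError` is a hypothetical stand-in for whatever connection or failover exception the driver raises, and the timing defaults are chosen only to match the figures above.

```python
import time


class TransientDatabaseError(Exception):
    """Hypothetical stand-in for the driver's connection/failover errors."""


def with_retry(operation, attempts=6, delay=5.0, sleep=time.sleep):
    """Call `operation`, retrying on transient errors.

    With the defaults, 6 attempts spaced 5 seconds apart wait up to
    25 seconds in total, roughly the switch-over window quoted above.
    """
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except TransientDatabaseError:
            if attempt == attempts:
                raise  # give up: the failover took longer than expected
            sleep(delay)
```

Retrying is only safe for reads and for idempotent writes; a write that reached a split Primary may still be lost, which is the compromise accepted here.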
28. Ops, Backup and exports
Stop the secondary and do whatever you need done.
Easy to backup a single collection or a whole database
As a matter of fact, we just dumbly «mongodump»
every collection of interest separately.
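Dumping each collection separately can be scripted in a few lines. The sketch below only builds the `mongodump` command lines, using its `--db`, `--collection` and `--out` options; the database name, collection names and output directory in the usage example are hypothetical.

```python
def mongodump_commands(database, collections, out_dir):
    """Return one mongodump argument list per collection."""
    return [
        ["mongodump", "--db", database, "--collection", name, "--out", out_dir]
        for name in collections
    ]
```

Each list can be handed to `subprocess.run(cmd, check=True)`, for example for `mongodump_commands("fotopedia", ["photos", "users"], "/backup")`.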
29. Ops, Hardware migration
Optionally, «preload» the new node from a filesystem or block-level snapshot
Add the brand new node to the replica set
Wait for synchronization
Change the replica set rules so that your new server becomes Primary
Remove the old hardware
30. Fit for the DevOps
In the modern sense of DevOps, MongoDB provides the agility and ease of use required
It provides working tools for developers
And it is much more comfortable than MySQL in daily use
Truly a DevOps-friendly tool.
31. Scalability
«Hang on to the paintbrush, I'm taking away the shell.» Overheard @fotopedia
Cloud Limitations
Sharding and Replica Set
Performance
Reading
Writing
Storage
Scalable from the ground up
32. Cloud Limitations
Virtual Hardware
Neighbors can eat all your I/O
No precise control over, or overview of, this situation
The largest VM cannot compare to the largest Metal
A hardware issue means zero notice before instance retirement (Metal has the same issue, though). Need to be flexible
Scaling-out is the way to scale on the Cloud
33. Sharding
Use a business key to partition your data
Each shard is typically a replica set
Access is provided via the MongoS servers
Configuration is stored and managed in the Config
servers
34. Reading
Without Sharding
Reading is performed on the master by default, to preserve read-your-own-writes. It can be programmatically allowed on a slave.
To scale up reads, add Replica Set nodes
With Sharding
Reading is performed in parallel across data nodes
To scale up reads: ensure most queries will reach
only one single shard
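The effect of partitioning on a business key, and why queries should carry that key, can be sketched as follows. This illustrates range-based routing in general and is not MongoDB's actual implementation; the split points, the shard names and the `user` shard key are hypothetical.

```python
import bisect

# Hypothetical range partitioning on a business key (here: a user name).
# Shard i holds the keys below SPLIT_POINTS[i]; the last shard holds the rest.
SPLIT_POINTS = ["h", "p"]
SHARDS = ["shard-a", "shard-b", "shard-c"]


def shard_for_key(key):
    """Return the shard that owns `key`."""
    return SHARDS[bisect.bisect_right(SPLIT_POINTS, key)]


def shards_for_query(query, shard_key="user"):
    """A query on the shard key is targeted; any other query is broadcast."""
    if shard_key in query:
        return [shard_for_key(query[shard_key])]
    return list(SHARDS)
```

A query such as `{"user": "kali"}` touches a single shard, while `{"tag": "paris"}` has to be sent to all three, which is why reads scale best when most queries include the shard key.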
35. Writing
Writes are always performed on the Primary node, so adding Replica Set nodes does not help.
Sharding distributes the writes across the cluster
36. Storage
Replica Sets and shards can be moved onto as many servers as needed.
To get more space:
scale up by migrating your Replica Set to bigger hardware
scale out by sharding the existing collection
37. Scalable from the ground up
MongoDB is scalable as soon as you need it to be.
No complex configuration for replication
Handles replica sets and shards beautifully out of the box
MongoS / Config Server / Shards allow more complex setups
Cloud-friendly