MongoDB is a NoSQL database that is well suited to several common use cases. It offers a flexible document data model that can store semi-structured data such as lists and nested objects, and it is designed for high throughput, large data sizes, and low latency. It scales horizontally: adding servers increases both read and write capacity. Popular use cases include high-volume data feeds, operational intelligence, product data management, content management, and user data management, where the application needs to store large volumes of data, sustain high read/write rates, and benefit from a flexible schema. MongoDB may be a good fit if an application has variable object data, requires low-latency access and high throughput, stores large numbers of objects, or is deployed on cloud infrastructure.
2. Emerging NoSQL Space
[Diagram: three eras of data architecture. The beginning: the RDBMS alone. Last 10 years: RDBMS plus data warehouse. Today: RDBMS, data warehouse, and NoSQL.]
3. Qualities of NoSQL Workloads
• Flexible data models: lists, nested objects; sparse schemas; semi-structured data; agile development
• High throughput: lots of reads; lots of writes
• Large data sizes: aggregate data size; number of objects
• Low latency: both reads and writes; millisecond latency
• Cloud computing: run anywhere; no assumptions about hardware; no / few knobs
• Commodity hardware: Ethernet; local disks
4. MongoDB was designed for this
• Flexible data models (lists, nested objects; sparse schemas; semi-structured data; agile development): JSON-based object model; dynamic schemas
• High throughput (lots of reads; lots of writes): replica sets to scale reads; sharding to scale writes
• Large data sizes (aggregate data size; number of objects): 1000's of shards in a single DB; partitioning of data
• Low latency (both reads and writes; millisecond latency): in-memory cache; scale-out working set
• Cloud computing (run anywhere; no assumptions about hardware; no / few knobs): scale-out to overcome hardware limitations
• Commodity hardware (Ethernet; local disks): designed for "typical" OS and local file system
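As a minimal mongo shell sketch of what "dynamic schemas" means in practice (collection and field names are illustrative, not from the deck), two differently shaped documents can live in the same collection with no migration step:

// Two documents with different fields coexist in one collection
db.catalog.insert({ sku: "B-1001", type: "Book", title: "Ancient Egypt" })
db.catalog.insert({ sku: "M-2002", type: "MP3", artist: "Some Band", bitrate: 320 })

// Queries can reference fields that only some documents carry
db.catalog.find({ artist: "Some Band" })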
7. High Volume Data Feeds
• Machine generated data: more machines, more sensors, more data; variably structured
• Stock market data: high frequency trading
• Social media firehose: multiple sources of data; each changes their format constantly
8. High Volume Data Feed
[Architecture diagram: many data sources writing into a sharded MongoDB cluster.]
• Flexible document model can adapt to changes in sensor format
• Asynchronous writes
• Write to memory with periodic disk flush
• Scale writes over multiple shards
9. Operational Intelligence
• Ad targeting: large volume of state about users; very strict latency requirements
• Real time dashboards: expose report data to millions of customers; report on large volumes of data; reports that update in real time
• Social media monitoring: what are people talking about?
10. Operational Intelligence
[Architecture diagram: a MongoDB cluster feeding an API and dashboards.]
• Parallelize queries across replicas and shards
• Low latency reads
• In-database aggregation
• Flexible schema adapts to changing input data
• Can use the same cluster to collect, store, and report on data
11. Behavioral Profiles
Rich profiles collecting multiple complex actions (1: see ad, 2: see ad, 3: click, 4: convert):

{ cookie_id: "1234512413243",
  advertiser: {
    apple: {
      actions: [
        { impression: 'ad1', time: 123 },
        { impression: 'ad2', time: 232 },
        { click: 'ad2', time: 235 },
        { add_to_cart: 'laptop',
          sku: 'asdf23f',
          time: 254 },
        { purchase: 'laptop', time: 354 }
      ]
    }
  }
}

• Scale out to support high throughput of activities tracked
• Dynamic schemas make it easy to track vendor specific attributes
• Indexing and querying to support matching, frequency capping
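A sketch of how such a profile might be maintained in the mongo shell, following the document shape on the slide (the appended action and the index are illustrative):

// Atomically append a newly observed action to the embedded array
db.profiles.update(
  { cookie_id: "1234512413243" },
  { $push: { "advertiser.apple.actions": { click: "ad1", time: 360 } } }
)

// Index the cookie id so each ad request is a single fast lookup
db.profiles.ensureIndex({ cookie_id: 1 })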
12. Product Data Management
• E-commerce product catalog: diverse product portfolio; complex querying and filtering
• Flash sales: scale for short bursts of high-volume traffic; scalable but consistent view of inventory
13. Meta data
Indexing and rich query API for easy searching and sorting:

db.archives.find({ "country": "Egypt" });

Flexible data model for similar, but different objects:

{ type: "Artefact",
  medium: "Ceramic",
  country: "Egypt",
  year: "3000 BC"
}
{ ISBN: "00e8da9b",
  type: "Book",
  country: "Egypt",
  title: "Ancient Egypt"
}
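To keep the slide's find() from scanning the whole collection, an index on the shared field would typically back it; a one-line sketch using the era-appropriate ensureIndex helper:

// Supports db.archives.find({ "country": "Egypt" }) with an index scan
db.archives.ensureIndex({ country: 1 })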
14. Content Management
• News site: comments and user generated content; personalization of content and layout
• Multi-device rendering: generate layout on the fly for each device that connects; no need to cache static pages
• Sharing: store large objects; simple modeling of metadata
15. Content Management
Flexible data model for similar, but different objects; geospatial indexing for location based searches; GridFS for large object storage; horizontal scalability for large data sets:

{ camera: "Nikon d4",
  location: [ -122.418333, 37.775 ]
}
{ camera: "Canon 5d mkII",
  people: [ "Jim", "Carol" ],
  taken_on: ISODate("2012-03-07T18:32:35.002Z")
}
{ origin: "facebook.com/photos/xwdf23fsdf",
  license: "Creative Commons CC0",
  size: {
    dimensions: [ 124, 52 ],
    units: "pixels"
  }
}
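A sketch of the geospatial piece using the "2d" index and $near operator from MongoDB of this era, over the location field shown on the slide (coordinates are longitude first, as in the example documents):

// Geospatial index over [ longitude, latitude ] pairs
db.photos.ensureIndex({ location: "2d" })

// Photos taken near a point in San Francisco
db.photos.find({ location: { $near: [ -122.418333, 37.775 ] } })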
16. User Data Management
• Video games: user state and session management
• Social graphs: scale out to large graphs; easy to search and process
• Identity management: authentication, authorization, and accounting
18. Good fits for MongoDB
Application characteristic, and why MongoDB might be a good fit:
• Variable data in objects: dynamic schema and the JSON data model enable flexible data storage without sparse tables or complex joins.
• Low latency access: the memory-mapped storage engine caches documents in RAM, enabling in-memory performance; data locality of documents can significantly improve latency over join-based approaches.
• High write or read throughput: sharding plus replication lets you scale read and write traffic across multiple servers.
• Large number of objects to store: sharding lets you split objects across multiple servers.
• Cloud based deployment: sharding and replication let you work around hardware limitations in clouds.
In the beginning, there was the RDBMS, and if you needed to store data, that was what you used; there really wasn't much other choice short of building a custom database for your problem. But the RDBMS is performance critical, and BI workloads tended to consume its resources with long-running queries that scanned every value in a table. So we carved off the data warehouse as a place to keep a copy of the operational data for analytical queries, which offloaded work from the RDBMS and bought us room to scale.

Today we are seeing another split. A new set of workloads does not fit well into an RDBMS, sometimes because of data modeling constraints (loosely structured data is difficult to model in a relational database) and sometimes because of workload requirements around working set, required performance, and the size and scale of the data. These are being carved off into yet another tier of the data architecture: the NoSQL store. We don't think MongoDB will completely replace the database architectures we see today, but just as OLTP and OLAP databases split in the past, we think most companies will end up with three different datastores in the enterprise datacenter. We'll dig into the workloads that are driving people to NoSQL datastores.
These are some of the qualities of workloads that necessitate a move to NoSQL. Each of these qualities is difficult to achieve in an RDBMS but is well addressed by NoSQL data stores.

One aspect is the data model. As our code frameworks get more sophisticated, the entities we store get more complex, and the number of fields and values we keep per object keeps increasing. Sometimes that data is well structured, but we are seeing more semi-structured and unstructured data, where every object might have a different set of fields, and these are difficult to model in a relational database. If you have adopted an agile development methodology and are constantly pushing out new releases, a relational database is painful because every release means a schema change; that may work fine in development, but in production adding a column can be a very slow operation on a table with a terabyte or more of data.

Another aspect is throughput. Traditionally we scaled an RDBMS by adding replicas, which gives plenty of read capacity and works well when a site has relatively few updates but high read volume. If you need high write scalability, though, an RDBMS will not cut it: an application consuming stock tick data must write a very high volume of data, and traditional replication does not add write throughput. You need a datastore that can scale write capacity beyond what a single server can handle. The old approach of scaling up a single server to a huge number of cores and a huge amount of RAM is extremely expensive and still hits a limit. You could invest a lot of time building your own sharding or data distribution architecture (as Google did), but MongoDB does this for you out of the box.

Data size is a third aspect. "Big data" now means storing and processing terabytes, petabytes, and even exabytes, which would be very hard in the relational world, and the total number of objects is going up too: a social media application with hundreds of millions of users constantly updating their status means billions of objects living in the database. Hadoop can help with offline batch processing, but increasingly people want that data available online as well; these are not static files to process later but objects to modify and query in real time.

Latency is incredibly important for delivering a great user experience. Maintaining low latency over large datasets becomes difficult in a relational model with lots of joins, so rethinking the problem and adopting a new model usually becomes necessary.

Finally, cloud deployments are becoming the de facto architecture for many applications, whether a startup building a new application without investing in hardware, or an enterprise application group avoiding the long lead times of internal IT procurement. Cloud computing architectures do not give you the same ability to scale a relational database as running in your own datacenter.
Scaling an Oracle database might involve buying a 128-core server and upgrading your SAN infrastructure with InfiniBand or Fibre Channel interconnects. In a cloud hosting environment, scaling up to a VM with 128 cores and a terabyte of RAM will be difficult, if not impossible, and very expensive. Instead you need an infrastructure that scales out horizontally, combining the compute capacity of many virtual machines. And since you are scaling out sideways and can have many servers in a NoSQL environment, you want fewer configuration parameters and fewer knobs to tweak, to keep configuration complexity down. The move to commodity hardware also drives significant cost savings, for example by replacing expensive and complicated SAN infrastructure with local disks.
MongoDB was designed with these things in mind. JSON lets us add new objects with different types of data to the system without a schema migration; you store things that look more like objects than rows in a table. MongoDB lets you scale out to handle not just increased read load but also increased write load, and since you are scaling out, you can always add more shards to handle a constantly growing amount of data. Many older databases are architected around a model where everything lives on disk, with a second caching layer on top to make them fast; MongoDB was designed to be a very fast database that makes heavy use of the RAM in the system and only goes to disk when it really needs to. MongoDB was built around a scale-out architecture from the beginning: we assume we are running on commodity hardware with slower disk drives, and MongoDB performs very well in that environment.
There are five major areas where people are doing interesting things with MongoDB.

Content management exploits MongoDB's flexible schema. A web application serving audio, video, images, text, and user-generated social media content can store all of those objects in MongoDB without requiring every object to have the same fields. If you want to add a Twitter stream, you can just add a collection of tweets to your database without going back to modify a schema.

Operational intelligence is a variation on business intelligence. Most BI workloads still live in data warehouse platforms, but a number of applications today are characterized by real-time access to data, for example social media sentiment analysis that consumes social media data in real time to figure out what people are saying about your brand. Real-time analytics is usually not possible in a traditional BI environment, where an ETL process bulk-loads data into the warehouse and a reporting or dashboard infrastructure analyzes it after the fact. If you are building something customer-facing, exposing dashboards not to a handful of analysts but to thousands of customers, or better yet via an API back into your operational systems, you need something like MongoDB.

Product data management typically means storing things like product catalogs. If you are building a site like Amazon, selling books, shoes, movies, and televisions, it is difficult to come up with one standard schema for all of these products, so you would usually end up with a different table per product type. We also see the emergence of flash-sale sites with very spiky traffic around each sale. Replication-scaling for millions of reads is not enough: with limited inventory in a flash sale, you need a strictly consistent view of inventory so you do not oversell the product.

User data management comes in many forms. Traditionally this is a directory handling user names, passwords, and so on, but apps like Foursquare, where users check in at locations, go beyond static user information to status updates and constantly changing activity data. It is also big in online gaming: imagine a Facebook game where millions of people are building a virtual world; if the game takes a second or two to save every change, that is a terrible user experience. You need to scale out to handle a large aggregate user base and a very large working set with low-latency access.

High-volume data feeds come up online with click tracking and analytics, and offline with financial applications, such as high-frequency trading systems keeping up with low-latency quote feeds from the stock market, or social media applications ingesting the Twitter firehose. This is all about write scaling and coping with increased write load.
Let’s look at some examples of applications and architectures that people are actually using in these areas.
Machine generated data: for example, hundreds or thousands of web servers generating log data, or an energy company with thousands of temperature or electricity-flow sensors in its customers' houses generating a constant stream of readings. In many cases it is not enough to just write this data to a file, because you want real-time access to it for analysis as it is being generated.

Stock market data: a financial house ingesting L1 or L2 quote feeds faces a massive volume of data. If you want to do complex event processing, such as keeping the last hour of data queryable, then without a NoSQL database like MongoDB you are probably building a custom system so that your high-frequency trading applications can access that amount of data quickly.

Social media firehose: if you connect to the Twitter firehose and want to store and use every tweet generated across Twitter, that alone is a massive write load. Once you extend the infrastructure to handle Facebook updates, Google+ updates, and whatever social media comes along next, you need to scale out the write load even further. Each source has a different data format and different metadata, and you want to query on fields that do not exist in every format to make use of each source's specialized nature.
We architect a system that uses MongoDB's scale-out architecture. Sharding lets you aggregate the disk performance of all of those servers, and aggregate their RAM to stage writes and buffer the incoming flow of documents. MongoDB's storage engine works off a periodic flush: data is written to RAM and periodically written out to disk, which lets the OS schedule the I/O much more efficiently and within the capabilities of the disk drives. Disks tend to be bad at random I/O, with high latency to seek to a particular place and do a single write; by grouping writes together, MongoDB seeks to a location and writes out a few hundred megabytes at once. Even on cheap drives, the bandwidth for writing a few hundred megabytes sequentially is good; it is the seek time that kills performance, so MongoDB avoids it as much as possible.

MongoDB also supports asynchronous writes. In a traditional database you often wait for confirmation that a write succeeded before continuing in your application. You can get that synchronous behavior in MongoDB with acknowledged writes, but you can also choose not to wait for acknowledgement. Yes, you could lose some data, but in a high-volume data feed, losing an occasional entry is often acceptable: if you are collecting data from thousands of web servers and miss one line of an access log, it may not affect your application in a statistically significant way. This is a good alternative depending on the value of your data and whether you absolutely need to guarantee every write, which can be much more expensive in a high-volume feed. A lot of people are building out this kind of architecture.
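As a sketch of that trade-off in the mongo shell: the collection and field names below are illustrative, and in the era of this deck the same choice was expressed through the drivers' "safe mode" and the getLastError command; later shells accept a per-operation write concern document.

// Fire-and-forget insert: do not wait for acknowledgement (fastest; losing
// an occasional entry in a high-volume feed is often acceptable)
db.feed.insert({ sensor: "temp-042", value: 21.7, ts: new Date() },
               { writeConcern: { w: 0 } })

// Acknowledged insert: wait until the primary has accepted the write
db.feed.insert({ sensor: "temp-042", value: 21.8, ts: new Date() },
               { writeConcern: { w: 1 } })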
High-volume data feeds tend to go hand in hand with operational intelligence workloads. If you do not need the data in real time, it may be cheaper to write it out to flat files for later processing; but if you want to use that high volume of data in real time, MongoDB can be excellent.

For example, if you are building an ad network with an ad targeting system, you keep behavior profiles for hundreds of millions of users, updating a record every time a user visits a page. You also face very tight latency: on an ad exchange you have about 100ms to receive a request for an ad, figure out who the user is and how much they are worth to you, and submit a bid. You need a database that handles this volume and responds in a few milliseconds.

If you are building an analytics application exposing a dashboard to your customers, you have many clients running range queries over a huge amount of data. Traditional BI systems are architected around a handful of people viewing dashboards; modern web apps expose them to lots of users in real time, who do not want a daily report mailed to ten people, they want to watch the charts change live.

In social media monitoring you are trying to understand in real time what people are saying about your brand, sifting through a very large dataset with relatively complex queries.
In operational intelligence the front end is usually dashboards or APIs for viewing and monitoring the data in real time.

Using replication we can scale out more read copies of the data, allowing more of these complex analytical queries to run in real time. Because MongoDB keeps your working set in RAM, it tends to answer those queries very quickly; as a browser polls for updates, the data it needs is already in memory.

MongoDB has a rich query language for this, including the aggregation framework and a built-in MapReduce framework for answering queries or generating new collections. Because a MapReduce job can take the output of a query as input, it does not need a full table scan for every job, which makes it suitable for real-time aggregation. The aggregation framework released in 2.2 gives the kinds of capabilities we are used to in SQL engines, such as GROUP BY, SUM, MIN, MAX, and HAVING; it is coded in C++ and runs very fast.

If your dataset changes frequently, managing schema migrations in a column-oriented database, or in a petabyte data warehouse with a star schema, can be a nightmare. Here we can take advantage of MongoDB's flexible schema as well.
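A small aggregation framework sketch of the kind of rollup a dashboard might poll; the collection and field names are illustrative, not from the deck:

// Count page views per page over the events collection and return the top ten
db.events.aggregate([
  { $match: { type: "pageview" } },
  { $group: { _id: "$page", views: { $sum: 1 } } },
  { $sort:  { views: -1 } },
  { $limit: 10 }
])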
We talked about behavioral profiles and ad targeting. Here we keep a complex model of what a user is doing: we track that a user saw an impression of a specific ad; three days later they see a different ad from the same advertiser, and we track another impression, and this time they click. Maybe they go to the site and fill a shopping cart but leave before checking out; many advertisers want to know what was in the cart so they can show another ad later and incentivize the user to come back and actually purchase those items.

You are storing a lot of rich data about each user so you can use it in real time: look up the user's cookie, see which ads they have responded well to in the past, which ads they have ignored, or what was in their shopping cart last time, and decide how to bid. MongoDB makes it easy to store such a complex data model, and since all of these fields are stored contiguously on disk, there are no expensive joins to pull the data together for a specific user, so it performs very well. MongoDB also lets you build indexes on all of this data to make your queries even faster.
We see this in product catalogs storing a diverse portfolio of product data. Some attributes, such as SKU and price, might be shared across all products, but a book has an author while an MP3 has an artist or band name. You can issue a query like "find all the entries where the author field is Walter Isaacson", and even if only 5% of your database entries have an author field, MongoDB copes with this just fine. That way, when the business wants to start selling a new product, you do not need to modify the database to hold new product information, like shoe sizes, that was never built into the schema. Flash-sale sites take advantage of what we saw in the previous two examples, coping with very high load.
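A sketch of that query against a mixed catalog, with an optional sparse index so only documents that actually carry the field appear in the index (collection and field names are illustrative):

// Matches only the documents that have an 'author' field with this value
db.products.find({ author: "Walter Isaacson" })

// A sparse index keeps entries only for documents that contain 'author'
db.products.ensureIndex({ author: 1 }, { sparse: true })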
Moving to the broader aspect of content management: most news organizations are moving online, and it is not enough just to publish the article; they also want user-generated data like comments, and they want to personalize the content and layout for what specific users want to see, based on browsing history, to make the site more useful. You also want content rendered specially for computers, iPads, iPhones, Android devices, and set-top boxes so it looks good everywhere, generating that layout as each request comes in. And you may want to upload large objects like photos or videos and share them with other people, which often involves complex metadata modeling where different data types have different attributes.
As with the product catalog, people also build content management systems on MongoDB. GridFS is functionality built on top of MongoDB that breaks a large file into chunks stored as ordinary documents across the cluster (working within the 16MB document size limit); when somebody fetches the video or picture, it is reconstructed with what is essentially a parallel query that reassembles the parts into the original file. This makes it simple to store large objects in MongoDB along with their metadata, with no separate filesystem for images or videos.

You might also let people store arbitrary tags in file metadata, for example tagging who is in a photo, so the image metadata is both dynamic and user-generated. As you add more features to the site, you add more metadata fields into your schema, and this works really well in MongoDB.
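As a sketch, GridFS can be exercised without writing driver code using the mongofiles tool that ships with MongoDB (the database name here is illustrative):

# Store a large video in GridFS, split into chunks, under the 'cms' database
mongofiles -d cms put promo-video.mp4

# Stream it back out, reassembled from its chunks
mongofiles -d cms get promo-video.mp4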
We see user data management in video games, especially online games that maintain user state for things like a social game: a farm you plant and expect to look the same when you come back a week later, or ten simultaneous Scrabble games against different people whose evolving state must be tracked. People commonly store the entire state of a game in a single document, and because the environment is sharded, we can ensure there is enough RAM to hold the working set for millions of concurrent players.

When Facebook started, they had to build their own custom database for storing the social graphs of millions of people constantly adding and removing connections or friends. That works for them, with a large engineering team maintaining the custom technology; with MongoDB, sharding is built in, so you no longer need to maintain that kind of technology yourself. Social graphs tend to be modeled as nodes and edges, but in MongoDB you can simply store one document per user containing a list of all their friends, which makes it easy to query a user object and get their friends in one read because the data is co-located on disk. You can also index array fields in MongoDB, so you can easily run a query to find all the people who have me in their friends list.

In a telco environment you may manage authentication and authorization information for millions of users, with a large database of every cell phone issued: every time a phone attaches to the network you look up its dial plan, which network it belongs to, and so on. Or for a web or enterprise application you might be building a single sign-on facility for millions of users.
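A sketch of the friends-list pattern with a multikey index (collection and names are illustrative):

// One document per user, friends embedded as an array
db.users.insert({ _id: "alice", friends: [ "bob", "carol" ] })

// Indexing an array field creates a multikey index: one entry per element
db.users.ensureIndex({ friends: 1 })

// Reverse lookup: who has alice in their friends list?
db.users.find({ friends: "alice" })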
If you are looking at an application and trying to figure out whether it is a good fit for MongoDB, here are some characteristics where we see MongoDB really excelling.

If you have lots of similar but slightly different objects, with variation in what data you keep about each one, MongoDB is a great fit because you will never be in a situation where you are doing a schema migration.

Many databases consider a slow query to be anything that takes longer than a second; in MongoDB we consider a slow query to be anything that takes longer than 100ms, because MongoDB was designed for very low-latency access to data.

For high write throughput, you can scale to tens or hundreds of thousands, or even millions, of writes per second, because sharding gives near-linear write scaling.

If you are looking to deploy in the cloud, MongoDB is a great fit because you can scale out horizontally, combining small cloud VM instances into a very large database instance.
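The scale-out points above come down to sharding; a minimal sketch of enabling it from the mongo shell (the database, collection, and shard key are illustrative):

// Enable sharding for a database, then distribute one collection by a shard key
sh.enableSharding("appdata")
sh.shardCollection("appdata.events", { user_id: 1 })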
When should you use an RDBMS and when should you use MongoDB? Use an RDBMS when you absolutely must have joins that cannot easily be replaced by a de-normalized, nested data model, or when you absolutely need multi-statement transactions.

Are there standard CMSes available backed by MongoDB? MongoPress is essentially a WordPress clone built on MongoDB, and there is work on letting Drupal be backed by MongoDB, but usually people build a custom CMS on top of MongoDB.