From Oracle to MongoDB, a real use case at Telefónica R&D
The talk will cover the use case of the Personalisation Server (PS), a master customer-profile store for the companies of the Telefónica Group (Telefónica, O2…). It provides real-time (ReST API) and batch interfaces to update, retrieve and share customer profiles. Initially the PS used Oracle, but due to scalability and cost issues we implemented a new version with MongoDB.
In the talk we will see the problems that made us move to MongoDB and all the benefits that we obtained (with real performance figures, of course).
Right now the Oracle version is being used in the UK and Ireland (approx. 30M user profiles stored), and the NoSQL version is being deployed in Mexico (18M customers) and other Latam countries.
2. Content
01 Introduction
• Telefónica PDI. Who?
• Personalisation Server. Why? What?
02 The SQL version
• Data model and architecture
• Integrations, problems and improvements
03 The NoSQL version
• Data model and architecture
• Performance boost
• The bad
04 Conclusions
• Conclusions
• Personal thoughts
4. 01 Telefónica PDI. Who?
• Telefónica
§ Fifth largest telecommunications company in the world
§ Operations in Europe (7 countries), the United States and Latin America (15 countries)
• Telefónica Digital
§ Web and mobile digital contents and services division
• Product Development and Innovation unit
§ Formerly Telefónica R&D
§ Product & service development, platforms development, research, technology strategy, user experience and deployment & operation
§ Around 70 different ongoing projects at all times
6. 01 Opt-in and profile module. Why?
• User data (profile and permissions) was scattered across different storages:
§ IPTV service: gender; film and music preferences
§ Mobile service: permission to contact by SMS?; gender
§ Music tickets service: address; music preferences
§ Location-based offers: address; permission to contact by SMS?
(Customer: "So you want to know my address… AGAIN?!")
8. 01 Opt-in and profile module. Why?
• Provide a module to become the master storage for customer data:
§ Gender
§ Film and music preferences
§ Permission to contact by SMS?
§ Address
• Shared by the IPTV service, the mobile service, the music tickets service and the location-based offers
9. 01
Opt-in and profile module. What?
• Features:
§ Flexible profile definition, classified in services
§ Profile sharing options between different services
§ Real time API
§ Supplementary offline batch interface
§ Authorization system
§ High availability
§ Inexpensive solution & hardware
11. 02
Data model
Services, users and their profile
• Services defined a set of attributes (their profile), each with a default value and a data type
• Users were registered in services
• Users defined values for some of the service's attributes
• Each attribute value carried an update date, to avoid overwriting newer changes through batch loads (see the sketch below)
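A hedged sketch of that update-date guard (illustrative Python only; the real PS logic and field names are not shown in the deck):

    from datetime import datetime

    # Each stored attribute value keeps the timestamp of its last change.
    stored = {"gender": {"value": "F", "updated": datetime(2013, 5, 1)}}

    def merge_value(profile, attr, value, updated):
        """Apply an incoming (batch) value only if it is newer."""
        current = profile.get(attr)
        if current is None or updated > current["updated"]:
            profile[attr] = {"value": value, "updated": updated}

    # An older batch record must not clobber a newer online update:
    merge_value(stored, "gender", "M", datetime(2013, 4, 1))
    print(stored["gender"]["value"])   # still 'F'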
12. 02
Data model
Services profile sharing matrix
• Services could access attributes declared by other services
• Sharing rights were either read-only or read and write
• The user had to be registered in both services (see the sketch below)
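A hedged sketch of how such a check might look (illustrative only; service names and the matrix layout are hypothetical):

    # (consumer service, owner service) -> set of granted rights
    sharing_matrix = {
        ("music_tickets", "iptv"): {"read"},
        ("location_offers", "mobile"): {"read", "write"},
    }

    def can_access(consumer, owner, mode, user_services):
        """A grant is needed AND the user must be registered in both."""
        granted = sharing_matrix.get((consumer, owner), set())
        return mode in granted and {consumer, owner} <= user_services

    print(can_access("music_tickets", "iptv", "read",
                     {"music_tickets", "iptv"}))    # True
    print(can_access("music_tickets", "iptv", "write",
                     {"music_tickets", "iptv"}))    # False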
13. 02
Data model
Authorization system
• Everything that could be accessed in the PS was a resource
• Roles defined access rights (read, or read and write) on resources
• Auth users had roles
• Roles could include other roles (see the sketch below)
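Because roles can include other roles, resolving a user's effective rights is a small graph traversal. A hedged, illustrative sketch (role and resource names are hypothetical):

    # role -> (direct rights per resource, included roles)
    roles = {
        "reader": ({"profiles": {"read"}}, []),
        "writer": ({"profiles": {"read", "write"}}, []),
        "admin":  ({"services": {"read", "write"}}, ["writer"]),
    }

    def effective_rights(role, seen=None):
        """Flatten a role and its inclusions into resource -> rights."""
        seen = set() if seen is None else seen
        if role in seen:          # guard against inclusion cycles
            return {}
        seen.add(role)
        rights, included = roles[role]
        merged = {res: set(r) for res, r in rights.items()}
        for sub in included:
            for res, r in effective_rights(sub, seen).items():
                merged.setdefault(res, set()).update(r)
        return merged

    # 'admin' ends up with its own rights plus everything from 'writer'.
    print(effective_rights("admin"))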
14. 02
Data model
Bonus features!
• Multiple IDs (see the sketch below):
§ A user's profile could be accessed with different equivalent IDs depending on the service
§ Each user ID was defined by an ID type (phone number, email, portal ID, hash…) and the ID value
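Illustratively (a hedged sketch; the actual schema is not shown in the deck), equivalent IDs can be modelled as (type, value) pairs hanging off a single profile:

    # Hypothetical shape of one user's equivalent identifiers
    user = {
        "profile": {"gender": "F"},
        "ids": [
            {"type": "phone_number", "value": "+353861234567"},
            {"type": "email", "value": "jane@example.com"},
            {"type": "portal_id", "value": "jane.doe"},
        ],
    }

    def matches(user, id_type, id_value):
        """Any equivalent ID resolves to the same underlying profile."""
        return any(i["type"] == id_type and i["value"] == id_value
                   for i in user["ids"])

    print(matches(user, "email", "jane@example.com"))   # True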
15. 02
High level logical architecture
§ Everything running on Red Hat EL 5.4 64 bits
17. 02
Integration
Planned integration
• The PS replaces all customer profile and permissions DBs
• All systems access this data through the PS real-time API
• In special cases, some PS consumers could use the batch interface
• In the same way, new services could be added quite easily
18. 02
Integration
Problems arise
• Budget restrictions: adapting all services to use the API was too expensive
• Keep the independent systems' DBs and synchronize the PS through batch
• Use the DBs' built-in massive extraction features to generate daily batch files
• However… in most cases those DBs were not able to generate Delta (changes-only) extractions
§ They provide full daily snapshots instead! (see the sketch below)
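When a source system can only hand over full daily snapshots, the Delta has to be derived downstream. A hedged sketch of the idea (our illustration, not the PS implementation):

    def compute_delta(yesterday, today):
        """Diff two full snapshots (user_id -> profile) into a Delta."""
        delta = {}
        for user_id, profile in today.items():
            old = yesterday.get(user_id)
            if old is None:
                delta[user_id] = profile        # new user
            else:
                changed = {k: v for k, v in profile.items()
                           if old.get(k) != v}
                if changed:
                    delta[user_id] = changed    # only modified attributes
        return delta

    old = {"u1": {"gender": "F"}, "u2": {"address": "Cork"}}
    new = {"u1": {"gender": "F"}, "u2": {"address": "Dublin"},
           "u3": {"gender": "M"}}
    print(compute_delta(old, new))
    # {'u2': {'address': 'Dublin'}, 'u3': {'gender': 'M'}}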
19. 02
First version performance
Ireland
• 1.8M customers, 180 profile attributes, 6 services
• Sizes
§ Tables + indexes size: 65Gb
§ 30% of the size were indexes
• Batch
§ Full DWH customer’s profile import: > 24 hours
§ Delta extractions: 4 - 6 hours
§ Loads and extractions performance proportional to data size
• API:
§ Response time with average traffic: 110ms
21. 03
Second version
High level logical architecture
• New approach: batch processes access the DB directly
22. 03
Second version
Batch processes
• Batch processes had to
§ Validate authentication and authorization
§ Verify user, service and attribute existence
§ Check equivalent IDs
§ Validate sharing matrix rights
§ Validate values data type
§ Check the update date of the existing values
23. 03
Second version
DB Batch processing
[Meme: "Our DBAs"]
24. 03
Second version
New DB-based batch loading process
• Preprocess the incoming batch file in the BE servers (a sketch of this step follows the list)
§ Validate the format, the existence of services and attributes, and the values' data types
§ Generate an intermediate file with a structure like the target DB table
• Load the intermediate file (Oracle's SQL*Loader) into a temporal table
• Switch the DB to "deferred writing", storing all incoming modifications
• Merge the temporal table into the final table, checking the values' update dates
• Replace the old user attribute values table with the merge result
• Apply the deferred writing operations
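A hedged sketch of the preprocessing step only (illustrative Python; the record layout and validators are hypothetical):

    import csv

    VALIDATORS = {"int": str.isdigit, "str": lambda v: True}
    SERVICE_ATTRS = {"iptv": {"gender": "str"}, "mobile": {"sms_optin": "int"}}

    def preprocess(batch_path, out_path):
        """Validate records and emit rows shaped like the target table."""
        with open(batch_path, newline="") as src, \
             open(out_path, "w", newline="") as dst:
            writer = csv.writer(dst)
            for user_id, service, attr, value, updated in csv.reader(src):
                attrs = SERVICE_ATTRS.get(service)
                if attrs is None or attr not in attrs:
                    continue   # unknown service/attribute (real code would log)
                if not VALIDATORS[attrs[attr]](value):
                    continue   # wrong data type
                writer.writerow([user_id, service, attr, value, updated])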
25. 03
Second version
New batch extraction process
• Generate a temporal DB table with a format similar to the final batch file. Two loops over the user attribute values table are required:
§ Select the format of the table: number and order of columns / attributes
§ Fill the new table
• Loop over the whole temporal table for final formatting (empty fields…)
• From the batch side, loop across the whole table (SELECT * FROM …)
• Write each retrieved row as a line in the resulting file (see the sketch below)
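That last step is a plain cursor-to-file loop. A hedged, driver-agnostic sketch (any Python DB-API connection would do; names are hypothetical):

    def export_table(connection, table, out_path, sep="|"):
        """Stream every row of the temporal table into the batch file."""
        cursor = connection.cursor()
        cursor.execute("SELECT * FROM " + table)   # internal, trusted name
        with open(out_path, "w") as out:
            for row in cursor:                     # DB-API cursors iterate rows
                out.write(sep.join("" if v is None else str(v)
                                   for v in row) + "\n")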
26. 03
Second version performance
Ireland performance requirements
• Batch time window: 3:30 hours
§ Full DWH load
§ Two Delta loads
§ Three Delta extractions
• API:
§ Ireland requirement: < 500ms
27. 03
Second version performance
Ireland
• 1.8M customers, 180 profile attributes, 6 services
• Sizes
§ Tables + indexes size: 65Gb
§ 30% of the size were indexes
§ Temporal table sizes grew almost exponentially: 15Gb and above
§ Intermediate file size: from 700Mb to 7Gb
• Batch
§ Full DWH customer's profile import: 2:30 hours
§ Delta extractions: 1:00 hour
§ Load performance degraded quickly (almost exponentially): up to 6:00 hours
§ Extraction performance proportional to data size
§ Concurrent batch processes could halt the DB
• API:
§ Response time with average traffic: 80ms
§ Response time while loading was unpredictable: > 300ms
29. 04
Third version
Speed up DB Batch processes
[Meme: "Our DBAs (again)"]
30. 04
Third version
New (second) DB-based batch loading process
• Minor preprocessing of the incoming batch file in the BE servers
§ Just validate the format
§ No intermediate file needed!
• Load the validated file (Oracle's SQL*Loader) into a temporal table
• Loop over the temporal table, merging the values into the final table while checking the values' update dates and data types
§ Use several concurrent writing jobs (see the sketch below)
• Store results in the real table: no need to replace it!
• No "deferred writing"!
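The concurrent writing jobs can be pictured as a worker pool, one worker per partition of the temporal table. A hedged sketch (our illustration; the partitioning and merge details are hypothetical):

    from concurrent.futures import ThreadPoolExecutor

    def merge_partition(partition_id):
        """Merge one slice of the temporal table into the final table,
        honouring each value's update date (details elided)."""
        ...

    # Several concurrent jobs, one per partition of the temporal table.
    with ThreadPoolExecutor(max_workers=4) as pool:
        list(pool.map(merge_partition, range(4)))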
31. 04
Third version
Enhancements to the extraction process
• Optimized the loops that generate the temporal output table
§ Use several concurrent writing jobs
§ We achieved a speed-up of between 1.5x and 2x
• Loop over the whole temporal table for final formatting (empty fields…)
• Download and write the lines directly inside Oracle's sqlplus
• No SELECT * FROM … query from the batch side!
32. 04
Third version performance
Ireland
• 1.8M customers, 180 profile attributes, 6 services
• Sizes
§ Tables + indexes size: 65Gb
§ 30% of the size were indexes
§ Temporal tables: 15Gb
• Batch
§ Full DWH customer’s profile import: 1:10 hours (vs. 2:30 hours)
§ Three Delta extractions: 2:15 hours (vs. 3:00 hours)
§ Loads and extractions performance proportional to data size
§ Concurrent batch processes not so harmful
[Meme: "Our DBAs: F**K YEAH"]
• API:
§ Response time with average traffic: 110ms
§ Response time while loading: 400ms
33. 04
Third version performance
United Kingdom
• 25M customers, 150 profile attributes, 15 services
• Sizes
§ Tables + indexes size: 700Gb
§ 40% of the size were indexes
• Batch
§ Two Delta imports: < 2:00 hours
§ Two Delta extractions: < 2:00 hours
§ Loads and extractions performance proportional to data size
• API:
§ Response time with average traffic: 90ms
[Meme: "Our DBAs: F**K YEAH"]
34. 04
Third version performance
Ireland: 3rd version vs. 2nd version
• DB size: 65Gb + 15Gb (temp) vs. 65Gb + > 15Gb
• Full DWH load: 1:10 hours vs. 2:30 hours
• Three Delta exports: 2:15 hours vs. 3:00 hours
• Batch stability: stable, linear vs. unstable, exponential
• API response time: 110ms vs. 110ms
• API while loading: 400ms vs. unpredictable
United Kingdom: 3rd version
• DB size: 700Gb
• Two Delta loads: < 2:00 hours
• Three Delta exports: < 2:00 hours
• API response time: 90ms
[Meme: "Our DBAs: F**K YEAH"]
35. 04
Third version performance
DB stats
• 20 database tables
• API: several queries with up to 35 joins and even some unions
• Authorization: 5 joins to validate auth users access
• Batch:
§ Load: 1,700 lines of PL/SQL
§ Extraction: 1,200 lines of PL/SQL
37. 04
Third version performance
Mexico
• 20M customers, 200 profile attributes, 10 services
• Mexico time window: 4:00 hours
§ Full DWH load!
§ Additional Delta feeds loads
§ At least two Delta extractions
[Meme: "Our DBAs"]
42. 05
MongoDB Data Model
DB stats
• Only 5 collections
• API: typically 2 accesses (services and users collections; see the sketch below)
• Authorization: access only 1 collection to grant access
• Batch: all processing done outside DB
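For illustration only (a hedged pymongo sketch; the real schema and field names are not shown in the deck), a profile read in this model boils down to two lookups:

    from pymongo import MongoClient

    db = MongoClient()["ps"]   # hypothetical database name

    def get_profile(user_id, service_id):
        # Access 1: the service definition (attributes, defaults, sharing)
        service = db.services.find_one({"_id": service_id})
        # Access 2: the user document, embedding all attribute values
        user = db.users.find_one({"ids.v": user_id})
        if service is None or user is None:
            return None
        return {a: user.get("attrs", {}).get(a)
                for a in service.get("attrs", [])}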
43. 05
NoSQL version
High level logical architecture
§ Everything running on Red Hat EL 6.2 64 bits
44. 05
NoSQL version performance
Ireland (at PDI lab)
• 1.8M customers, 180 profile attributes, 6 services
• Sizes
§ Collections + indexes size: 20Gb (vs. 65Gb)
§ < 5% of the size are indexes (vs. 30%)
• Batch
§ Full DWH customer’s profile import: 0:12 hours (vs. 1:10 hours)
§ Three Delta extractions: 0:40 hours (vs. 2:15 hours)
§ Loads and extractions performance proportional to data size
§ Concurrent batch processes ran without affecting performance
• API:
§ Response time with average traffic: < 10ms (vs. 110ms)
§ Response time while loading: the same
§ High load (600 TPS) response time while loading: 300ms
45. 05
NoSQL version performance
United Kingdom (at PDI lab)
• 25M customers, 150 profile attributes, 15 services
• Sizes
§ Collections + indexes size: 210Gb (vs. 700Gb)
§ < 5% of the size were indexes
• Batch
§ Two Delta imports: < 0:40 hours (vs. 2:00 hours)
§ Loads and extractions performance proportional to data size
46. 05
NoSQL version performance
Mexico
• 20M customers, 200 profile attributes, 15 services
• Sizes
§ Collections + indexes size: 320Gb
§ Indexes size: 1.2Gb
• Batch
§ Initial Full import (20M, 40 attributes): 2:00 hours
§ Small Full import (20M, 6 attributes): 0:40 hours
• API:
§ Response time with average traffic: < 10ms (vs. 90ms)
§ Response time while loading: the same
§ High load (500 TPS) response time while loading: 270ms
47. 05
NoSQL version performance
Ireland: NoSQL version vs. SQL version
• DB size: 20Gb vs. 80Gb
• Full DWH load: 0:12 hours vs. 1:10 hours
• Three Delta exports: 0:40 hours vs. 2:15 hours
• API while loading: < 10ms vs. 400ms
• API 600TPS + loading: 300ms vs. timeout / failure
United Kingdom: NoSQL version vs. SQL version
• DB size: 210Gb vs. 700Gb
• Two Delta loads: < 0:40 hours vs. < 2:00 hours
Mexico: NoSQL version
• DB size: 320Gb
• Initial Full load (40 attr): 2:00 hours
• Daily Full load (6 attr): 0:40 hours
• API while loading: < 10ms
• API 500TPS + loading: 270ms
[Meme: "Our DBAs"]
49. 05
The bad
• The batch load process was too fast
§ To keep secondary nodes in sync we needed an oplog of 16 or 24Gb
§ We had to disable journaling for the first migrations
• Document field labels take up disk space
§ We reduced them to just 2 chars: “attribute_id” -> “ai”
• Respect the unwritten law of keeping at least 70% of the data size in RAM
• Take care with compound indexes: order matters (see the sketch below)
§ You can save one index… or you can have problems
§ Put the most important key (never nullable) first
• DBAs whining and complaining about NoSQL
§ “If we had enough RAM for all data, Oracle would outperform MongoDB”
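A hedged pymongo sketch of the compound index tip (the 2-char field names follow the convention above but are otherwise hypothetical):

    from pymongo import ASCENDING, MongoClient

    users = MongoClient()["ps"]["users"]   # hypothetical names

    # Leading with the most important, never-null key ("ui", user id)
    # lets this one compound index also serve queries on "ui" alone,
    # saving a separate single-field index.
    users.create_index([("ui", ASCENDING), ("si", ASCENDING)])

    users.find_one({"ui": "u123"})                  # served (prefix match)
    users.find_one({"ui": "u123", "si": "iptv"})    # served (full match)
    # A query on "si" alone is NOT served and would need its own index.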
50. 05
The ugly
• A second migration once the PS is already running:
§ Full import adding 30 new attribute values: 10:00 hours
§ Full import adding 150 new attribute values: 40:00 hours
• Considerably increasing document size (i.e. adding lots of new values to the users) makes MongoDB rearrange the documents, performing around 5 times slower
§ That's a problem when you are updating 10k documents per second
• Solutions?
§ Avoid this situation at all costs. Run away!
§ Normalize user values; move them to a new, individual collection
§ Preallocate the size with a faux field (see the sketch below)
• You could waste space!
§ Load into a new collection, then merge and swap, like we did in Oracle
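A hedged pymongo sketch of the faux-field trick (our illustration; field names are hypothetical): insert the document padded to roughly its future size, then drop the padding so later growth happens in place (this matters on the MMAPv1-era storage engine, where growing documents get moved).

    from pymongo import MongoClient

    users = MongoClient()["ps"]["users"]   # hypothetical names

    # Insert with a throwaway padding field so space is reserved...
    users.insert_one({"ui": "u123", "attrs": {}, "pad": "x" * 4096})
    # ...then remove the padding; later growth can reuse that space
    # instead of forcing the document to be rewritten elsewhere.
    users.update_one({"ui": "u123"}, {"$unset": {"pad": ""}})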
52. 06
Conclusions & personal thoughts
• Awesome performance boost
§ But not all use cases fit in a MongoDB / NoSQL solution!
• New technology, different limitations
• Fear of the unknown
§ SSDs performance?
§ Long term performance and stability?
• Python + MongoDB + pymongo = fast development
§ I mean, really fast
• MongoDB Monitoring Service (MMS)
• 10gen people were very helpful
55. 0X
SQL Physical architecture
§ Scales horizontally by adding more BE or DB servers, or disks in the SAN
§ Virtualized or physical servers depending on the deployment
56. 0X
MongoDB Physical architecture
§ MongoDB arbiters run on the BE servers
§ Scales horizontally by adding more BE servers or disks in the SAN
§ Sharding may already be configured to scale by adding more replica sets