MongoDB with RDBMS for a Portal Application

This webinar highlighted a real world case study of a leading networking company which leveraged MongoDB features such as Dynamic schema, Optimized storage engine (WiredTiger), Multi-geo replication and Tag-aware sharding to deliver 5x improvement in application response time and enhanced security.

Published in: Technology

  1. CIGNEX Datamatics Confidential www.cignex.com
     Webinar: MongoDB with RDBMS for a Portal Application
     To achieve Performance, Scalability and Data Privacy
     Date: 30th Sept 2015
     Presenters:
     • Nikhil Naib, Big Data Solution Architect, CIGNEX Datamatics
     • Nirav Shah, Sr. Director – Marketing & Corporate Communication, CIGNEX Datamatics
  2. CIGNEX Datamatics: Established in 2000, USA | UK | India
     8 Open Source Products, #1 Pure Play Open Source Services Company, 14 Open Source Books Authored, Global Offices 13+, Business Engagement Platforms 5+, Open Source Community Contributions 5000+, Open Source Implementations 500+, Open Source Consultants 500+
     Solutions We Provide:
     • Portals, Content & Collaboration: Portals, Enterprise Integration, Identity Relationship Management
     • Enterprise Content Management: Document & Web Content Management, Learning/Knowledge Management, Imaging and Scanning - OCR/Digitization, Enterprise & NLP Search, BPM/Workflow
     • E-Commerce: B2B, B2C
     • Internet of Things (IoT)
     • Big Data Analytics: Data Integration, Information Delivery, Data Analysis
     Business Engagement Platforms:
     • Panoramyx™ Big Data Blueprint Platform
     • Vitalstatistyx™ IoT Platform
     • DEEP™ Digital Employee Engagement Platform
     • RMP™ Reputation Management Platform
     • FMP™ Franchise Management Platform
  3. Nikhil Naib
     As a solution architect, Nikhil identifies the best-fit technology stack, aligned with the business needs of all stakeholders and the development team. With 12+ years of experience, he has delivered finished products and blueprints (POCs) with quick turnaround times. As a MongoDB Certified DBA, Nikhil has hands-on experience with 7 medium to large scale MongoDB implementations for solutions such as e-commerce, content management and reputation management. Nikhil also enjoys delivering training on different Big Data technologies.
  4. Today’s Topics
     • Key Challenges of Enterprise Portal Application
     • MongoDB – The Leading NoSQL Database
     • Case Study: Global e-learning Platform
       – Solution architecture
       – Approach & Best Practices: Storage Engine, Schema Design, Data Migration, Sharding, Performance & Monitoring
       – Benefits
     • Best Practices & Learning – Augmentation with MongoDB
     • Q & A
  5. Key Challenges of Enterprise Portal Application
     Does your application face any of the following challenges?
     • Analytical & operational processing on the same data, reducing performance
     • Scaling according to business needs
     • Proprietary database with higher TCO
     • Global application with geography-specific data
     • Thousands to millions of queries/sec (reads & writes)
     • Agile application rollouts
  6. Solution: Augment SQL with NoSQL
     SQL Databases vs. NoSQL Databases:
     • Types: SQL has one type (with minor variations); NoSQL has many (key-value stores, document databases, wide-column stores, graph databases)
     • Schema Design: SQL defines structure and data types in advance; NoSQL is dynamic
     • Scalability: SQL scales vertically; NoSQL scales horizontally
     • Data Querying: SQL uses Select, Insert, and Update statements; NoSQL uses object-oriented APIs
     • Examples & more
  7. MongoDB – The Leading NoSQL Database
     • Reduces operational overhead up to 95%
     • Auto-sharding with global distribution
     • Up to 50 replica set members
     • 7-10x better write performance
     • Up to 80% less storage with compression
     • 5,000,000+ downloads
     • 600+ customers
     • Deployment automation
     • Integrated caching
     • Dynamic schema design
     Source: www.mongodb.com
  8. Augment Your Portal Application Database with MongoDB
     Augmented Database Architecture (diagram):
     • Applications: Portals, Content & Collaboration; Enterprise Content Management; Big Data Analytics; e-Commerce Portals
     • Data stores: the application’s RDBMS database alongside MongoDB shards (Shard 1 ... Shard n)
     • Infrastructure: OS & virtualization, multi-data center deployment
     • Cross-cutting: Monitoring & Management, Security & Auditing
  9. CIGNEX Datamatics’ MongoDB Solutions
     • Single View of “X” (Customer, Employee, Partner and more)
     • Internet of Things
     • Product Catalogue
     • Data Hub
     • Personalization
     • User Data Management
     • Reputation Management
     • Social Listening
     • Content Management & Delivery
  10. Case Study: Global e-Learning Platform
      • Efficient User Data Management
      • 5x Improvement in Content Management & Delivery
      • Data Privacy with Geographically Distributed Data
  11. Client Overview
      • Large networking company that designs, manufactures, and sells networking equipment
        – Group: Corporate Affairs division, which invests in scalable and self-sustaining programs that use technology to meet some of society's biggest challenges
      • CSR Program: An e-learning portal offering an IT skills and career building program to learning institutions and individuals worldwide
        – 2M+ students across 160+ countries
        – 20,000 instructors
        – 146 million online exams conducted so far
  12. Challenges and Proposed Augmentation with MongoDB (NoSQL)
      Challenges with existing RDBMS:
      • Restricted scalability
      • Performance issues (6-16 sec response time)
      • Complex queries (organization-to-user relationship takes 3 to 4 table joins)
      • Highly normalized schema
      Proposed augmentation:
      • Optimized storage engine with highly de-normalized data using embedded documents
      • Replication & sharding to support horizontal scalability & high availability
      • Document-oriented database with dynamic schema supporting many data types
      • Tag-aware sharding, bringing geo-awareness to app data to comply with data privacy laws and reduce network latency
  13. Proposed Portal Application Architecture (diagram)
      Global Learning & Knowledge Sharing Platform:
      • Liferay Portal Application Server
      • Jersey – RESTful Web Services in Java
      • PostgreSQL – Liferay application data (custom tables & fields)
      • MongoDB – user data storage & processing: MongoS, Configuration Server, Mongod Geo 1, Mongod Geo 2, Mongod Geo 3
  14. Approach – Augmenting Application RDBMS with MongoDB
      Steps: Storage Engine, Schema Design, Data Migration, Sharding, Performance & Monitoring
      Storage Engine: MongoDB 3.0 Data Storage Engine - WiredTiger
  15. Approach – Augmenting Portal Application’s RDBMS with MongoDB: Storage Engine
      Why WiredTiger?
      1. Document-level locking
      2. Data compression (up to 80% with the Snappy algorithm and up to 90% with zlib; index compression up to 50%)
      3. 7x-10x higher throughput than the previous version
      4. Ability to saturate all CPU cores, and to store indexes and data on separate mounts for optimum utilization of IOPS
      5. 100% backwards compatible
      6. Non-disruptive upgrade (no downtime during migration)
      Best Practices:
      1. Use the XFS file system with WiredTiger, as there are known issues with WiredTiger on ext4
      2. Stay up to date with minor version releases of 3.0, as they carry important fixes for both WiredTiger & sharding
  16. Approach – Augmenting Portal Application’s RDBMS with MongoDB: Schema Design
      Approach:
      1. Understand the functionality of each Liferay portlet.
      2. Understand external data points and the nature of the data (structured & unstructured).
      3. Understand the RDBMS schema design, including each table and its fields.
      4. Understand the SQL queries covering all CRUD operations, along with triggers, views, cursors and stored procedures.
      5. Create a schema for the MongoDB collections.
  17. Approach – Augmenting Portal Application’s RDBMS with MongoDB: Schema Design

      PostgreSQL Schema:

          create table LOCATION_Country (
              id_ LONG not null primary key,
              name VARCHAR(75) null,
              ageLimit INTEGER,
              coppaAgeLimit INTEGER,
              isoCountryCode VARCHAR(75) null,
              verified BOOLEAN,
              embargo BOOLEAN
          );

          create table LOCATION_State (
              id_ LONG not null primary key,
              name VARCHAR(75) null,
              isoStateCode VARCHAR(75) null,
              isoCountryCode VARCHAR(75) null,
              verified BOOLEAN
          );

          create table LOCATION_City (
              id_ LONG not null primary key,
              name VARCHAR(100) null,
              displayName VARCHAR(100) null,
              isoStateCode VARCHAR(75) null,
              isoCountryCode VARCHAR(75) null,
              population LONG,
              latitude VARCHAR(75) null,
              longitude VARCHAR(75) null,
              verified BOOLEAN
          );

      MongoDB Schema (collection: location; one document shape each for country, state and city):

          {
              _id : ObjectID generated by MongoDB,
              displayCountryId : "id_ from location_country",
              displayCountryName : "name from location_country",
              ageLimit : "agelimit from location_country",
              coppaAgeLimit : "coppaagelimit from location_country",
              isoCountryCode : "isocountrycode from location_country",
              countryVerificationStatus : "verified from location_country",
              embargo : "embargo from location_country"
          }

          {
              _id : ObjectID generated by MongoDB,
              displayCountryId : "id_ from location_country",
              displayStateId : "id_ from location_state",
              displayStateName : "name from location_state",
              isoStateCode : "isostatecode from location_state",
              stateVerificationStatus : "verified from location_state"
          }

          {
              _id : ObjectID generated by MongoDB,
              countryId : "id_ from location_country",
              stateId : "id_ from location_state",
              cityId : "id_ from location_city",
              cityName : "name from location_city",
              displayName : "displayname from location_city",
              population : "population from location_city",
              latitude : "latitude from location_city",
              longitude : "longitude from location_city",
              cityVerificationStatus : "verified from location_city"
          }
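
      The mapping from the three location tables to a single collection can be sketched in a few lines. This is only an illustration, assuming rows arrive as plain dictionaries keyed by the SQL column names; the function name `build_location_docs` and the sample values in the usage note are hypothetical, and the webinar's actual tooling was Java based. The point it shows is that the country and state identifiers are resolved once, at migration time, so reads no longer need joins.

```python
from typing import Any, Dict, List

Row = Dict[str, Any]


def build_location_docs(countries: List[Row], states: List[Row], cities: List[Row]) -> List[Row]:
    """Turn rows from the three LOCATION_* tables into documents for one collection.

    _id is left out on purpose: MongoDB generates it on insert.
    """
    # Lookup tables replace the joins the RDBMS would perform at query time.
    country_ids = {c["isoCountryCode"]: c["id_"] for c in countries}
    state_ids = {(s["isoCountryCode"], s["isoStateCode"]): s["id_"] for s in states}

    docs: List[Row] = []
    for c in countries:
        docs.append({
            "displayCountryId": c["id_"],
            "displayCountryName": c["name"],
            "ageLimit": c["ageLimit"],
            "coppaAgeLimit": c["coppaAgeLimit"],
            "isoCountryCode": c["isoCountryCode"],
            "countryVerificationStatus": c["verified"],
            "embargo": c["embargo"],
        })
    for s in states:
        docs.append({
            "displayCountryId": country_ids[s["isoCountryCode"]],
            "displayStateId": s["id_"],
            "displayStateName": s["name"],
            "isoStateCode": s["isoStateCode"],
            "stateVerificationStatus": s["verified"],
        })
    for t in cities:
        docs.append({
            "countryId": country_ids[t["isoCountryCode"]],
            "stateId": state_ids[(t["isoCountryCode"], t["isoStateCode"])],
            "cityId": t["id_"],
            "cityName": t["name"],
            "displayName": t["displayName"],
            "population": t["population"],
            "latitude": t["latitude"],
            "longitude": t["longitude"],
            "cityVerificationStatus": t["verified"],
        })
    return docs
```

      Calling `build_location_docs` with one country, one state and one city row returns three documents, and the city document carries both parent identifiers directly.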
  18. Approach – Augmenting Portal Application’s RDBMS with MongoDB: Schema Design
      Best Practices & Learning:
      1. Analyze the data access patterns of the application.
      2. Define indexes by identifying common queries.
      3. Use the explain method & the MongoDB profiler for query optimization.
      4. Create the collections so as to de-normalize the schema for optimal performance.
      5. Reconsider the schema design for a collection once the number of indexes on it reaches 10.
      6. Carefully design and tune the connection pooling strategy for your application.
      7. There is no support for transactions in MongoDB per se, but workarounds are available (https://docs.mongodb.org/v3.0/tutorial/perform-two-phase-commits/)
  19. Approach – Augmenting Portal Application’s RDBMS with MongoDB: Data Migration
      Approach:
      1. Create SQL queries for the migration scripts.
      2. Use the MongoDB Java Driver to interact with MongoDB & the JDBC driver to talk to PostgreSQL.
      3. Execute the migration scripts against the RDBMS and fill the MongoDB collections.
      4. Host the RDBMS on a read-optimized instance to expedite the execution of migration queries.
      Diagram: a Java-based custom ETL tool reads from the application RDBMS database and external data sources (social media, Salesforce).
  20. Approach – Augmenting Portal Application’s RDBMS with MongoDB: Data Migration
      Best Practices:
      1. Use the Bulk API of MongoDB for bulk ingestion.
      2. Leverage Java’s support for multithreading for concurrent inserts to reduce the data migration time.
      3. The migration process should be fault tolerant: after a failure it should resume from where it left off, NOT from scratch.
      4. Reuse infrastructure by deploying the migration scripts on the same server as the services layer.
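
      The three technical practices above (bulk writes, concurrent inserts, resumable runs) fit together as in the sketch below. It is a minimal illustration in Python rather than the Java tool the presenters describe, and every name in it (`migrate`, `write_batch`, the JSON checkpoint file) is an assumption made for the example. In a real run `write_batch` would wrap the driver's bulk write call; here it is any callable that accepts a list of rows.

```python
import json
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path
from typing import Any, Callable, List, Sequence, Set


def load_checkpoint(path: Path) -> Set[int]:
    """Return the indexes of batches that a previous run already wrote."""
    if path.exists():
        return set(json.loads(path.read_text()))
    return set()


def migrate(rows: Sequence[Any],
            write_batch: Callable[[List[Any]], Any],
            checkpoint: Path,
            batch_size: int = 1000,
            workers: int = 4) -> int:
    """Write rows in batches on a thread pool, skipping batches already done.

    Returns the number of batches written by this call.
    """
    batches = [list(rows[i:i + batch_size]) for i in range(0, len(rows), batch_size)]
    done = load_checkpoint(checkpoint)
    pending = [i for i in range(len(batches)) if i not in done]

    def run(index: int) -> int:
        write_batch(batches[index])  # one bulk write per batch
        return index

    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Results are consumed in the calling thread, so the checkpoint
        # file is only ever written from one place.
        for index in pool.map(run, pending):
            done.add(index)
            checkpoint.write_text(json.dumps(sorted(done)))
    return len(pending)
```

      If a batch fails, batches that finished after it may not yet be recorded and will be written again on the next run, so this gives at-least-once delivery. Writes should therefore be idempotent, for example by keying each document on its source primary key.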
  21. Approach – Augmenting Portal Application’s RDBMS with MongoDB: Sharding
      Approach: Use “Tag Aware” sharding, which brought geo-awareness to the application data.
      The ideal shard key:
      1. Has high cardinality, which makes it easy for MongoDB to split the chunks
      2. Has higher “randomness”
      3. Supports targeted queries
      4. May need to be computed
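
      To make the shard key criteria and tag ranges concrete, here is a small model of the idea in Python. It does not call MongoDB; it only imitates how a tag range (lower bound inclusive, upper bound exclusive) maps a shard key to a tag. The region codes, tag names and the choice of a region prefix plus a hashed user id are assumptions made for this illustration, not the key used in the case study.

```python
import hashlib
from typing import List, Optional, Tuple

ShardKey = Tuple[str, str]
TagRange = Tuple[ShardKey, ShardKey, str]  # (min inclusive, max exclusive, tag)


def computed_shard_key(region: str, user_id: int) -> ShardKey:
    """Build a computed key: the region prefix keeps queries targeted and lets
    ranges be tagged per geography; the hash adds cardinality and randomness."""
    digest = hashlib.md5(str(user_id).encode("utf-8")).hexdigest()
    return (region, digest)


# Hypothetical tag ranges. "g" sorts after every hexadecimal digit, so each
# range covers all hash values for its region.
TAG_RANGES: List[TagRange] = [
    (("APAC", ""), ("APAC", "g"), "geo-apac"),
    (("EU", ""), ("EU", "g"), "geo-eu"),
    (("US", ""), ("US", "g"), "geo-us"),
]


def tag_for(key: ShardKey, ranges: List[TagRange] = TAG_RANGES) -> Optional[str]:
    """Return the tag whose range contains the key, or None if untagged."""
    for low, high, tag in ranges:
        if low <= key < high:
            return tag
    return None
```

      With these ranges, every key built for region "EU" resolves to the "geo-eu" tag, so the balancer would keep those chunks on shards carrying that tag, while a key for an unlisted region is untagged and may live on any shard.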
  22. Approach – Augmenting Portal Application’s RDBMS with MongoDB: Sharding (deployment diagram)
      • App tier: load balancer; one app server with a mongos per geography (Geography 1, Geography 2, ... Geography n); PostgreSQL – Liferay RDBMS; config server
      • Data tier, Geography 1: Geo 1 mongod primary, Geo 1 mongod secondary, Geo 2 secondary, Geo 3 secondary, arbiter
      • Data tier, Geography 2: Geo 2 mongod primary, Geo 2 mongod secondary, Geo 1 secondary, Geo 3 secondary, arbiter
      • Data tier, Geography n: Geo n mongod primary, Geo n mongod secondary, Geo 1 secondary, Geo 2 secondary, Geo n-1 secondary, arbiter
  23. Approach – Augmenting Portal Application’s RDBMS with MongoDB: Sharding
      Best Practices:
      1. Plan for sharding well in advance and test it with production-like workloads to see the impact.
      2. Involve the network administration team while planning for sharding, as they are the go-to people for cross-data-center connectivity issues.
      3. Deploy the shard router (mongos) on the application server to save one call over the network.
      4. Balance the non-sharded collections across the different shards so that all shards receive the same amount of traffic.
      5. Ensure that indexes fit in RAM: index size < RAM.
      6. Use the appropriate write concern and read preference for the use case. Choosing the appropriate read preference helps to scale reads and to deal with network latency.
      7. Replica set tag sets, combined with the appropriate write concern & read preference, go a long way toward addressing data privacy concerns.
  24. Approach – Augmenting Portal Application’s RDBMS with MongoDB: Performance Test Results

      Functionality          | Only RDBMS | Augmented RDBMS with MongoDB
      Portal Dashboard       | ~16 sec    | ~3-4 sec
      User Enrollment        | ~17 sec    | ~2-3 sec
      User Profile           | ~10 sec    | ~3 sec
      Instructor Dashboard   | ~13 sec    | ~3 sec
      Course Assignment      | ~12 sec    | ~2-3 sec
      Search                 | ~29 sec    | ~3-4 sec

      Note: Performance depends on many parameters such as network latency, server configuration, number of concurrent users and more.
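
      As a rough cross-check of the headline "5x" figure, the speedup per function can be computed from the numbers on this slide. The sketch below assumes the midpoint of each reported range as the "after" time, which is a simplification of approximate figures; the helper name `speedups` is hypothetical.

```python
from statistics import median
from typing import Dict, Tuple

# (seconds with RDBMS only, (low, high) seconds with MongoDB), taken from the slide.
RESULTS: Dict[str, Tuple[float, Tuple[float, float]]] = {
    "Portal Dashboard": (16, (3, 4)),
    "User Enrollment": (17, (2, 3)),
    "User Profile": (10, (3, 3)),
    "Instructor Dashboard": (13, (3, 3)),
    "Course Assignment": (12, (2, 3)),
    "Search": (29, (3, 4)),
}


def speedups(results: Dict[str, Tuple[float, Tuple[float, float]]]) -> Dict[str, float]:
    """Return before/after ratios, using the midpoint of each 'after' range."""
    ratios = {}
    for name, (before, (low, high)) in results.items():
        after = (low + high) / 2
        ratios[name] = before / after
    return ratios


if __name__ == "__main__":
    ratios = speedups(RESULTS)
    for name, ratio in ratios.items():
        print(f"{name}: {ratio:.1f}x")
    print(f"median: {median(ratios.values()):.1f}x")
```

      Under that midpoint assumption the ratios range from a little over 3x (User Profile) to a little over 8x (Search), with the median just under 5x.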
  25. Approach – Augmenting Portal Application’s RDBMS with MongoDB: Performance & Monitoring
      • Cluster Management & Monitoring Approach
        – MMS - MongoDB Monitoring and Management Service Tool
      • Automation
        – Provision, Upgrade & Scale
      • Backups
        – Continuous backup
        – Point-in-Time Recovery
      • Monitoring
        – Dashboard with alerts
  26. Approach – Augmenting Portal Application’s RDBMS with MongoDB: Performance & Monitoring
      MongoDB MMS – Dashboard
  27. Benefits - Delivered with CIGNEX Datamatics Expertise
      • Performance
        – Web page response time reduced from 6-16 sec to < 2 sec
        – Migration of 500GB+ data completed in 4 hours
        – Geo-based tag-aware sharding reduces network latency, as the data can stay close to the application server
      • Scalability
        – Auto-sharding to accommodate new geographies
      • Data Privacy
        – Geo-based tag-aware sharding to comply with the data privacy laws of different countries/zones
      • Lower TCO
        – Open source technology reduced licensing costs and vendor dependency, and out-of-the-box features accelerated development
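
      For capacity planning, the migration figure above translates into a sustained throughput. The short calculation below assumes decimal units (1 GB = 1000 MB) and treats 500 GB as the total, so the result is a lower bound given the "500GB+" wording; the function name is hypothetical.

```python
from typing import Tuple


def migration_throughput(total_gb: float, hours: float) -> Tuple[float, float]:
    """Return (GB per hour, MB per second) for a migration of total_gb in hours."""
    gb_per_hour = total_gb / hours
    mb_per_second = total_gb * 1000 / (hours * 3600)
    return gb_per_hour, mb_per_second


if __name__ == "__main__":
    gb_h, mb_s = migration_throughput(500, 4)
    print(f"{gb_h:.0f} GB/hour, about {mb_s:.1f} MB/s sustained")
```

      That is 125 GB per hour, or roughly 35 MB/s sustained across all migration threads.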
  28. Best Practices & Learning – Augmentation with MongoDB
      MongoDB scales & shines!
      1. Identify indexes carefully. A larger number of indexes can bring down write throughput.
      2. Plan early for sharding. DO NOT go to production without benchmarking the shard key.
      3. Do not forget to set ulimit & noatime. They provide significant performance gains.
      4. Use the Bulk API of MongoDB. It makes bulk ingestion easy.
      5. Use Java’s support for multithreading. Concurrent inserts reduce the data migration time.
      6. Use MongoDB Ops Manager. Monitoring & managing a sharded cluster is a painful process without Ops Manager.
      7. Use SSDs & RAID-10. They provide excellent throughput.
      8. Use MongoDB Enterprise Edition. It provides excellent support and high-class security features.
  29. CIGNEX Datamatics - Big Data Analytics & IoT Case Studies
      Improve performance through real-time intelligence by efficient device management & issue identification | GPS Services Company | Networking Company | Increase customer satisfaction & revenue due to uninterrupted video experience anywhere anytime on any device | Modernization of legacy Quote Portal resulting in competitive advantage – Quote in 5 minutes | Insurance Company | First mover advantage with timely launch of Sentiment and Trending Analysis service | SaaS Start-up Company | B2B Market Intelligence Services | 100% Increase in Conversion Rate with Single View of Business and Market Intelligence | E-Learning Community Portal | 5x Efficient User Data Management with Improved application performance and data security
  30. CIGNEX Datamatics Big Data Analytics Expertise (Big Data Platforms, Business Intelligence Expertise)
      Team Size: 70+
      • 70+ Certified Global Consultants on various Big Data Technologies
      • Partnership - MongoDB Advanced Partner, Talend Gold Partner, Cloudera Authorized Partner
      • Frameworks
        – Panoramyx™
        – Vitalstatistyx™
      • Platforms
        – Reputation Management Platform™
        – Findability Platform™
      • Services
        – OpeRA™ - Feasibility Study/Assessment, Workshops
        – Big Data Consulting (MongoDB, Hadoop, Talend)
        – Development to Production
        – Support & Maintenance
  31. CIGNEX Datamatics: Big Data Analytics & IoT Expertise (Systems of Insight)
      • Panoramyx™: Hybrid Data Lake Architecture; Data “Blending” & “Enrichment”; Key “Analytics Engines”; Single “Panoramic” view of X (X = Customer, Employee, Partner and more)
      • OpeRA™: Open Source Readiness Assessment for Big Data Analytics; Blueprint for Legacy Modernization; Migration/Adoption Guidelines; Recommendations for TCO Reduction, Performance Optimization
      • Vitalstatistyx™: Internet of Things (IoT) Reference Architecture; APIs Integration with Sensors, Devices; Hardware Vendor connect; Real-time & Predictive analytics with Reports/Dashboard
      • Analytics Engines: Powerful, Flexible Analytics “Engines” & Point Solutions; Findability Platform (Social Listening & Personality Insights); Reputation Management Platform
  32. Q & A
      1) Quick Assessment – http://operaonline.cignex.com
      2) Test Drive Big Data Analytics: Engage us for Proof-of-Concept (PoC) @ US5K
  33. Thank you
      Contact Us: Sales: sales@cignex.com | Jobs: jobs@cignex.com | Others: info@cignex.com
      facebook.com/CIGNEXTechnologies | youtube.com/cignexglobal | twitter.com/cignex | www.cignex.com
