How companies use NoSQL and Couchbase

Slides from my session at the Global Big Data Conf January 2013

Transcript

  • 1. How companies use NoSQL and Couchbase. Dipti Borkar, Director, Product Management
  • 2. Couchbase Server 2.0: NoSQL Document Database
  • 3. Couchbase NoSQL Open Source Technology
      • Vibrant and growing community focused on distributed database technology
      • Supports both key-value and document-oriented use cases
      • Community contributes via core server and client development, connectors and plugins, usability feedback, docs and forums, etc.
      • All open source project components available under the Apache Public License
      • Packaged software available from Couchbase, Inc.
          - Community and enterprise editions
  • 4. NoSQL + Big Data
      • NoSQL: operational database for web and mobile apps with high performance at scale
      • Big Data: map-reduce against huge datasets to analyze and find insights and answers
  • 5. Couchbase Server
      • Easy Scalability: grow cluster without application changes, without downtime, with a single click
      • Consistent High Performance: consistent sub-millisecond read and write response times with consistent high throughput
      • Always On 24x365: no downtime for software upgrades, hardware maintenance, etc.
      • Flexible Data Model: JSON document model with no fixed schema
  • 6. Flexible Data Model
      Example document: { "ID": 1, "FIRST": "Dipti", "LAST": "Borkar", "ZIP": "94040", "CITY": "MV", "STATE": "CA" }
      • No need to worry about the database when changing your application
      • Records can have different structures; there is no fixed schema
      • Allows painless data model changes for rapid application development
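The point about records with different structures can be illustrated with a minimal sketch. It assumes a plain Python dict as a stand-in for a key-value document store (it is not a Couchbase client), and the second document, the `user::` key prefix, and the `get_field` helper are hypothetical names introduced only for illustration.

```python
import json

# First document mirrors the slide; the second is a made-up record
# that drops some fields and adds a new one (EMAIL).
doc_v1 = {"ID": 1, "FIRST": "Dipti", "LAST": "Borkar",
          "ZIP": "94040", "CITY": "MV", "STATE": "CA"}
doc_v2 = {"ID": 2, "FIRST": "Ada", "LAST": "Lovelace",
          "EMAIL": "ada@example.com"}

# A dict stands in for the document store: key -> JSON text.
store = {}
for doc in (doc_v1, doc_v2):
    store[f"user::{doc['ID']}"] = json.dumps(doc)


def get_field(key, field, default=None):
    """Read one field, tolerating documents that do not carry it."""
    return json.loads(store[key]).get(field, default)
```

Because no schema is enforced, the application decides how to handle a missing field (here, by returning a default) rather than migrating every stored record first.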
  • 7. New in 2.0
      • JSON support
      • Indexing and Querying
      • Incremental Map Reduce
      • Cross data center replication
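The idea behind incremental map-reduce can be sketched as follows. This is only an illustration of the general technique (update a reduced result when one document changes, instead of rescanning everything), not a description of how Couchbase Server implements its views; `map_doc` and `IncrementalCountView` are hypothetical names.

```python
from collections import Counter


def map_doc(doc):
    """Map step: emit (key, value) pairs for one document."""
    if "type" in doc:
        yield doc["type"], 1


class IncrementalCountView:
    """Keeps a per-key sum current without rescanning all documents."""

    def __init__(self):
        self.emitted = {}        # doc_id -> rows emitted by the last map run
        self.totals = Counter()  # reduce result: key -> running sum

    def upsert(self, doc_id, doc):
        # Back out what the previous version of this document contributed.
        for key, value in self.emitted.get(doc_id, []):
            self.totals[key] -= value
        # Map only the changed document and fold its rows into the totals.
        rows = list(map_doc(doc))
        for key, value in rows:
            self.totals[key] += value
        self.emitted[doc_id] = rows

    def query(self, key):
        return self.totals[key]
```

Each write touches only the rows of the document that changed, so the cost of keeping the result current does not grow with the size of the dataset.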
  • 8. Additional Couchbase Server Features
      • Built-in clustering: all nodes equal
      • Data replication with auto-failover
      • Zero-downtime maintenance
      • Built-in managed cache
      • Append-only storage layer
      • Online compaction
      • Monitoring and admin API & UI
      • SDKs for a variety of languages
  • 9. Common Use Cases
      • Social Gaming: Couchbase stores player and game data. Example customers include: Zynga, Tapjoy, Ubisoft, Tencent
      • Ad Targeting: Couchbase stores user information for fast access. Example customers include: AOL, Mediamind, Convertro
      • Session store: Couchbase Server as a key-value store. Example customers include: Concur, Sabre
      • User Profile Store: Couchbase Server as a key-value store. Example customers include: Tunewiki
      • Content & Metadata Store: Couchbase document store with Elastic Search. Example customers include: McGraw Hill
      • Mobile Apps: Couchbase stores user info and app content. Example customers include: Kobo, Playtika
      • High availability cache: Couchbase Server used as a cache tier replacement. Example customers include: Orbitz
      • 3rd party aggregation data: Couchbase stores social media and data feeds. Example customers include: Sambacloud
  • 10. Common Use Cases (same content as slide 9)
  • 11. Use Case: Content and Metadata Store
      Types of Data
      • Content metadata
      • Content: articles, text
      • Landing pages for website
      • Digital content: eBooks, magazines, research material
      Application Requirements
      • Flexibility to store any kind of content
      • Fast access to content metadata (most accessed objects) and content
      • Full-text search across data set
      • Scales horizontally as more content gets added to the system
      Why NoSQL and Couchbase
      • Fast access to metadata and content via object-managed cache
      • JSON provides schema flexibility to store all types of content and metadata
      • Indexing and querying provides real-time analytics capabilities across dataset
      • Integration with ElasticSearch for full-text search
      • Ease of scalability ensures that the data cluster can be grown seamlessly as the amount of user and ad data grows
  • 12. MCGRAW HILL EDUCATION LABS LEARNING PORTAL
  • 13. Use Case: Content and metadata store. Building a self-adapting, interactive learning portal with Couchbase
  • 14. The Problem
      As learning moves online in great numbers, there is a growing need to build interactive learning environments that:
      • Scale to millions of learners
      • Serve MHE as well as third-party content, including open content
      • Support learning apps
      • Self-adapt via usage data
  • 15. The Challenge
      Backend is an Interactive Content Delivery Cloud that must:
      • Allow for elastic scaling under spike periods
      • Catalog and deliver content from many sources
      • Provide consistent low latency for metadata and stats access
      • Support full-text search for content discovery
      • Offer tunable content ranking & recommendation functions
      Experimented with a combination of:
      • XML Databases
      • In-memory Data Grids
      • SQL/MR Engines
      • Enterprise Search Servers
  • 16. The Learning Portal
      • Designed and built as a collaboration between MHE Labs and Couchbase
      • Serves as proof-of-concept and testing harness for Couchbase + ElasticSearch integration
      • Available for download and further development as open source code
  • 17. Techniques Used
      • Document Modeling
      • Metadata & Content Storage
      • View Querying to support Content Browsing
      • Elastic Search Integration (Full Text Search)
          - Content updated in near real-time
          - Search content summaries
          - Relevancy boosted based on user preferences
      • Real-Time Content Updates
      • Event Logging for offline analysis
  • 18. Couchbase 2.0 + Elasticsearch
      1. Store full-text articles as well as document metadata for image, video and text content in Couchbase
      2. Log user behavior to calculate user preference statistics (e.g. video > text)
      3. Continuously accept updates from Couchbase with new content & stats
      4. Combine user preference statistics with custom relevancy scoring to provide personalized search results
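Step 4 on the slide, combining preference statistics with relevancy scoring, could look roughly like the sketch below. The slides do not give the actual scoring formula, so the multiplicative boost, the logarithmic popularity term, and the names `preference_weights`, `personalized_score`, `pref_boost` and `pop_boost` are all assumptions made for illustration.

```python
import math


def preference_weights(view_counts):
    """Turn per-type view counts into fractions that sum to 1."""
    total = sum(view_counts.values())
    if total == 0:
        return {}
    return {doc_type: n / total for doc_type, n in view_counts.items()}


def personalized_score(base_score, doc_type, doc_views, view_counts,
                       pref_boost=1.0, pop_boost=0.1):
    """Scale a text-search score by user preference and popularity.

    base_score  : relevancy score from the full-text engine
    doc_type    : type of the candidate document (e.g. "video")
    doc_views   : total views of the candidate document
    view_counts : this user's running view count per content type
    """
    weights = preference_weights(view_counts)
    pref = 1.0 + pref_boost * weights.get(doc_type, 0.0)
    pop = 1.0 + pop_boost * math.log1p(doc_views)
    return base_score * pref * pop
```

For a user who has viewed three videos and one article, a video with the same base score and view count as an article ranks higher, which matches the "video > text" example on the slide.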
  • 19. Data Model
      Content Metadata Bucket
      • Stores content metadata for media objects and content for articles
      • Includes tags, contributors, type information
      • Includes pointer to the media
      User Profiles Bucket
      • Stores user view details per type
      • Updated every time a user views a doc, with a running count
      • To be used for customizing ES search results per user preference
      Content Stats Bucket
      • Stores content view details
      • Updated every time a document is viewed
      • To be used for boosting ES search results based on popularity
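The three buckets and their update rule can be sketched with in-memory dicts. The field names (`tags`, `contributors`, `media_ref`, `views_by_type`, `views`), the key formats, and the `record_view` helper are hypothetical; the slides describe what each bucket holds but not the exact document layout.

```python
from collections import defaultdict

# Content Metadata bucket: tags, contributors, type, pointer to the media.
content_metadata = {
    "content::42": {
        "type": "video",
        "tags": ["algebra"],
        "contributors": ["MHE"],
        "media_ref": "media/42",  # hypothetical pointer into a media store
    },
}

# User Profiles bucket: running view count per content type, per user.
user_profiles = defaultdict(lambda: {"views_by_type": defaultdict(int)})

# Content Stats bucket: running view count per document.
content_stats = defaultdict(lambda: {"views": 0})


def record_view(user_id, content_id):
    """Update both running counts when a user views a document."""
    doc_type = content_metadata[content_id]["type"]
    user_profiles[user_id]["views_by_type"][doc_type] += 1
    content_stats[content_id]["views"] += 1
```

Keeping per-user and per-document counters in separate buckets lets search personalization read one and popularity boosting read the other.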
  • 20. Architecture (diagram). Labels: App Server; External Media Store (Data Refs); Couchbase Ruby SDK (View Query); ES queries over HTTP (TS Query); Couchbase ES Transport via XDCR; Couchbase Server Cluster (MR Views on each node); Elastic Search Cluster
  • 21. Use Case: Social and Mobile Gaming
      Types of Data
      • User account information
      • User game profile info
      • User's social graph
      • State of the game
      • Player badges and stats
      Application Requirements
      • Ability to support rapid growth
      • Fast response times for awesome user experience
      • Game uptime: 24x7x365
      • Easy to update apps with new features
      Why NoSQL and Couchbase
      • Scalability ensures that games are ready to handle the millions of users that come with viral growth
      • High performance guarantees players are never left waiting to make their next move
      • Always-on operations means zero interruption to game play (and revenue)
      • Flexible data model means games can be developed rapidly and updated easily with new features
  • 22. SOCIAL GAMING AT TENCENT STOMP GAMES
  • 23. Use Case: Social gaming. Building a social game with an awesome user experience that can scale to millions of players
  • 24. The Problem
      Social gaming is all about the experience. The application needs:
      - User-centric data (read: key-value access)
      - Scalability
      - Easy and simple backend
  • 25. The Challenge
      Backend must be a platform for multiple games:
      • Must be scalable
      • Highly available
      • Extreme performance (latency and throughput)
      • Cost effective
      • Operationally easy to maintain
      Experimented with several databases:
      • Couchbase
      • dbShards
      • MongoDB
      • MySQL Cluster
  • 26. Evaluation considerations
      Databases compared: Couchbase, MongoDB, dbShards, MySQL Cluster (NDB)
      Criteria: Sharding strategy; Replication; Failover support; Scalability; Customized data support; System compatibility; Coding effort; Performance; Protocol; Upgrade difficulty; Data persisting method; Map Reduce / Join; SQL compatible; Licensing; Price; Bulk price; Management / monitor tool; Hardware requirement; Supported OS; Operation knowledge; Operation training; Operation difficulty; Developer company size; Market penetration; Support; Successful use cases
  • 27. The architecture
  • 28. Use Case: High availability caching
      Types of Data
      • Application objects
      • Popular search query results
      • Session information
      • Heavily accessed web landing pages
      Application Requirements
      • Consistently low response times for document / key lookups
      • High availability: 24x7x365
      • Operationally easy to migrate / upgrade / maintain with app online
      • Replacement for entire caching tier
      Why NoSQL and Couchbase
      • Low latency in sub-milliseconds with consistently high read / write throughput
      • Always-on operations even for database upgrades and maintenance with zero downtime
      • memcached compatibility for easy migration to Couchbase without any application changes
      • High availability and disaster recovery with intra-cluster and cross-cluster replication (XDCR)
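The caching use case typically follows a cache-aside pattern, sketched below. `InMemoryCache` is a hypothetical stand-in that only mimics memcached-style `get`/`set` with an expiry; it is not a real memcached or Couchbase client, and `cached_lookup` and `load` are illustrative names. With a memcached-compatible server, application code shaped like `cached_lookup` is the part that would stay unchanged.

```python
import time


class InMemoryCache:
    """Stand-in with memcached-style get/set; not a real client."""

    def __init__(self):
        self._data = {}

    def set(self, key, value, expire=0):
        # expire=0 means the entry never expires, as in memcached.
        deadline = time.monotonic() + expire if expire else None
        self._data[key] = (value, deadline)

    def get(self, key):
        item = self._data.get(key)
        if item is None:
            return None
        value, deadline = item
        if deadline is not None and time.monotonic() >= deadline:
            del self._data[key]
            return None
        return value


def cached_lookup(cache, key, load, expire=60):
    """Cache-aside read: try the cache, fall back to load(key) on a miss."""
    value = cache.get(key)
    if value is None:
        value = load(key)
        cache.set(key, value, expire=expire)
    return value
```

Only the first lookup for a key pays the cost of `load`; later lookups within the expiry window are served from the cache.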
  • 29. Use Case: Ad Targeting
      Types of Data
      • User profile: preferences and psychographic data
      • Ad serving history by user
      • Ad buying history by advertiser
      • Ad serving history by advertiser
      Application Requirements
      • High performance to meet limited ad serving budget; time allowance is typically <40 msec
      • Scalability to handle hundreds of millions of user profiles and rapidly growing amount of data
      • 24x7x365 availability to avoid ad revenue loss
      Why NoSQL and Couchbase
      • Sub-millisecond reads/writes means less time is needed for data access, more time is available for ad logic processing, and more highly optimized ads will be served
      • Ease of scalability ensures that the data cluster can be grown seamlessly as the amount of user and ad data grows
      • Always-on operations = always-on revenue. You will never miss the opportunity to serve an ad because of downtime.
  • 30. Market Adoption – Customers: Internet Companies; Enterprises
  • 31. www.couchbase.com/download
  • 32. Thank you! Get Couchbase Server 2.0: http://www.couchbase.com/download dipti@couchbase.com @dborkar
  • 33. Q&A