A Zoom on Membase
         Dedicated to VNG
                       Viet-Trung TRAN
        ENS Cachan, INRIA/IRISA France




1       www.trungtv.com   19/06/11
What’s Membase
  A key/value store
         Simple, fast, elastic
  Membase’s API is simple, but not simpler
         SET(key, value)
         Value = GET(key)
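
A minimal sketch of the two-primitive API shape, assuming an in-memory dictionary as the backing store. `ToyKVStore` is a hypothetical name used only for illustration; it is not the Membase client library.

```python
from typing import Dict, Optional


class ToyKVStore:
    """Hypothetical in-memory stand-in for a key/value store (not a Membase client)."""

    def __init__(self) -> None:
        self._data: Dict[str, bytes] = {}

    def set(self, key: str, value: bytes) -> None:
        # SET(key, value): store or overwrite the value under the key.
        self._data[key] = value

    def get(self, key: str) -> Optional[bytes]:
        # Value = GET(key): return the stored value, or None if the key is absent.
        return self._data.get(key)


store = ToyKVStore()
store.set("user:42", b"alice")
value = store.get("user:42")
```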




Where’s Membase
  SQL database? No
         No complex queries, no-schema, no ACID
  NoSQL
         Non-relational, distributed and HORIZONTALLY scalable
         Key/value store
              Dynamo, Membase, Voldemort, Riak, Redis, etc.
         Column-oriented store
              BigTable, HBase, Cassandra, etc.
         Document store
              MongoDB, CouchDB, Terrastore, etc.
         Array-oriented store
              Pyramid, SciDB

Why NoSQL
  For over 40 years, the RDBMS was the default choice
         So good but so COMPLEX
         Hard to SCALE
  2005: “One size fits all”: an idea whose time has come and gone
  Called for a “Scale OUT” design
         Cheap, easy
  Why Membase
         Membase = So-called Memcached + persistent storage
         Membase = A Distributed caching system + persistent storage


Why Membase
  Membase = So-called Memcached + persistent storage
  Membase = A Distributed caching system + persistent storage
  Membase speaks the Memcached protocol




MEMBASE = SIMPLE, FAST, ELASTIC
  Simple
         2 primitives: GET(key), SET(key, value)
  Fast
         Cost for I/O routing: O(1)
         Give me a key, I know exactly where to go
  Elastic
         Scale UP and DOWN freely
         Scale from 1 to thousands of machines
         Fault-tolerance




Membase deployment




Data flow
  Map(Key, vbucket)
  Map(vbucket, node)
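
The two-level mapping above can be sketched as follows. This is an illustration under stated assumptions, not Membase's actual implementation: the CRC32-modulo hash, the 1024-vbucket count, the round-robin assignment, and the function names are all chosen for the example.

```python
import zlib
from typing import List

NUM_VBUCKETS = 1024  # assumption: illustrative vbucket count


def key_to_vbucket(key: str, num_vbuckets: int = NUM_VBUCKETS) -> int:
    # Map(key, vbucket): hash the key to a fixed vbucket id in O(1).
    return zlib.crc32(key.encode("utf-8")) % num_vbuckets


def build_vbucket_map(nodes: List[str], num_vbuckets: int = NUM_VBUCKETS) -> List[str]:
    # Map(vbucket, node): a lookup table, filled round-robin here for simplicity.
    return [nodes[vb % len(nodes)] for vb in range(num_vbuckets)]


def locate(key: str, vbucket_map: List[str]) -> str:
    # Two constant-time steps: key -> vbucket, then vbucket -> node.
    return vbucket_map[key_to_vbucket(key, len(vbucket_map))]


vbucket_map = build_vbucket_map(["node-a", "node-b", "node-c"])
owner = locate("user:42", vbucket_map)
```

Because keys are bound to vbuckets rather than to nodes, moving data means changing one table entry per migrated vbucket; the key-to-vbucket hash never changes.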




Data flow [cont’]
  Internal data flow + replication scheme




Membase arch
  Symmetric design: identical software on every node
      Data management
      Membership management




Thoughts on Membase

                             Personal view



Membase’s design choices
  CAP theorem: Pick 2 out of 3
      Consistency
      Availability
      Partition tolerance
  Membase is CA
  Do we really need strong consistency?




Strong consistency
  Pessimistic replication may be costly
      A write blocks until the data is completely replicated
  A single master node coordinates reads and writes
      Lower I/O performance under concurrency
  Synchronous replication scheme
      If one replica fails, the I/O fails
  Proposal: use different consistency models depending on the application
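
The trade-off above can be sketched with two write paths. This is a toy model under assumptions: `Replica`, `write_sync`, and `write_async` are hypothetical names, and a failed replica is simulated with a flag rather than a real network error.

```python
from typing import Dict, List, Tuple


class Replica:
    """Hypothetical replica that can be marked unhealthy to simulate a failure."""

    def __init__(self, name: str, healthy: bool = True) -> None:
        self.name = name
        self.healthy = healthy
        self.data: Dict[str, str] = {}

    def apply(self, key: str, value: str) -> None:
        if not self.healthy:
            raise ConnectionError(f"replica {self.name} is unreachable")
        self.data[key] = value


def write_sync(replicas: List[Replica], key: str, value: str) -> None:
    # Synchronous scheme: the caller blocks until every replica has applied
    # the write; the first failing replica fails the whole operation.
    # Replicas visited before the failure keep the value (no rollback here).
    for replica in replicas:
        replica.apply(key, value)


def write_async(master: Replica, others: List[Replica],
                key: str, value: str) -> List[Tuple[Replica, str, str]]:
    # Relaxed scheme: acknowledge once the master has the write and return
    # the replication work still pending for the other replicas.
    master.apply(key, value)
    return [(replica, key, value) for replica in others]
```

With one unhealthy replica, `write_sync` raises while `write_async` still succeeds on the master, which is the availability-versus-consistency choice the proposal suggests making per application.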




Data migration & replication
  LRU algorithm
  Replication factor is configurable per (key, value)?
      Vbucket
  Re-replication in case of failure?
      “Anti-entropy” replica synchronisation?
  Proposal: application-aware migration works best
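
The slide names LRU without further detail. Assuming it refers to choosing which items leave memory first, a generic least-recently-used cache can be sketched as below; `LRUCache` is a hypothetical class and says nothing about how Membase implements eviction internally.

```python
from collections import OrderedDict
from typing import Any, Optional, Tuple


class LRUCache:
    """Generic LRU cache sketch; the least recently used entry is evicted first."""

    def __init__(self, capacity: int) -> None:
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        self._items: "OrderedDict[str, Any]" = OrderedDict()

    def get(self, key: str) -> Optional[Any]:
        if key not in self._items:
            return None
        # A read counts as a use: move the key to the most-recent end.
        self._items.move_to_end(key)
        return self._items[key]

    def set(self, key: str, value: Any) -> Optional[Tuple[str, Any]]:
        if key in self._items:
            self._items.move_to_end(key)
        self._items[key] = value
        if len(self._items) > self.capacity:
            # Evict from the least-recent end and report what was dropped.
            return self._items.popitem(last=False)
        return None
```

Returning the evicted pair lets a caller decide what to do with it, for example writing it to persistent storage before dropping it from memory.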




Cluster management
  A single node is elected as cluster leader
      Runs efficiently only in a single-cluster environment
      High load on the leader at large scale
  Rebalancing?
      Permanent failure vs temporary failure?
  “Node capacity-aware” load balancing?
  Heartbeat frequency should be well configured
      Depending on cluster size and network type
  Efficiency of the leader election algorithm?
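
To make the heartbeat point concrete, here is a minimal timeout-based failure detector. It is a sketch under assumptions: `HeartbeatMonitor`, the `interval_s` and `missed_allowed` parameters, and the "suspect after N missed beats" rule are illustrative and not taken from Membase.

```python
from typing import Dict, List


class HeartbeatMonitor:
    """Hypothetical detector: a node is suspected after several missed heartbeats."""

    def __init__(self, interval_s: float, missed_allowed: int = 3) -> None:
        # The timeout scales with the heartbeat interval, so tuning the
        # interval for cluster size and network type also tunes detection time.
        self.timeout_s = interval_s * missed_allowed
        self._last_seen: Dict[str, float] = {}

    def record(self, node: str, now: float) -> None:
        # Called whenever a heartbeat from the node arrives at time `now`.
        self._last_seen[node] = now

    def suspected(self, now: float) -> List[str]:
        # Nodes whose last heartbeat is older than the timeout.
        return sorted(node for node, seen in self._last_seen.items()
                      if now - seen > self.timeout_s)
```

A short interval detects failures quickly but adds traffic and risks false suspicions on a slow network; a long interval does the opposite. Timestamps are passed in explicitly so the behaviour can be checked without real clocks.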



Conclusion
  Pros
      In production at many companies
      Well-known API
  Cons
      Not so well documented
      Maybe better documented in the source code?
      Some key techniques need to be clarified
  One size fits all has come and gone: Design patterns
      Application-aware
      Infrastructure-aware
      Human resource-aware

Thank you!





Benefits Of Flutter Compared To Other FrameworksBenefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other Frameworks
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 

A zoom on membase vng

  • 1. A Zoom on Membase. Dedicated to VNG. Viet-Trung TRAN, ENS Cachan, INRIA/IRISA, France. www.trungtv.com, 19/06/11
  • 2. What’s Membase
      ◦ A key/value store: simple, fast, elastic
      ◦ Membase’s API is simple but not simpler: SET(key, value); Value = GET(key)
  • 3. Where’s Membase
      ◦ SQL database? No: no complex queries, no schema, no ACID
      ◦ NoSQL: non-relational, distributed and HORIZONTALLY scalable
      ◦ Key/value store: Dynamo, Membase, Voldemort, Riak, Redis, etc.
      ◦ Column-oriented store: BigTable, HBase, Cassandra, etc.
      ◦ Document store: MongoDB, CouchDB, Terrastore, etc.
      ◦ Array-oriented store: Pyramid, SciDB
  • 4. Why NoSQL
      ◦ For over 40 years, we mostly used RDBMSs: so good but so COMPLEX, and hard to SCALE
      ◦ 2005: “One size fits all”: an idea whose time has come and gone
      ◦ Called for a “scale OUT” design: cheap, easy
      ◦ Why Membase: Membase = so-called Memcached + persistent storage; Membase = a distributed caching system + persistent storage
  • 5. Why Membase
      ◦ Membase = so-called Memcached + persistent storage
      ◦ Membase = a distributed caching system + persistent storage
      ◦ Membase speaks the Memcached protocol
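To make slide 5's point about Membase speaking Memcached's language concrete, here is a minimal sketch of how a SET and a GET look on the wire in the memcached text protocol. The helper names `encode_set` and `encode_get` are hypothetical and exist only for this illustration; a real client library would also handle responses, errors, and binary-protocol variants.

```python
def encode_set(key: str, value: bytes, flags: int = 0, exptime: int = 0) -> bytes:
    """Build a memcached text-protocol 'set' request.

    Wire format: set <key> <flags> <exptime> <bytes>\r\n<data>\r\n
    """
    header = f"set {key} {flags} {exptime} {len(value)}\r\n".encode("ascii")
    return header + value + b"\r\n"


def encode_get(key: str) -> bytes:
    """Build a memcached text-protocol 'get' request: get <key>\r\n"""
    return f"get {key}\r\n".encode("ascii")


if __name__ == "__main__":
    print(encode_set("greeting", b"hello"))
    print(encode_get("greeting"))
```

Because the request format is the same, an application already written against Memcached could in principle point its existing client at a Membase node.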
  • 6. MEMBASE = SIMPLE, FAST, ELASTIC
      ◦ Simple: 2 primitives, GET(key) and SET(key, value)
      ◦ Fast: the routing cost per I/O is O(1); give me a key, I know exactly where to go
      ◦ Elastic: freely scale UP and DOWN; scale from 1 to thousands of machines; fault tolerance
  • 7. Membase deployment
  • 8. Data flow
      ◦ Map(key, vbucket)
      ◦ Map(vbucket, node)
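The two mappings on slide 8 can be sketched as a hash followed by a table lookup, which is also why routing costs O(1). This is a minimal sketch under stated assumptions: CRC32 stands in for whatever hash Membase actually uses, the vbucket count of 1024 is an assumed default, and `build_vbucket_map`, `key_to_vbucket`, and `locate` are hypothetical names.

```python
import zlib

NUM_VBUCKETS = 1024  # assumption: a fixed vbucket count chosen for this sketch


def build_vbucket_map(nodes, num_vbuckets=NUM_VBUCKETS):
    """Map(vbucket, node): assign vbuckets to nodes round-robin."""
    return [nodes[i % len(nodes)] for i in range(num_vbuckets)]


def key_to_vbucket(key, num_vbuckets=NUM_VBUCKETS):
    """Map(key, vbucket): hash the key into a fixed range of vbuckets."""
    return zlib.crc32(key.encode("utf-8")) % num_vbuckets


def locate(key, vbucket_map):
    """Return (vbucket, node) for a key with one hash and one list lookup."""
    vb = key_to_vbucket(key, len(vbucket_map))
    return vb, vbucket_map[vb]


if __name__ == "__main__":
    vmap = build_vbucket_map(["node-a", "node-b", "node-c"])
    print(locate("user:42", vmap))
```

The indirection is the design point: moving a vbucket to another node only changes one entry in the table, while every key keeps the same vbucket.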
  • 9. Data flow [cont’d]
      ◦ Internal data flow + replication scheme
  • 10. Membase arch
      ◦ Symmetric design: identical software on every node
      ◦ Data management
      ◦ Membership management
  • 11. Thoughts on Membase (personal view)
  • 12. Membase’s design choices
      ◦ CAP theorem: pick 2 out of 3 (consistency, availability, partition tolerance)
      ◦ Membase is CA
      ◦ Do we really need strong consistency?
  • 13. Strong consistency
      ◦ Pessimistic replication may be costly
      ◦ A write blocks until the data is completely replicated
      ◦ A single master node coordinates reads and writes
      ◦ Lower I/O performance under concurrency
      ◦ Synchronous replication scheme
      ◦ If one replica fails, the I/O fails
      ◦ Proposal: use different consistency models depending on the application
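The trade-off raised on slide 13 can be illustrated with a toy model that contrasts a synchronous write, which fails as soon as one replica is down, with an asynchronous write that acknowledges immediately and replicates later. This is only a sketch of the general idea, not Membase's replication code; `Node`, `ReplicaDown`, `write_sync`, `write_async`, and `drain` are hypothetical names.

```python
from collections import deque


class ReplicaDown(Exception):
    """Raised when a write reaches a node that is not reachable."""


class Node:
    def __init__(self, name, up=True):
        self.name = name
        self.up = up
        self.data = {}

    def apply(self, key, value):
        if not self.up:
            raise ReplicaDown(self.name)
        self.data[key] = value


def write_sync(master, replicas, key, value):
    """Synchronous: acknowledge only after every replica has applied the write.

    No rollback in this sketch: copies written before a failure stay in place.
    """
    master.apply(key, value)
    for replica in replicas:
        replica.apply(key, value)  # a down replica makes the whole write fail
    return True


def write_async(master, replicas, key, value, backlog):
    """Asynchronous: acknowledge after the master write, replicate later."""
    master.apply(key, value)
    for replica in replicas:
        backlog.append((replica, key, value))
    return True


def drain(backlog):
    """Deliver queued updates; keep those whose replica is still down."""
    remaining = deque()
    while backlog:
        replica, key, value = backlog.popleft()
        try:
            replica.apply(key, value)
        except ReplicaDown:
            remaining.append((replica, key, value))
    backlog.extend(remaining)
```

The asynchronous path stays available when a replica is down, at the price of a window in which replicas lag behind the master, which is the kind of weaker consistency model the slide proposes choosing per application.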
  • 14. Data migration & replication
      ◦ LRU algorithm
      ◦ Is the replication factor configurable per (key, value)?
      ◦ vbucket
      ◦ Re-replication in case of failure?
      ◦ “Anti-entropy” replica synchronisation?
      ◦ Proposal: application-aware migration is the best
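Slide 14 names the LRU algorithm without further detail. As a reminder of what least-recently-used eviction does, here is a generic sketch; it is not taken from Membase, and the `LRUCache` class and its `evicted` list (which merely records what was pushed out) are hypothetical.

```python
from collections import OrderedDict


class LRUCache:
    """Keep at most `capacity` items; evict the least recently used one."""

    def __init__(self, capacity):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self.items = OrderedDict()  # oldest access first, newest last
        self.evicted = []           # keys pushed out, in eviction order

    def get(self, key):
        if key not in self.items:
            return None
        self.items.move_to_end(key)  # a read counts as a recent use
        return self.items[key]

    def set(self, key, value):
        if key in self.items:
            self.items.move_to_end(key)
        self.items[key] = value
        if len(self.items) > self.capacity:
            old_key, _ = self.items.popitem(last=False)  # drop the oldest
            self.evicted.append(old_key)
```

In a store that combines memory with persistent storage, the evicted entries would be the candidates to leave RAM while remaining on disk.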
  • 15. Cluster management
      ◦ One single node is elected as cluster leader
      ◦ Only runs efficiently in a single-cluster environment
      ◦ High load on the leader at large scale
      ◦ Rebalancing?
      ◦ Permanent failure vs temporary failure?
      ◦ “Node capacity-aware” load balancing?
      ◦ Heartbeat frequency should be well configured, depending on cluster size and network type
      ◦ Efficiency of the leader election algorithm?
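Slide 15's remark on heartbeat frequency can be made concrete with a toy failure detector: a node is suspected once it has been silent for longer than a number of heartbeat intervals. This is a sketch under assumptions, not Membase's mechanism; `HeartbeatMonitor`, `interval`, and `missed_allowed` are hypothetical names and parameters.

```python
class HeartbeatMonitor:
    """Suspect a node after it misses `missed_allowed` heartbeat intervals."""

    def __init__(self, interval, missed_allowed=3):
        self.timeout = interval * missed_allowed
        self.last_seen = {}

    def beat(self, node, now):
        """Record a heartbeat from `node` at time `now` (in seconds)."""
        self.last_seen[node] = now

    def suspected(self, now):
        """Return the nodes that have been silent for longer than the timeout."""
        return sorted(
            node for node, seen in self.last_seen.items()
            if now - seen > self.timeout
        )
```

A short interval detects failures quickly but adds traffic and risks false suspicion on a slow network; a long interval does the opposite, which is why the right setting depends on cluster size and network type.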
  • 16. Conclusion
      ◦ Pros: in production at many companies; well-known API
      ◦ Cons: not so well documented (may be better in the source code?); some key techniques should be clarified
      ◦ “One size fits all” has come and gone. Design patterns: application-aware, infrastructure-aware, human resource-aware
  • 17. Thank you!