
  1. NoSQL
      Graduate student: 鍾聖彥
      Advisor: Prof. 李永山
  2. NoSQL means "Not only SQL"
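  3. Introduction
      • The term NoSQL first appeared in 1998 as the name of a lightweight, open-source relational database developed by Carlo Strozzi that did not provide SQL functionality.
      • In 2009, Eric Evans of Rackspace reintroduced the term. This time NoSQL mainly referred to non-relational, distributed database designs that do not provide ACID guarantees. (The name was chosen for a meetup.)
      • NoSQL databases are generally designed around the needs of early 21st-century websites.
      • They focus on very large volumes of data running on clusters.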
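  4. NoSQL is a term that was coined by accident
  5. Common characteristics of NoSQL databases
      • Do not use the relational model
      • Run well on clusters
      • Open source
      • Built for 21st-century websites
      • Schemaless
      • Do not use SQL as the query language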
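  6. Why use NoSQL? (1/2)
      • Managing large-scale data: NoSQL databases can easily handle high volumes of read/write cycles, large numbers of users, and petabytes of data. (1 petabyte = 2^50 bytes = 1024 terabytes, or roughly a million gigabytes.)
      • No database schema required: when it comes to constructing a schema, they offer a wide range of choices and map easily onto objects.
      • Developer friendliness: NoSQL databases provide simple APIs for the major programming languages, so a complex ORM framework is no longer needed. If no API is available for a particular language, data can still be accessed over HTTP through a simple RESTful API, using XML or JSON.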
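The petabyte figures on slide 6 can be checked with a few lines of arithmetic. This is only a sketch, and it assumes the binary (power-of-two) definition of the units that the slide itself uses; the constant names are made up for illustration.

```python
# Binary unit sizes, matching the slide's "2^50 bytes" definition of a petabyte.
BYTES_PER_GIGABYTE = 2 ** 30
BYTES_PER_TERABYTE = 2 ** 40
BYTES_PER_PETABYTE = 2 ** 50

# How many terabytes and gigabytes fit in one petabyte.
terabytes_per_petabyte = BYTES_PER_PETABYTE // BYTES_PER_TERABYTE
gigabytes_per_petabyte = BYTES_PER_PETABYTE // BYTES_PER_GIGABYTE

print(terabytes_per_petabyte)   # 1024
print(gigabytes_per_petabyte)   # 1048576, i.e. roughly a million
```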
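  7. Why use NoSQL? (2/2)
      • Availability: most distributed NoSQL databases provide simple data replication, so the failure of a single node is less likely to affect data availability.
      • Scalability: NoSQL databases do not need dedicated high-performance servers. They run easily on clusters of commodity hardware.
      • Low latency
      • Relational features are not fully supported: no join operations (except within partitions) and no referential integrity constraints across partitions.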
  8. How did we get here?
      • Explosion of social media sites (Google, Facebook, Twitter) with large data needs
      • Rise of cloud-based solutions such as Amazon S3 (Simple Storage Service)
      • Open-source community
  9. Who were the influences?
  10. Dynamo and BigTable
      Three major papers were the seeds of the NoSQL movement:
      • BigTable (Google)
      • Dynamo (Amazon)
      • CAP Theorem
  11. CAP Theorem
      • Also known as Brewer's theorem
      • Brewer's CAP "Theorem": for any system sharing data, it is impossible to guarantee all three of these properties simultaneously:
        • Consistency: all nodes see the same data at the same time
        • Availability: a guarantee that every request receives a response about whether it was successful or failed
        • Partition tolerance: the system continues to operate despite arbitrary message loss or failure of part of the system
  12. CAP Theorem
      • Very large systems will partition at some point, so it is necessary to decide between C and A
      • Traditional DBMS prefer C over A and P
      • Most Web applications choose A (except in specific applications such as order processing)
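The choice between C and A during a partition can be illustrated with a toy model. The sketch below is not taken from the slides: `TwoReplicaStore`, `Unavailable`, and the `"CP"`/`"AP"` mode names are hypothetical, and a real system would detect partitions itself rather than have a flag set by hand.

```python
class Unavailable(Exception):
    """Raised when a consistency-first store refuses a request during a partition."""


class TwoReplicaStore:
    """Toy store holding a single value on two replicas (illustrative only)."""

    def __init__(self, mode):
        if mode not in ("CP", "AP"):
            raise ValueError("mode must be 'CP' or 'AP'")
        self.mode = mode
        self.replicas = [None, None]
        self.partitioned = False  # set to True to simulate a network partition

    def write(self, replica, value):
        if not self.partitioned:
            # Healthy network: both replicas receive the write.
            self.replicas = [value, value]
        elif self.mode == "CP":
            # Consistency first: refuse rather than let the replicas diverge.
            raise Unavailable("cannot reach the other replica")
        else:
            # Availability first: accept locally, so the replicas may now disagree.
            self.replicas[replica] = value

    def read(self, replica):
        if self.partitioned and self.mode == "CP":
            raise Unavailable("cannot confirm the latest value")
        return self.replicas[replica]


ap_store = TwoReplicaStore("AP")
ap_store.write(0, "v1")
ap_store.partitioned = True
ap_store.write(0, "v2")                      # accepted despite the partition
print(ap_store.read(0), ap_store.read(1))    # the two replicas now disagree
```

In this model the AP store keeps answering but serves different values from different replicas, while a CP store raises `Unavailable` until the partition heals.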
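  13. NoSQL databases
  14. NoSQL data models (1/2)
      • Key-value stores: generally contain just a set of global key-value pairs, with each value associated with a unique key.
      • Document stores: well-known implementations of this model include MongoDB, CouchDB, and RavenDB.
      • Column-oriented: after Google published its research paper on BigTable, the distributed storage system it uses internally, other well-known column-oriented implementations appeared, including Hadoop HBase, Apache Cassandra, and HyperTable.
      • Graph: suited to recording any data with complex relationships, such as social networks, product preferences, or rules of any kind. Example: FlockDB, which Twitter uses to implement its user follow graph.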
  15. NoSQL data models (2/2)
      • Hierarchical: these databases store data in hierarchical form, that is, as a tree or parent-child relationship. Well-known implementations include Microsoft's Windows Registry and IBM's IMS database.
      • Triple stores: these save data in subject-predicate-object form, with the predicate linking the subject and the object. They support the Semantic Web and RDF storage.
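The subject-predicate-object model from slide 15 can be sketched in a few lines. This is a minimal in-memory illustration, not how any particular RDF store works; the `TripleStore` class, its `match` method, and the sample facts are assumptions made for the example.

```python
class TripleStore:
    """Minimal in-memory triple store (illustrative only)."""

    def __init__(self):
        self._triples = set()

    def add(self, subject, predicate, obj):
        self._triples.add((subject, predicate, obj))

    def match(self, subject=None, predicate=None, obj=None):
        """Return triples matching the pattern; None acts as a wildcard."""
        return sorted(
            t for t in self._triples
            if (subject is None or t[0] == subject)
            and (predicate is None or t[1] == predicate)
            and (obj is None or t[2] == obj)
        )


store = TripleStore()
store.add("Riak", "is_a", "key-value store")
store.add("Riak", "based_on", "Dynamo")
store.add("MongoDB", "is_a", "document store")

# Everything known about Riak, then every "is_a" fact.
riak_facts = store.match(subject="Riak")
kinds = store.match(predicate="is_a")
```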
  16. Complexity
  17. Key-value Stores
      Extremely simple interface
      • Data model: (key, value) pairs
      • Operations: Insert(key, value), Fetch(key), Update(key), Delete(key)
      Implementation: efficiency, scalability, fault-tolerance
      • Records distributed to nodes based on key
      • Replication
      • Single-record transactions, "eventual consistency"
      • Riak: based on Amazon's Dynamo
      Example systems
      • Google BigTable, Amazon Dynamo, Cassandra, Voldemort, HBase, …
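The four operations on slide 17, together with "records distributed to nodes based on key" and "replication", can be sketched as follows. This is a toy single-process model under several assumptions: `KeyValueCluster` and its parameters are hypothetical, nodes are plain dictionaries, keys are strings, placement uses a simple hash modulo the node count (not the consistent hashing that Dynamo-style systems use), and `update` takes the new value as a second argument, which the slide does not show.

```python
import hashlib


class KeyValueCluster:
    """Toy key-value store spread over several in-memory 'nodes' (illustrative only)."""

    def __init__(self, node_count=3, replicas=2):
        if not 1 <= replicas <= node_count:
            raise ValueError("replicas must be between 1 and node_count")
        self.nodes = [dict() for _ in range(node_count)]
        self.replicas = replicas

    def _owners(self, key):
        """Pick the nodes responsible for a key from a stable hash of the key."""
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        first = int.from_bytes(digest[:8], "big") % len(self.nodes)
        return [(first + i) % len(self.nodes) for i in range(self.replicas)]

    def insert(self, key, value):
        for index in self._owners(key):
            self.nodes[index][key] = value

    def fetch(self, key):
        for index in self._owners(key):
            if key in self.nodes[index]:
                return self.nodes[index][key]
        raise KeyError(key)

    def update(self, key, value):
        self.fetch(key)          # raises KeyError if the key does not exist
        self.insert(key, value)

    def delete(self, key):
        for index in self._owners(key):
            self.nodes[index].pop(key, None)


cluster = KeyValueCluster(node_count=3, replicas=2)
cluster.insert("cart:42", ["book", "pen"])
cluster.update("cart:42", ["book", "pen", "lamp"])
print(cluster.fetch("cart:42"))
```

Every operation touches only the nodes that own the key, which is why lookups by key are cheap and queries by value are not.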
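  18. Key-value Stores
      • Riak: based on Amazon's Dynamo
  19. Suitable or not suitable
      • Suitable use cases: storing web session information; user preference settings; shopping cart data
      • Unsuitable use cases: retrieving relationships between different pieces of data; operations spanning multiple keys; querying by data (value) rather than by key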
  20. Document stores
      Like key-value stores, except the value is a document
      • Data model: (key, document) pairs
      • Document: JSON, XML, other semistructured formats
      • Basic operations: Insert(key, document), Fetch(key), Update(key), Delete(key)
      • Also Fetch based on document contents
      Example systems
      • CouchDB, MongoDB, SimpleDB, …
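What separates a document store from a key-value store on slide 20 is the ability to fetch by document contents. The sketch below shows that idea with documents held as Python dictionaries; `DocumentStore` and `find` are hypothetical names, matching is exact equality on top-level fields only, and no real product's query API is being reproduced.

```python
import copy


class DocumentStore:
    """Toy document store: key-to-document mapping plus content queries (illustrative only)."""

    def __init__(self):
        self._documents = {}

    def insert(self, key, document):
        # Store a copy so later changes by the caller do not leak into the store.
        self._documents[key] = copy.deepcopy(document)

    def fetch(self, key):
        return copy.deepcopy(self._documents[key])

    def delete(self, key):
        del self._documents[key]

    def find(self, **criteria):
        """Return the keys of documents whose top-level fields equal all criteria."""
        return sorted(
            key for key, doc in self._documents.items()
            if all(doc.get(field) == value for field, value in criteria.items())
        )


posts = DocumentStore()
posts.insert("post:1", {"author": "mary", "tags": ["nosql"], "status": "published"})
posts.insert("post:2", {"author": "john", "status": "draft"})
posts.insert("post:3", {"author": "mary", "status": "draft"})

mary_drafts = posts.find(author="mary", status="draft")
```

Note that the three documents do not share the same fields, which is the schemaless property mentioned earlier in the deck.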
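  21. Document stores
  22. Suitable or not suitable
      • Suitable use cases: event logging; content management systems and blogs; web analytics and real-time data analysis
      • Unsuitable use cases: workloads involving many kinds of complex transactions; queries across differing cluster structures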
  23. Column-oriented
      Based on JSON format: a data model which supports lists, maps, dates, and Booleans, with nesting
      Really: indexed semistructured documents
      Example (Mongo):
        {
          Name: "Jaroslav",
          Address: "Malostranske nám. 25, 118 00 Praha 1",
          Grandchildren: { Claire: "7", Barbara: "6", Magda: "3", Kirsten: "1", Otis: "3", Richard: "1" }
        }
  24. Column-oriented
  25. Column-oriented
      Row oriented
        Id  username  email         Department
        1   John      john@foo.com  Sales
        2   Mary      mary@foo.com  Marketing
        3   Yoda      yoda@foo.com  IT
      Column oriented
        Id  Username  email         Department
        1   John      john@foo.com  Sales
        2   Mary      mary@foo.com  Marketing
        3   Yoda      yoda@foo.com  IT
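The difference between the two layouts on slide 25 is how the same table is grouped in storage. The sketch below uses the slide's own data: the row layout keeps each record together, while the column layout keeps all values of one field together. It is a generic illustration of row versus column storage, with the helper name `to_columns` invented for the example, and it does not model BigTable-style column families.

```python
# Row-oriented layout: one complete record per entry.
rows = [
    {"Id": 1, "username": "John", "email": "john@foo.com", "Department": "Sales"},
    {"Id": 2, "username": "Mary", "email": "mary@foo.com", "Department": "Marketing"},
    {"Id": 3, "username": "Yoda", "email": "yoda@foo.com", "Department": "IT"},
]


def to_columns(records):
    """Regroup records so that all values of one field are stored together."""
    columns = {}
    for record in records:
        for name, value in record.items():
            columns.setdefault(name, []).append(value)
    return columns


columns = to_columns(rows)

# Reading one field for every record touches a single list in the column layout...
departments = columns["Department"]

# ...whereas reading one whole record is a single lookup in the row layout.
second_record = rows[1]
```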
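  26. Suitable or not suitable
      • Suitable use cases: event logging; content management systems and blogging platforms; expiring usage (for example, ad pushes)
      • Unsuitable use cases: systems that need ACID transactions for reads or writes; early stages of development; changing queries (cost)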
  27. Graph stores
      • Data model: nodes and edges
      • Nodes may have properties (including ID)
      • Edges may have labels or roles
      • Interfaces and query languages vary
      • Example systems: Neo4j, FlockDB, Pregel, …
      • RDF "triple stores" can map to graph databases
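The graph data model on slide 27 (nodes with properties, edges with labels) can be sketched with plain Python structures. `Graph`, `add_node`, `add_edge`, and `neighbors` are hypothetical names for this example, and the traversal is a linear scan rather than the indexed lookups a real graph database would use.

```python
class Graph:
    """Toy property graph: nodes carry properties, edges carry a label (illustrative only)."""

    def __init__(self):
        self.nodes = {}   # node id -> properties (including the id itself)
        self.edges = []   # (source id, label, target id)

    def add_node(self, node_id, **properties):
        self.nodes[node_id] = {"id": node_id, **properties}

    def add_edge(self, source, label, target):
        if source not in self.nodes or target not in self.nodes:
            raise KeyError("both endpoints must exist before adding an edge")
        self.edges.append((source, label, target))

    def neighbors(self, node_id, label):
        """Follow outgoing edges with the given label from one node."""
        return [t for s, edge_label, t in self.edges
                if s == node_id and edge_label == label]


social = Graph()
social.add_node("alice", name="Alice")
social.add_node("bob", name="Bob")
social.add_node("carol", name="Carol")
social.add_edge("alice", "follows", "bob")
social.add_edge("alice", "follows", "carol")
social.add_edge("bob", "follows", "alice")

alice_follows = social.neighbors("alice", "follows")
```

A follow graph of this shape is the kind of data the deck associates with FlockDB at Twitter.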
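  28. Graph stores
  29. Graph stores
  30. Suitable or not suitable
      • Suitable use cases: social network publishing; routing, dispatch, and location-based services; recommendation
      • Unsuitable use cases: updating all entities or a subset of entities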
  31. Conclusion and Discussion
      NoSQL databases cover only a part of data-intensive cloud applications (mainly Web applications).
      Problems with cloud computing:
      • SaaS applications require enterprise-level functionality, including ACID transactions, security, and other features associated with commercial RDBMS technology, i.e. NoSQL should not be the only option in the cloud.
  32. Conclusion and Discussion
      Hybrid solutions:
      • Voldemort with MySQL as one of its storage backends
      • Dealing with NoSQL data as semistructured data
      • Integrating RDBMS and NoSQL via SQL/XML
  33. References
      • http://www.ithome.com.tw/itadm/article.php?c=63360&s=5
      • http://www.openfoundry.org/index.php?option=com_content&task=view&id=9040&Itemid=4;isletter=1
      • http://www.julianbrowne.com/article/viewer/brewers-cap-theorem
      • http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.20.1495
      • Book: 搞懂NoSQL的15堂課 (Pramod J. Sadalage, Martin Fowler)
  34. End
