  1. NoSQL
     Graduate student: 鍾聖彥
     Advisor: Prof. 李永山
  2. NoSQL means "Not Only SQL"
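  3. Introduction
     • The term NoSQL first appeared in 1998 as the name of a lightweight, open-source relational database developed by Carlo Strozzi that did not provide an SQL interface.
     • In 2009, Eric Evans of Rackspace reintroduced the term. By then NoSQL referred mainly to non-relational, distributed database designs that do not provide ACID guarantees. (The name was chosen for a meetup.)
     • NoSQL databases are generally designed around the needs of early-21st-century websites.
     • They focus on very large data sets running on clusters.
  4. NoSQL is a term that was coined by accident
  5. Common characteristics of NoSQL databases
     • Do not use the relational model
     • Run well on clusters
     • Open source
     • Built for 21st-century websites
     • Schemaless
     • Do not use SQL as the query language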
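  6. Why use NoSQL? (1/2)
     • Managing large-scale data: NoSQL databases can readily handle heavy read/write loads, large numbers of users, and petabytes of data. (1 petabyte = 2^50 bytes = 1024 terabytes, or about a million gigabytes.)
     • No database schema required: NoSQL databases leave wide latitude in how data is structured, and records map easily to objects.
     • Developer friendliness: NoSQL databases provide simple APIs for the major programming languages, so a complex ORM framework is no longer needed. If no API is available for a particular language, the data can still be accessed over HTTP through a simple RESTful API, using XML or JSON.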
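The unit arithmetic on this slide can be checked in a few lines. This is a minimal sketch that assumes binary (power-of-two) units throughout; the constant names are illustrative only.

```python
# Binary units (assumption: "petabyte" on the slide means 2**50 bytes).
GIGABYTE = 2**30
TERABYTE = 2**40
PETABYTE = 2**50

# How many terabytes and gigabytes fit in one petabyte.
terabytes_per_petabyte = PETABYTE // TERABYTE
gigabytes_per_petabyte = PETABYTE // GIGABYTE

print(terabytes_per_petabyte)   # 1024
print(gigabytes_per_petabyte)   # 1048576, roughly a million
```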
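  7. Why use NoSQL? (2/2)
     • Availability: most distributed NoSQL databases provide simple data replication, so the failure of a single node is less likely to affect data availability.
     • Scalability: NoSQL databases do not need dedicated high-performance servers. They run easily on clusters built from commodity hardware.
     • Low latency
     NoSQL databases do not fully support relational features: there are no join operations (except within partitions) and no referential integrity constraints across partitions.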
  8. How did we get here?
     • Explosion of large web and social media sites (Google, Facebook, Twitter) with large data needs
     • Rise of cloud-based solutions such as Amazon S3 (Simple Storage Service)
     • Open-source community
  9. Who influenced it?
  10. Dynamo and BigTable
     Three major papers were the seeds of the NoSQL movement:
     • BigTable (Google)
     • Dynamo (Amazon)
     • CAP Theorem
  11. CAP Theorem
     • Also known as Brewer's theorem
     • Brewer's CAP "Theorem": for any system sharing data, it is impossible to guarantee all three of these properties simultaneously:
       • Consistency: all nodes see the same data at the same time
       • Availability: a guarantee that every request receives a response indicating whether it succeeded or failed
       • Partition tolerance: the system continues to operate despite arbitrary message loss or failure of part of the system
  12. CAP Theorem
     • Very large systems will partition at some point, so it is necessary to decide between C and A
     • Traditional DBMSs prefer C over A and P
     • Most Web applications choose A (except in specific applications such as order processing)
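The choice between C and A during a partition can be illustrated with a toy model. The sketch below is not taken from any real database: `Replica`, `TinyCluster`, and the `prefer` flag are hypothetical names, and the "network" is just a boolean. With `prefer="C"` a write is refused while the replicas cannot reach each other; with `prefer="A"` the write is accepted locally and the replicas diverge.

```python
class Replica:
    """One node holding its own copy of the data."""
    def __init__(self):
        self.data = {}


class TinyCluster:
    """Two replicas with a switchable network partition (toy model only)."""
    def __init__(self, prefer):
        if prefer not in ("C", "A"):
            raise ValueError("prefer must be 'C' or 'A'")
        self.prefer = prefer
        self.replicas = [Replica(), Replica()]
        self.partitioned = False

    def write(self, replica_index, key, value):
        if not self.partitioned:
            # Healthy network: every replica receives the write.
            for replica in self.replicas:
                replica.data[key] = value
            return True
        if self.prefer == "C":
            # Consistency first: refuse the write, so the system is unavailable.
            return False
        # Availability first: accept locally; the replicas now disagree.
        self.replicas[replica_index].data[key] = value
        return True

    def read(self, replica_index, key):
        return self.replicas[replica_index].data.get(key)


cp = TinyCluster(prefer="C")
ap = TinyCluster(prefer="A")
for cluster in (cp, ap):
    cluster.write(0, "x", 1)      # replicated to both nodes
    cluster.partitioned = True    # the network splits

cp_ok = cp.write(0, "x", 2)       # rejected, replicas stay identical
ap_ok = ap.write(0, "x", 2)       # accepted, replicas diverge
print(cp_ok, cp.read(0, "x"), cp.read(1, "x"))
print(ap_ok, ap.read(0, "x"), ap.read(1, "x"))
```

A real system would also need a way to reconcile the diverged replicas once the partition heals, which this sketch leaves out.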
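  13. NoSQL databases
  14. NoSQL data models (1/2)
     • Key-value stores: generally contain only a set of global key-value pairs, with each value associated with a unique key.
     • Document stores: well-known implementations of this model include MongoDB, CouchDB, and RavenDB.
     • Column-oriented: this model followed Google's research paper on BigTable, the distributed storage system it uses internally. Other well-known column-oriented implementations include Hadoop HBase, Apache Cassandra, and HyperTable.
     • Graph: suited to recording any data with complex relationships, such as social networks, product preferences, or rules. Example: FlockDB, which Twitter uses to implement its user follow graph.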
  15. NoSQL data models (2/2)
     • Hierarchical: these databases store data in hierarchical form, that is, as a tree or parent-child relationship. Well-known implementations include Microsoft's Windows Registry and IBM's IMS database.
     • Triple stores: these save data in subject-predicate-object form, with the predicate linking the subject to the object. They support the Semantic Web and RDF storage.
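The subject-predicate-object form can be sketched with plain tuples. This is an illustrative in-memory model, not the API of any RDF library; the `match` helper and the sample data are made up for the example.

```python
# Each fact is one (subject, predicate, object) tuple.
triples = {
    ("alice", "knows", "bob"),
    ("bob", "knows", "carol"),
    ("alice", "worksFor", "acme"),
}


def match(store, subject=None, predicate=None, obj=None):
    """Return the triples matching a pattern; None acts as a wildcard."""
    return sorted(
        t for t in store
        if (subject is None or t[0] == subject)
        and (predicate is None or t[1] == predicate)
        and (obj is None or t[2] == obj)
    )


about_alice = match(triples, subject="alice")
knows_carol = match(triples, predicate="knows", obj="carol")
print(about_alice)
print(knows_carol)
```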
  16. Complexity
  17. Key-value Stores
     Extremely simple interface
     • Data model: (key, value) pairs
     • Operations: Insert(key, value), Fetch(key), Update(key), Delete(key)
     Implementation: efficiency, scalability, fault-tolerance
     • Records distributed to nodes based on key
     • Replication
     • Single-record transactions, "eventual consistency"
     Example systems
     • Google BigTable, Amazon Dynamo, Cassandra, Voldemort, HBase, …
     • Riak: based on Amazon's Dynamo
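The four operations and the idea of distributing records to nodes by key can be sketched as follows. This is a single-process illustration under stated assumptions: the class name, the choice of SHA-256 modulo the node count for placement, and passing the new value to `update` are all choices made for this example, and replication is left out.

```python
import hashlib


class KeyValueStore:
    """Toy key-value store; each 'node' is just a dict in this process."""

    def __init__(self, node_count=3):
        self.nodes = [{} for _ in range(node_count)]

    def _node_for(self, key):
        # Place the record by hashing the key (assumed partitioning scheme).
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        index = int.from_bytes(digest[:8], "big") % len(self.nodes)
        return self.nodes[index]

    def insert(self, key, value):
        node = self._node_for(key)
        if key in node:
            raise KeyError(f"key already exists: {key!r}")
        node[key] = value

    def fetch(self, key):
        # The value is opaque to the store; lookup is by key only.
        return self._node_for(key).get(key)

    def update(self, key, value):
        node = self._node_for(key)
        if key not in node:
            raise KeyError(f"no such key: {key!r}")
        node[key] = value

    def delete(self, key):
        self._node_for(key).pop(key, None)


store = KeyValueStore()
store.insert("cart:42", ["book", "pen"])
store.update("cart:42", ["book"])
print(store.fetch("cart:42"))
```

Because every operation takes a single key, each request touches exactly one node, which is what makes this model easy to spread across a cluster.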
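  18. Key-value Stores
     • Riak: based on Amazon's Dynamo
  19. Suitable or not suitable
     Suitable use cases:
     • Storing web session information
     • User preference settings
     • Shopping cart data
     Unsuitable use cases:
     • Retrieving relationships among different data
     • Operations on multiple keys
     • Querying by data (value) rather than by key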
  20. Document stores
     Like key-value stores, except that the value is a document
     • Data model: (key, document) pairs
     • Document: JSON, XML, other semistructured formats
     • Basic operations: Insert(key, document), Fetch(key), Update(key), Delete(key)
     • Also Fetch based on document contents
     Example systems
     • CouchDB, MongoDB, SimpleDB, …
  21. Document stores
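What separates this model from a key-value store is the ability to fetch by document contents. A minimal sketch, assuming documents are Python dicts and that a query matches top-level fields by equality; `DocumentStore` and `find` are hypothetical names, not the API of CouchDB or MongoDB.

```python
class DocumentStore:
    """Toy document store: key -> dict, plus lookup by field values."""

    def __init__(self):
        self._docs = {}

    def insert(self, key, document):
        self._docs[key] = dict(document)

    def fetch(self, key):
        return self._docs.get(key)

    def update(self, key, **changes):
        self._docs[key].update(changes)

    def delete(self, key):
        self._docs.pop(key, None)

    def find(self, **criteria):
        """Return keys of documents whose top-level fields equal all criteria."""
        return sorted(
            key for key, doc in self._docs.items()
            if all(field in doc and doc[field] == value
                   for field, value in criteria.items())
        )


store = DocumentStore()
# Documents need not share the same fields (no fixed schema).
store.insert("post:1", {"author": "ann", "title": "Intro", "tags": ["nosql"]})
store.insert("post:2", {"author": "bob", "title": "Graphs"})
store.insert("post:3", {"author": "ann", "title": "CAP"})

ann_posts = store.find(author="ann")
print(ann_posts)
```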
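  22. Suitable or not suitable
     Suitable use cases:
     • Event logging
     • Content management systems, blogs
     • Web analytics, real-time data analytics
     Unsuitable use cases:
     • Complex transactions spanning multiple operations
     • Queries against varying aggregate structures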
  23. Column-oriented
     Based on JSON format: a data model which supports lists, maps, dates, and Booleans, with nesting
     Really: indexed semistructured documents
     Example (Mongo):
       {
         Name: "Jaroslav",
         Address: "Malostranske nám. 25, 118 00 Praha 1",
         Grandchildren: {Claire: "7", Barbara: "6", Magda: "3", Kirsten: "1", Otis: "3", Richard: "1"}
       }
  24. Column-oriented
  25. Column-oriented
     Row oriented:
       Id  username  email         Department
       1   John      john@foo.com  Sales
       2   Mary      mary@foo.com  Marketing
       3   Yoda      yoda@foo.com  IT
     Column oriented:
       Id  username  email         Department
       1   John      john@foo.com  Sales
       2   Mary      mary@foo.com  Marketing
       3   Yoda      yoda@foo.com  IT
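The two tables hold the same data; what differs is how it is laid out in storage. A minimal sketch of the two layouts using the rows from the slide, with plain Python structures standing in for the on-disk format (an assumption made for illustration):

```python
# Row-oriented: each record is stored together.
rows = [
    {"id": 1, "username": "John", "email": "john@foo.com", "department": "Sales"},
    {"id": 2, "username": "Mary", "email": "mary@foo.com", "department": "Marketing"},
    {"id": 3, "username": "Yoda", "email": "yoda@foo.com", "department": "IT"},
]

# Column-oriented: all values of one attribute are stored together.
columns = {field: [row[field] for row in rows] for field in rows[0]}

# Reading one attribute for every record touches a single list.
departments = columns["department"]

# Rebuilding one full record means visiting every column at the same position.
mary = {field: values[1] for field, values in columns.items()}

print(departments)
print(mary)
```

Scanning one attribute across many records is cheap in the column layout, while fetching a whole record is cheaper in the row layout.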
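  26. Suitable or not suitable
     Suitable use cases:
     • Event logging
     • Content management systems, blogging platforms
     • Expiring usage (for example, ad pushes)
     Unsuitable use cases:
     • Systems that need ACID transactions for reads or writes
     • Early stages of development
     • Changing query patterns (costly)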
  27. Graph stores
     • Data model: nodes and edges
     • Nodes may have properties (including ID)
     • Edges may have labels or roles
     • Interfaces and query languages vary
     • Example systems: Neo4j, FlockDB, Pregel, …
     • RDF "triple stores" can map to graph databases
  28. Graph stores
  29. Graph stores
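The node-and-edge model can be sketched with two dictionaries. This is an in-memory illustration, not the interface of Neo4j or FlockDB; the `Graph` class, the `follows` label, and the `suggestions` helper are hypothetical.

```python
class Graph:
    """Toy property graph: nodes carry properties, edges carry a label."""

    def __init__(self):
        self.nodes = {}   # node id -> properties
        self.edges = {}   # source id -> list of (label, target id)

    def add_node(self, node_id, **properties):
        self.nodes[node_id] = properties

    def add_edge(self, source, label, target):
        self.edges.setdefault(source, []).append((label, target))

    def neighbors(self, source, label):
        return [t for (l, t) in self.edges.get(source, []) if l == label]


def suggestions(graph, user):
    """Accounts followed by the user's followees but not yet by the user."""
    followed = set(graph.neighbors(user, "follows"))
    candidates = set()
    for followee in followed:
        candidates.update(graph.neighbors(followee, "follows"))
    return sorted(candidates - followed - {user})


g = Graph()
for name in ("alice", "bob", "carol", "dave"):
    g.add_node(name, kind="user")
g.add_edge("alice", "follows", "bob")
g.add_edge("bob", "follows", "carol")
g.add_edge("bob", "follows", "alice")
g.add_edge("carol", "follows", "dave")

alice_suggestions = suggestions(g, "alice")
print(alice_suggestions)
```

The two-hop traversal in `suggestions` is the kind of relationship query that the graph model makes direct.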
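  30. Suitable or not suitable
     Suitable use cases:
     • Social networking
     • Routing, dispatch, and location-based services
     • Recommendation
     Unsuitable use cases:
     • Updating all entities or a subset of entities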
  31. Conclusion and Discussion
     • NoSQL databases cover only a part of data-intensive cloud applications (mainly Web applications).
     • Problems with cloud computing: SaaS applications require enterprise-level functionality, including ACID transactions, security, and other features associated with commercial RDBMS technology; i.e., NoSQL should not be the only option in the cloud.
  32. Conclusion and Discussion
     Hybrid solutions:
     • Voldemort with MySQL as one of its storage backends
     • Dealing with NoSQL data as semistructured data
     • Integrating RDBMS and NoSQL via SQL/XML
  33. Reference
     • http://www.ithome.com.tw/itadm/article.php?c=63360&s=5
     • http://www.openfoundry.org/index.php?option=com_content&task=view&id=9040&Itemid=4;isletter=1
     • http://www.julianbrowne.com/article/viewer/brewers-cap-theorem
     • http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.20.1495
     • Book: 搞懂NoSQL的15堂課 (Pramod J. Sadalage, Martin Fowler)
  34. End
