NoSQL on the Social and Semantic Web



NoSQL initiative and its influences on social and semantic Web

Stefan Prutianu, Stefan Ceriu
Faculty of Computer Science, „Al. I. Cuza“ University, Iasi, Romania
{ stefan.prutianu, stefan.ceriu}

Abstract. In this paper we describe NoSQL, a series of non-relational database technologies and products developed to address the problems that current RDBMSs are facing: lack of true scalability, poor performance on high data volumes and low availability. Some of these products are already deployed in production and perform very well: Amazon's Dynamo, Google's Bigtable, Cassandra, etc. We also provide a view of how these systems influence application development in the social and semantic Web sphere.

Keywords: NoSQL, distributed computing, distributed non-relational database, semantic Web, social Web, scalability

1 Introduction

Modern relational database technologies tend to have serious problems when managing huge volumes of data (eBay, for instance, holds about 2 PB overall [2]); the main problems are scalability, performance and rigid schema design [1]. Vertical scaling (increasing the computational power of a single node) is only a temporary solution that lasts until the data grows beyond the storage limit again. Horizontal scaling in a traditional relational database management system (partitioning, sharding) means dividing the data into multiple databases along application-specific boundaries; but splitting the data across multiple servers breaks the relationships stored within the database (the most valuable property of a relational database), and it is also not transparent to the application's business logic.
Read slaves are a form of horizontal scaling used in RDBMSs (Relational Database Management Systems) in which read-only slave databases replicate the master database: every write is redirected to the master and every read to one of the read-slave replicas. This is still not true scaling, because the master remains a single point of failure. Large relational databases (multiple terabytes or petabytes in size) usually perform slowly on complex queries because of the amount of data they have to scan and because their design is disk-oriented, and disk operations are time consuming [3]. An RDBMS requires that the database schema (tables, columns, relationships) be designed before the data can be used, and in most cases such a schema will require changes
(adding new features, adjusting or fine-tuning existing ones), but changing the database schema is very hard in such systems (updating rows may lock them, and it is a very time-consuming operation) [12]. NoSQL is the common name for a set of new technologies, design practices and open-source projects which address the problems that large-scale distributed applications and platforms are facing: scalability, availability, performance and fault tolerance. The NoSQL trend is not intended to replace the relational database model; instead it proposes new solutions to problems that the traditional database model cannot solve. This paper is structured as follows. Section 2 describes the NoSQL trend in detail, with its proposed solutions and results, Section 3 presents how NoSQL has influenced application development in the social and semantic Web sphere, and Section 4 concludes our survey.

2 NoSQL

2.1 Overview

NoSQL proponents started to organize more seriously in early 2009, proposing distributed databases for systems where the relational features of an RDBMS are not needed. The inspiration came from the closed-source distributed databases already running inside some large corporations, such as Dynamo at Amazon and Bigtable at Google. These solutions, along with the open-source projects (Cassandra, Hypertable, HBase, Redis), share a number of characteristics: key-value storage, running on a large number of machines, and data partitioned and spread among those machines.
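The key-value interface these systems share can be made concrete with a minimal sketch (a plain in-memory dictionary stands in for the distributed store; all names here are illustrative, not any particular product's API):

```python
# Minimal sketch of the key-value interface shared by these systems.
# A plain dict stands in for the distributed store; names are illustrative.
class KeyValueStore:
    def __init__(self):
        self._data = {}

    def get(self, key):
        # Return the value stored at key, or None if absent.
        return self._data.get(key)

    def put(self, key, value):
        # Create or overwrite the value at key.
        self._data[key] = value

    def delete(self, key):
        # Remove the key and its associated value, if present.
        self._data.pop(key, None)
```

In a real system the dictionary would be partitioned and replicated across many machines, which is exactly what the techniques discussed in Section 2.2 address.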
Another common characteristic is that, in order to reach the desired level of scalability, availability, performance and fault (partition) tolerance, the data consistency requirement is relaxed. The reason is Eric Brewer's CAP theorem, which proves that in a distributed environment one cannot have Consistency, Availability and Partition tolerance at the same time [6], so most of these systems settle for a particular form of weak consistency named eventual consistency. Consistency means that a system operates fully or not at all; in a distributed environment, if an update is made to some node, all its replicas must be updated before any read from those replicas is performed. Consistency can be achieved with relational databases because they focus on the ACID (Atomicity, Consistency, Isolation, Durability) properties. Availability means that the system is always able to perform the requested tasks. Partition tolerance is the ability of a distributed system to keep working even when partitions form, that is, when one or more nodes are isolated from the others due to network/communication failures.
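The trade-off can be illustrated with a toy model (purely illustrative, not a real replication protocol): writes are accepted by a single replica and propagated later, so reads from other replicas may return stale data until a synchronization pass runs.

```python
# Toy illustration of relaxed consistency (not a real replication protocol).
# Writes are accepted by one replica; propagation to the others is deferred,
# so reads may return stale data until the pending updates are applied.
class EventuallyConsistentStore:
    def __init__(self, n_replicas=3):
        self.replicas = [{} for _ in range(n_replicas)]
        self.pending = []  # (key, value) updates not yet propagated

    def write(self, key, value):
        self.replicas[0][key] = value      # accepted by one replica
        self.pending.append((key, value))  # to be propagated later

    def read(self, replica_id, key):
        # May return a stale (or missing) value on replicas 1..n-1.
        return self.replicas[replica_id].get(key)

    def anti_entropy(self):
        # Background synchronization: once it runs, all replicas agree.
        for key, value in self.pending:
            for replica in self.replicas:
                replica[key] = value
        self.pending.clear()
```

The window between `write` and `anti_entropy` corresponds to the inconsistency window discussed later in Section 2.2.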
Eventual consistency is a specific form of weak consistency: if no new updates are made to an object, eventually all reads will return the updated value [7]. DNS (the Domain Name System) is a system that implements eventual consistency.

Dynamo. Amazon's Dynamo is a highly available key-value structured storage system [4]. It was developed to meet Amazon's needs for reliability and scaling. Access to data is provided through a primary-key interface (get(key), put(key) and overloads of these operations). Scalability and availability are achieved through a combination of techniques: consistent hashing for data partitioning and replication, object versioning for data consistency, a quorum technique with a decentralized replica synchronization protocol for consistency among replicas during updates, and a gossip-based protocol for failure detection and membership updates. Amazon's engineers motivated their design by the fact that most of the services their platform exposes store and retrieve data by a primary key, thus not requiring the complex querying and management functionality of an RDBMS or its maintenance cost; moreover, with traditional storage models availability would have been sacrificed in favor of consistency. Dynamo's components for request coordination, membership and failure detection, and the local persistence engine are all implemented in Java. The local persistence component has a pluggable design and uses engines such as BDB (Berkeley Database) Transactional Data Store, BDB Java Edition, MySQL, and an in-memory buffer with persistent backing storage. [4]

Bigtable. Google's Bigtable is a distributed storage system for managing structured data, designed to be highly scalable. The system has proven its efficiency in important Google applications: Personalized Search, Google Analytics, Google Earth, Google Finance.
Bigtable does not support a full relational data model; instead it provides clients with a simple data model indexed by rows, columns and timestamps. From the data model point of view, Bigtable is a sparse, distributed, persistent, multi-dimensional sorted map where each value in the map is an uninterpreted array of bytes. Row keys are arbitrary strings, data is maintained in lexicographic order by row key, and every read/write under a single row key is an atomic operation regardless of the number of columns involved. Columns are grouped into sets called column families, which usually contain information of the same type. Timestamps are introduced because each cell can contain multiple versions of the same data. The Bigtable API provides functions for creating and deleting tables and column families, reads and updates under a particular key, and other operations involving cluster management. A master model is used to manage load balancing and fault tolerance. For internal persistence Bigtable uses the SSTable (immutable sorted file of key-value pairs) file format in conjunction with GFS (the Google File System). [5]
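The data model described above can be sketched as a sparse sorted map (an illustrative sketch only, not the actual Bigtable API): cells are addressed by row, column and timestamp, each cell keeps multiple versions, and rows are kept in lexicographic order.

```python
# Sketch of a Bigtable-like data model (illustrative; not the real API):
# a sparse map from (row, column, timestamp) to an uninterpreted byte string,
# with row keys kept in lexicographic order.
class SparseSortedMap:
    def __init__(self):
        # cells[row][column] is a dict of timestamp -> value (multiple versions)
        self.cells = {}

    def put(self, row, column, timestamp, value):
        self.cells.setdefault(row, {}).setdefault(column, {})[timestamp] = value

    def get(self, row, column):
        # Return the most recent version of the cell, or None.
        versions = self.cells.get(row, {}).get(column)
        if not versions:
            return None
        return versions[max(versions)]

    def scan_rows(self):
        # Rows are maintained in lexicographic order by row key.
        return sorted(self.cells)
```

A scan over `scan_rows()` mirrors how Bigtable clients iterate rows in key order; the per-cell timestamp dictionary mirrors cell versioning.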
2.2 Design patterns [8], [11]

API Model. Because the underlying data model can be considered a large distributed hashtable (DHT), the basic API (Application Programming Interface) could be:
- get(key) – extracts the value at the given key.
- put(key, value) – creates or updates the value at the given key.
- delete(key) – removes the key and its associated value.

Machine Infrastructure. The infrastructure for this kind of system is composed of a large number of commodity-hardware machines connected through a network. Each machine (physical node) has the same software configuration, but the hardware characteristics may differ. Within each physical node a number of virtual nodes are running.

Partition Schemes. Most large-scale distributed systems use a consistent hashing technique because of its flexibility when the number of virtual nodes changes. When nodes are added or removed, keys and data need to be redistributed, and consistent hashing minimizes the amount of these changes. In consistent hashing the key space is finite: the output range of a hash function is treated as a fixed ring. Both virtual node ids and data item keys take values in this circular space, and the owner of a data item's key is the first virtual node encountered when walking the ring clockwise from that key. When a virtual node crashes, all the keys it owned are adopted by its clockwise neighbor, so the rest of the virtual nodes on the ring are not affected.

Data Replication. In order to achieve high availability and performance, the same data needs to be available on multiple nodes (replicas). In Dynamo [4] the list of nodes responsible for storing a particular key is called the preference list, and the size of this list is configured by a preset parameter.
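The consistent hashing scheme described under Partition Schemes can be sketched as follows (the MD5-based position function and the node names are arbitrary illustrative choices, not taken from any particular system):

```python
# Sketch of consistent hashing: node ids and data keys are hashed onto a
# fixed ring; a key's owner is the first node found walking clockwise.
import hashlib

def ring_position(name, ring_size=2**32):
    # Map an arbitrary string to a position on the ring (MD5 is illustrative).
    digest = hashlib.md5(name.encode()).hexdigest()
    return int(digest, 16) % ring_size

class ConsistentHashRing:
    def __init__(self, nodes):
        self.nodes = {ring_position(n): n for n in nodes}

    def owner(self, key):
        pos = ring_position(key)
        # First node position at or after the key, wrapping around the ring.
        positions = sorted(self.nodes)
        for p in positions:
            if p >= pos:
                return self.nodes[p]
        return self.nodes[positions[0]]

    def remove_node(self, node):
        # Only keys owned by the removed node move (to its clockwise
        # neighbor); every other key keeps its owner.
        self.nodes = {p: n for p, n in self.nodes.items() if n != node}
```

The test of the scheme is exactly the property claimed in the text: removing a node relocates only the keys that node owned.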
While read actions can be performed on any replica, update actions can lead to consistency issues because the updates need to be propagated to all the replicas.

Data Models. The basic data access method is to use a key in order to retrieve or update a value. The value can be a blob (binary large object) [4], a document, a column family (rows and columns, but each row can have as few or as many columns as desired) [5], a graph or a collection.

Storage Models. The most common strategy is to design this component in a pluggable fashion, where the storage mechanism can be a MySQL database, Berkeley DB, the filesystem (SSTables) or in-memory storage (memtables).

Consistency Management. The same data is available on multiple nodes at a given time, and the problem that arises is synchronizing these replicas in order to preserve a consistent view of the data from the client's perspective. In systems where availability and partition tolerance are important requirements, strict consistency cannot be achieved at the same time as the first two properties (CAP theorem), thus a form of
weak consistency, eventual consistency, is implemented in these systems. Various mechanisms guarantee that such systems eventually become consistent after a period of time (the inconsistency window) during which synchronization is performed.

Timestamps. Using the history of operations performed on a row of data, the value to which the row will eventually converge can be decided. The drawbacks of this method are that it requires synchronized clocks on the nodes, it does not capture causality, and a decision is hard to make when write operations happen simultaneously.

Vector clocks. A vector clock is a tuple {t1, t2, …, tn} of clock values, one from each node. When a write operation is performed on node i, it sets ti to its clock value. Given two vector clocks v1 and v2, v1 < v2 (v1[k] ≤ v2[k] for all k, with strict inequality for at least one k) implies a global time ordering of events. Replicas follow certain rules when updating their vector clocks:
- when an internal operation happens at replica i, it advances its own entry vi[i];
- when replica i sends a message to replica j, it attaches its vector clock to the message;
- when replica j receives a message from replica i, it advances its own entry vj[j] and then merges its clock with the one received in the message: vj[k] = max(vj[k], vi[k]).

Single Master Model. In this model each data partition has a master node and multiple slave nodes. Updates are directed to the master node and then propagate asynchronously to the slaves. Under this model a system can become unavailable if the master fails before any of the replicas has been updated.

Multi-Master Model. Intensive update requests on certain key ranges can leave the Single Master Model unable to spread the workload properly. The Multi-Master Model allows updates to be performed at any replica.

Quorum Based 2PC. 
Assuming there are N replicas of some data and a coordinator node: when an update is requested, the coordinator sends the request to all N replicas but waits for only W (W ≤ N) successful answers. The same happens for read actions: the coordinator sends the request to the N replicas but waits for only R (R ≤ N) successful responses, and among the answering nodes the value with the highest timestamp is selected. The protocol is flexible because, by configuring the W and R values, different levels of consistency can be achieved: W + R > N gives strict consistency, while W + R ≤ N relaxes the model to a weaker form of consistency.

Membership Management. Since nodes in a cluster may fail or recover, a technique is needed that allows nodes to learn about each other.

Omniscient Master. When nodes leave or join the cluster they communicate with a master node that holds the authoritative view of the cluster. This method is simple and provides a consistent view of the cluster status, but the master is still a single point of failure and the model is not partition tolerant.

Gossip. This is a method for propagating the cluster status to all members. At preset intervals each node selects another node with which to exchange its view of the cluster. Every node maintains a timestamp of the information about itself and the rest of the
cluster. This method is scalable and failure tolerant, but provides only eventual consistency about the cluster status.

2.3 Open-Source Projects

Dynamo [4] and Bigtable [5] were a great starting point for developing open-source, non-relational, distributed and horizontally scalable databases. The NoSQL movement began in early 2009 and quickly grew into a consistent list of free and competitive products providing most of the properties needed in distributed systems: schema-free data, replication support, an easy API, eventual consistency and performance. Below is a non-exhaustive list of current databases classified along three important characteristics: scalability, data and query model, and internal persistence model.

  System         Data Model     Query API          Persistence Model
  Cassandra      Column family  Thrift             Memtable/SSTable
  HBase          Column family  Thrift, REST       Memtable/SSTable on HDFS
  Riak           Document       Nested hashes      ?
  Scalaris       Key/value      get/put            in-memory only
  Voldemort      Key/value      get/put            BDB, MySQL
  CouchDB        Document       map/reduce views   append-only B-Tree
  MongoDB        Document       Cursor             B-Tree
  Neo4j          Graph          Graph              on-disk linked lists
  Redis          Collection     Collection         in-memory
  Tokyo Cabinet  Key/value      get/put            hash or B-Tree
  Chordless      Key/value      Java, simple RPC   ?
  InfoGrid       Graph          Java, http/REST    ?
  Sones          Graph          .Net               ?

(Voldemort's support for multiple datacenters is under development.)

Table 1. Classification by scalability, data and query model, and persistence model [1], [13]

This table summarizes the most important characteristics of a subset of the non-relational database systems currently available. The rest of this section describes some of these databases.

Cassandra. Development of this system started at Facebook, and one of its designers was a co-author of Dynamo. At the moment the project is open source and still under "heavy development" at The Apache Software Foundation. Its authors define it as a "structured storage system over a P2P network" [11]. The system combines the distributed architecture of Dynamo with the column family model of Bigtable. From the data model point of view, Cassandra is a multi-dimensional map indexed by a key, where each application creates its own key space. Besides column families, a new concept of super columns is introduced, representing lists of columns. Data is sorted at write time, and within a row columns are sorted by name. The partitioning subsystem is similar to Dynamo's: consistent hashing is used. The same concepts of coordinator node and preference list as in Dynamo are used for data replication. Cluster management uses a variant of the Gossip technique, Scuttlebutt anti-entropy Gossip. Internal persistence relies on the local file system, and the storage structure is similar to Bigtable's: SSTables, memtables, commit logs, compaction and Bloom filters. The system is written in Java, and high-level libraries are available for Ruby, Perl, Python and Scala. Facebook, Digg and Rackspace use this system in production. [11], [12]

Voldemort. 
A key-value store developed by LinkedIn engineers; it implements most of the features available in Dynamo: partitioning and replication (consistent hashing, preference lists), object versioning (vector clocks) and a pluggable storage component (BDB, in-memory, MySQL). Voldemort also comes with a series of new features: serialization, support for read-only nodes and compression. LinkedIn uses this system as its underlying storage system. [11], [12]

Riak. A key-value store that uses documents as values, built with the same architecture and algorithms as Dynamo. The implementation is in Erlang, and various client libraries are available: Jiak Client (Erlang (JSON)), Riak (Erlang (raw)), Python, PHP, Ruby, Java, JavaScript. There are no known examples of usage in production. [12]

Redis. A key-value store where values can have multiple types: strings, lists, sets and ordered sets. Replication is achieved via a master-slave model, and the client libraries
(available in PHP, Ruby, Scala) are responsible for partitioning. It uses a memory-driven approach with asynchronous snapshots to disk for local persistence. Other supported operations depend on the value's data type: increments, decrements and atomic multi-set (strings); push, pop and range get (lists); intersection, union and difference (sets); and sorting. It is written in ANSI C and is used in production at GitHub, Engine Yard and VideoWiki. [12]

Neo4j. A disk-based (data is stored in a custom binary format), fully transactional Java persistence engine that stores data structures as graphs. Some of its most important features are: a graph-oriented model for data representation (stores, nodes, relationships and properties), high scalability (both on a single machine and across multiple machines), a simple object-oriented Java API, and optional layers to expose itself as an RDF store, express meta-model semantics using OWL and query the graph using SPARQL. [14]

3 NoSQL in the social and semantic Web context

The Semantic Web is an initiative of the World Wide Web Consortium (W3C) which involves transforming the Web so that the data available today can be understood and reused by machines. On a less abstract level this means attaching metadata to the resources on the Web and specifying relationships between these resources. The core of the Semantic Web is a set of design principles and standards already widely used on the Web (XML, XML Schema), formal definitions of the languages used for expressing data models (the Resource Description Framework, RDF), a vocabulary for describing properties of RDF-based models (RDF Schema, RDFS), a vocabulary for creating ontologies (the Web Ontology Language, OWL), data query services (SPARQL), and other standards still under development (the Rule Interchange Format, RIF, and the Unifying Logic and Proof layers). The Social Web is the term used to describe how people socialize and interact with each other throughout the WWW.
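The RDF data model at the core of the Semantic Web reduces to sets of subject-predicate-object triples; a minimal sketch of storing and pattern-matching such triples (illustrative only, not the API of any particular store) looks like this:

```python
# Minimal sketch of storing RDF-style triples and answering simple
# SPARQL-like pattern queries (illustrative only; not a real store's API).
class TripleStore:
    def __init__(self):
        self.triples = set()  # (subject, predicate, object) tuples

    def add(self, s, p, o):
        self.triples.add((s, p, o))

    def match(self, s=None, p=None, o=None):
        # Pattern matching: None acts as a wildcard, like a SPARQL variable.
        return [t for t in self.triples
                if (s is None or t[0] == s)
                and (p is None or t[1] == p)
                and (o is None or t[2] == o)]
```

A real triple store adds indexes over each position and full SPARQL evaluation on top of this basic shape.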
Classic examples of distributed Web applications that favored the development of large social networks are Facebook, MySpace, LinkedIn, Flickr, Twitter, etc. Regarding NoSQL's influence on the Semantic Web, the vast list of database systems developed, each exposing new techniques for managing data, contains some examples that may address problems like managing RDF stores, managing ontologies or creating SPARQL endpoints. Neo4j is probably the most obvious example of such a store. Its graph-oriented data model makes it a natural fit for storing RDF triples or complex ontologies. Although databases using this graph-oriented view of data can manage a much smaller volume of information than the other types of non-relational data stores (key-value, column family, document), this volume is still large: billions of nodes and relationships. Neo4j's developers state that its traversal component offers high performance, and its more than six years of production use raise the degree of confidence in the system. [14] HBase (the Hadoop Database) is a scalable, distributed, column-oriented, dynamic-schema database for structured data, modeled after Google's Bigtable and under
development at the ASF (Apache Software Foundation). The HBase data model can be viewed as a multi-dimensional map where values are indexed by four keys (TableName, RowKey, ColumnKey, Timestamp). Values are binary data, rows are sorted in lexicographic order, and columns are grouped into column families. The database schema is flexible and can be modified at run time. Such a dynamic schema allows this system to store Semantic Web data; an example of such a modeling can be found in [17]. Applications in the Social Web sphere have a longer history than Semantic Web applications, so scalability, performance, availability and huge volumes of data became vital issues for them. Cassandra, one of the most important non-relational distributed stores, is already used in production in large social applications: Facebook, Digg. A comparison with MySQL on 50 GB of data shows that Cassandra performs better. [11]

             Read      Write
  MySQL      ~350 ms   ~300 ms
  Cassandra  0.12 ms   15 ms

Table 2. Performance comparison between MySQL and Cassandra on 50 GB of data

4 Conclusion

RDBMSs have served large information systems for over 30 years, but the amount of data that must be managed today causes multiple problems for these systems. In order to address issues like scalability, performance and availability, a new set of technologies and non-relational databases has been developed, collectively known under the term NoSQL. This paper presented the techniques and design practices that lie beneath these new database products, most of which are inspired by already existing and reliable systems like Amazon's Dynamo and Google's Bigtable. We also offered a few ideas on how these systems already influence application development for the semantic and social Web.
The NoSQL trend began to grow rapidly in early 2009, and within a relatively short period of time a large number of non-relational database solutions appeared, some of which have already become components of various large-scale applications. As future research we intend to study in even greater detail the current techniques used in designing such systems, and possibly to eliminate the vulnerabilities that may cause some of them to fail in certain scenarios.
References

1. Ellis, Jonathan: NoSQL Ecosystem (2009)
2. Shoup, Randy: eBay Marketplace Architecture: Architectural Strategies, Patterns, and Focuses (2007)
3. Bloor, Robin: 6 Reasons Why Relational Database Will Be Superseded (2008)
4. DeCandia, G., Hastorun, D., Jampani, M., Kakulapati, G., Lakshman, A., Pilchin, A., Sivasubramanian, S., Vosshall, P., Vogels, W.: Dynamo: Amazon's Highly Available Key-value Store (2007)
5. Chang, F., Dean, J., Ghemawat, S., Hsieh, W. C., Wallach, D. A., Burrows, M., Chandra, T., Fikes, A., Gruber, R. E.: Bigtable: A Distributed Storage System for Structured Data (2006)
6. Brewer, Eric A.: Towards Robust Distributed Systems, Principles of Distributed Computing (2000)
7. Vogels, W.: Eventually Consistent (2008)
8. Ho, Ricky: Pragmatic Programming Techniques (2009)
9. Wiggins, Adam: SQL Databases Don't Scale (2009)
10. Browne, Julian: Brewer's CAP Theorem (2009)
11. NOSQL debrief (2009)
12. Gupta, Vineet: NoSQL Databases – Part 1 – Landscape (2010)
13. NoSQL – Your Ultimate Guide to the Non-Relational Universe, http://nosql-
14. Neo4j – the graph database
15. Semantic Web
16. Social Web
17. Mateescu, Gabriel: Finding the way through the semantic Web with HBase, hbase/index.html?ca=dgr-twtrHBasedth-OS&S_TACT=105AGY83&S_CMP=TWDW (2009)