SlideShare a Scribd company logo
1 of 12
Download to read offline
An Oracle White Paper
September 2011



Oracle NoSQL Database
Oracle NoSQL Database




Introduction
NoSQL databases represent a recent evolution in enterprise application architecture,
continuing the evolution of the past twenty years. In the 1990’s, vertically integrated
applications gave way to client-server architectures, and more recently, client-server
architectures gave way to three-tier web application architectures. In parallel, the demands of
web-scale data analysis added map-reduce processing into the mix and data architects started
eschewing transactional consistency in exchange for incremental scalability and large-scale
distribution. The NoSQL movement emerged out of this second ecosystem.

NoSQL is often characterized by what it’s not – depending on whom you ask, it’s either not
only a SQL-based relational database management system or it’s simply not a SQL-based
RDBMS. While those definitions explain what NoSQL is not, they do little to explain what
NoSQL is. Consider the fundamentals that have guided data management for the past forty
years. RDBMS systems and large-scale data management have been characterized by the
transactional ACID properties of Atomicity, Consistency, Isolation, and Durability. In contrast,
NoSQL is sometimes characterized by the BASE acronym:

    Basically Available: Use replication to reduce the likelihood of data unavailability and use
    sharding, or partitioning the data among many different storage servers, to make any
    remaining failures partial. The result is a system that is always available, even if subsets of
    the data become unavailable for short periods of time.

    Soft state: While ACID systems assume that data consistency is a hard requirement,
    NoSQL systems allow data to be inconsistent and relegate designing around such
    inconsistencies to application developers.

    Eventually consistent: Although applications must deal with instantaneous consistency,
    NoSQL systems ensure that at some future point in time the data assumes a consistent
    state. In contrast to ACID systems that enforce consistency at transaction commit, NoSQL
    guarantees consistency only at some undefined future time.

NoSQL emerged as companies, such as Amazon, Google, LinkedIn and Twitter struggled to
deal with unprecedented data and operation volumes under tight latency constraints. Analyzing
high-volume, real time data, such as web-site click streams, provides significant business
advantage by harnessing unstructured and semi-structured data sources to create more
business value. Traditional relational databases were not up to the task, so enterprises built
upon a decade of research on distributed hash tables (DHTs) and either conventional
relational database systems or embedded key/value stores, such as Oracle’s Berkeley DB, to
develop highly available, distributed key-value stores.




                                                                                                      1
Oracle NoSQL Database




Although some of the early NoSQL solutions built their systems atop existing relational
database engines, they quickly realized that such systems were designed for SQL-based
access patterns and latency demands that are quite different from those of NoSQL systems,
so these same organizations began to develop brand new storage layers. In contrast, Oracle’s
Berkeley DB product line was the original key/value store; Oracle Berkeley DB Java Edition
has been in commercial use for over eight years. By using Oracle Berkeley DB Java Edition as
the underlying storage engine beneath a NoSQL system, Oracle brings enterprise robustness
and stability to the NoSQL landscape.

                                                                          Furthermore, until recently, integrating
                                                                          NoSQL solutions with an enterprise
                                          ",=!                            application architecture required manual
        =)?1!                =$600>!
                                                            :;@)3!        integration and custom development;
                            :;@)3!                            DBE!        Oracle’s NoSQL Database provides all
                          A$>B'6C%'!     !!!
   !"#$%&'!                              !                                the desirable features of NoSQL solutions
   /012,!)3!                             !
                                         !                                necessary for seamless integration into
                            "#$%&'!      !              :;@)3!            an enterprise application architecture.
                        )$7$!:;7'<#$70#! !              AF;F;<!           Figure 1 shows a canonical acquire-
                                         !!!
                                         !
                                           !"#$%&'!                       organize-analyze data cycle,
    !!"#$%&'!
      !!"#$%&'!                          !!"#$%&'!()*+!
      )$7$8$9'!
        (",-.+!                              45$6$7$!                     demonstrating how Oracle’s NoSQL
                                                            "3:!44!       Database fits into such an ecosystem.
                                                                          Oracle-provided adapters allow the
                                                                          Oracle NoSQL Database to integrate with
      !"#$%&'(                .&/*)%-'(                  !)*+,-'(         a Hadoop MapReduce framework or with
                                                                          the Oracle Database in-database
 ?F<C#'!GH!!"#$%&'!/012,!)$7$8$9'!!F;7'<#$7'9!9'$I&'99&J!F;70!7K'!6$7$!   MapReduce, Data Mining, R-based
 I$;$<'I';7!'%09J97'IL!                                                   analytics, or whatever business needs
                                                                          demand.

The Oracle NoSQL Database, with its “No Single Point of Failure” architecture is the right
solution when data access is “simple” in nature and application demands exceed the volume or
latency capability of traditional data management solutions. For example, click-stream data
from high volume web sites, high-throughput event processing, and social networking
communications all represent application domains that produce extraordinary volumes of
simple keyed data. Monitoring online retail behavior, accessing customer profiles, pulling up
appropriate customer ads and storing and forwarding real-time communication are examples
of domains requiring the ultimate in low-latency access. Highly distributed applications such as
real-time sensor aggregation and scalable authentication also represent domains well-suited to
Oracle NoSQL Database.




                                                                                                                      2
Oracle NoSQL Database




Technical Overview
Oracle NoSQL Database leverages the Oracle Berkeley DB Java Edition High Availability storage
engine to provide distributed, highly-available key/value storage for large-volume, latency-sensitive
applications or web services. It can also provide fast, reliable, distributed storage to applications that
need to integrate with ETL processing.

Data Model
In its simplest form, Oracle NoSQL Database implements a map from user-defined keys (Strings) to
opaque data items. It records version numbers for key/data pairs, but maintains the single latest
version in the store. Applications need never worry about reconciling incompatible versions because
Oracle NoSQL Database uses single-master replication; the master node always has the most up-to-
date value for a given key, while read-only replicas might have slightly older versions. Applications can
use version numbers to ensure consistency for read-modify-write operations.

Oracle NoSQL Database hashes keys to provide good distribution over a collection of computers that
provide storage for the database. However, applications can take advantage of subkey capabilities to
achieve data locality. A key is the concatenation of a Major Key Path and a Minor Key Path, both of which
are specified by the application. All records sharing a Major Key Path are co-located to achieve data-
locality. Within a co-located collection of Major Key Paths, the full key, comprised of both the Major
and Minor Key Paths, provides fast, indexed lookup. For example, an application storing user profiles
might use the profile-name as a Major Key Path and then have several Minor Key Paths for different
components of that profile such as email address, name, phone number, etc. Because applications have
complete control over the composition and interpretation of keys, different Major Key Paths can have
entirely different Minor Key Path structures. Continuing our previous example, one might store user
profiles and application profiles in the same store and maintain different Minor Key Paths for each.
Prefix key compression makes storage of key groups efficient.

While many NoSQL databases state that they provide eventual consistency, Oracle NoSQL Database
provides several different consistency policies. At one end of the spectrum, applications can specify
absolute consistency, which guarantees that all reads return the most recently written value for a
designated key. At the other end of the spectrum, applications capable of tolerating inconsistent data
can specify weak consistency, allowing the database to return a value efficiently even if it is not entirely
up to date. In between these two extremes, applications can specify time-based consistency to constrain
how old a record might be or version-based consistency to support both atomicity for read-modify-write
                                                                                  operations and reads that are at least as recent as the
 *+,-,#&''$) C-='%&-@' 5-6-'"8'%&.&=6'-8' ,;&%-F=#'1='G=1H=' ,;&%-F=#'1='6+&'
              86-/&'@-6-' 1D'-'#"E&='F<&' 1%'/-6&%'E&%8"1='  <186'%&.&=6'E&%8"1=' specified version. Figure 2 shows how the range of
                                                                                  flexible consistency policies enables developers to
 !"#$%$&'#!()    01=&'       *"<&>?-8&@'    A&%8"1=>?-8&@'        B781/$6&'
                                                                                  easily create business solutions providing data
            !"#$%&'()'*+&',%-./&'01234'5-6-7-8&'!/&9"7"/"6:'2;&.6%$<'             guarantees while meeting application latency and
                                                                                  scalability requirements.

Oracle NoSQL Database also provides a range of durability policies that specify what guarantees the
system makes after a crash. At one extreme, applications can request that write requests block until the
record has been written to stable storage on all copies. This has obvious performance and availability
implications, but ensures that if the application successfully writes data, that data will persist and can be
recovered even if all the copies become temporarily unavailable due to multiple simultaneous failures.
At the other extreme, applications can request that write operations return as soon as the system has




                                                                                                                                  .3
Oracle NoSQL Database




recorded the existence of the write, even if the data is not persistent anywhere. Such a policy provides
the best write performance, but provides no durability guarantees. By specifying when the database
writes records to disk and what fraction of the copies of the record must be persistent (none, all, or a
simple majority), applications can enforce a wide range of durability policies.

API
Incorporating Oracle NoSQL Database into applications is straightforward. APIs for basic Create,
Read, Update and Delete (CRUD) operations and a collection of iterators are packaged in a single jar
file. Applications can use the APIs from one or more client processes that access a stand-alone Oracle
NoSQL Database server process, alleviating the need to set up multi-system configurations for initial
development and testing.


Create, Remove, Update, and Delete
Data create and update operations are provided by several put methods. The putIfAbsent method
implements creation while the putIfPresent method implements update. The put method does both,
adding a new key/value pair if the key is not currently present in the database or updating the value if
the key does exist. Updating a key/value pair generates a new version of the pair, so the API also
includes a conditional putIfVersion method that allows applications to implement consistent read-
modify-write semantics.

The delete and deleteIfVersion methods unconditionally and conditionally remove key/value pairs from
the database, respectively. Just as putIfVersion ensures read-modify-write semantics, deleteIfVersion
provides deletion of a specific version.

The get method retrieves items from the database.

The code sample below demonstrates the use of the various CRUD APIs. All code samples assume
that you have already opened an Oracle NoSQL Database, referenced by the variable store.


                                       CRUD Examples
 // Put a new key/value pair in the database, if key not already present.
 Key key = Key.createKey("Katana");
 String valString = "sword";

 store.putIfAbsent(key, Value.createValue(valString.getBytes()));

 // Read the value back from the database.
 ValueVersion retValue = store.get(key);

 // Update this item, only if the current version matches the version I read.
 // In conjunction with the previous get, this implements a read-modify-write
 String newvalString = "Really nice sword";
 Value newval = Value.createValue(newvalString.getBytes());

 store.putIfVersion(key, newval, retValue.getVersion());

 // Finally, (unconditionally) delete this key/value pair from the database.
 store.delete(key);




                                                                                                      .4
Oracle NoSQL Database




Iteration
In addition to basic CRUD operations, Oracle NoSQL Database supports two types of iteration:
unordered iteration over records and ordered iteration within a Major Key set.

In the case of unordered iteration over the entire store, the result is not transactional; the iteration runs
at an isolation level of read-committed, which means that the result set will contain only key/value pairs
that have been persistently written to the database, but there are no guarantees of semantic consistency
across key/value pairs.

The API supports both individual key/value returns using several storeIterator methods and bulk
key/value returns within a Major Key Path via the various multiGetIterator methods. The example below
demonstrates iterating over an entire store, returning each key/value pair individually. Note that
although the iterator returns only a single key/value pair at a time, the storeIterator method takes a
second parameter of batchSize, indicating how many key/value pairs to fetch per network round trip.
This allows applications to simultaneously use network bandwidth efficiently, while maintaining the
simplicity of key-at-a-time access in the API.

                                    Unordered Iteration Example
  // Create Iterator.
  Iterator<KeyValueVersion> iter = store.storeIterator(Direction.UNORDERED, 100);

  // Now, iterate over the store.
  while (iter.hasNext()) {
         KeyValueVersion keyVV = iter.next();
         Value val = keyVV.getValue();
         Key key = keyVV.getKey();

            System.out.println(val.toString() + " " + key.toString() + "n");
  }



Bulk Operation API


In addition to providing single-record operations, Oracle NoSQL Database supports the ability to
bundle a collection of operations together using the execute method, providing transactional semantics
across multiple updates on records with the same Major Key Path. For example, let’s assume that we
have the Major Key Path “Katana” from the previous example, with several different Minor Key
Paths, containing attributes of the Katana, such as length and year of construction. Imagine that we
discover that we have an incorrect length and year of construction currently in our store. We can
update multiple records atomically using a sequence of operations as shown below.




                                                                                                         .5
Oracle NoSQL Database




               Example of Wrapping a Sequence of Operations in a Transaction
 // Create a sequence of operations.
 OperationFactory of = store.getOperationFactory();
 List<Operation> opList = new ArrayList<Operation>();

 // Create major and minor path components.
 List<String> majorComponents = new ArrayList<String>();
 List<String> minorLength = new ArrayList<String>();
 List<String> minorYear = new ArrayList<String>();

 majorComponents.add(“Katana”);
 minorLength .add(“length”);
 minorYear.add(“year”);

 Key key1 = Key.createKey(majorComponents, minorLength);
 Key key2 = Key.createKey(majorComponents, minorYear);

 // Now put operations in an opList.
 String lenVal = “37”;
 String yearVal = “1454”;

 opList.add(of.createPut(key1, Value.createValue(lenVal.getBytes())));
 opList.add(of.createPut(key2, Value.createValue(yearVal.getBytes())));

 // Now execute the operation list.
 store.execute(opList);




Administration

Oracle NoSQL Database comes with an Administration Service, accessible from both a command line
interface and a web console. Using the service, administrators can configure a database instance, start
it, stop it, and monitor system performance, without manually editing configuration files, writing shell
scripts, or performing explicit database operations.

The Administration Service is itself a highly-available service, but consistent with the Oracle NoSQL
Database “No Single Point of Failure” philosophy, the ongoing operation of an installation is not
dependent upon the availability of the Administration Service. Thus, both the database and the
Administration Service remain available during configuration changes.

In addition to facilitating configuration changes, the Administration Service also collects and maintains
performance statistics and logs important system events, providing online monitoring and input to
performance tuning.




                                                                                                      .6
Oracle NoSQL Database




Architecture
We present the Oracle NoSQL Database architecture by following the execution of an operation
through the logical components of the system and then discussing how those components map to
actual hardware and software operation. We will create the key/value pair “Katana” and “sword”.
Figure 3 depicts the method invocation putIfAbsent(“Katana”, “sword”)1.

                                                                                                                                                           The application issues the putIfAbsent
                                                                                                                                                           method to the Client Driver (step 1).
                        !""#$%&'()*                                                                                                                        The client driver hashes the key
                                                                                                                                                           “Katana” to select one of a fixed
                   +,-.*!/0*             12*"3405!678)49:;&4&)&<=*7>(?@A*                                                                                  number of partitions (step 2). The
                                                                                                                                                           number of partitions is fixed and set
      B2*/&?''()*C*D&7D9:;&4&)&<A*                  P#$8)4*-?$Q8?*
                                                                                                                                                           by an administrator at system
        I2*                                                                                                                                                configuration time and is chosen to be
                                                                                                   !"#$%          &'($%              )$*("+*$%     ,"*-%
                                                                                                                                                           significantly larger than the maximum
                                                                                                  FK1L)1*         M&748?*                NO*     H,LB1*
                     J2*F8"#$%&'()*G?(3"* F8"#$%&'()*
        /&?''()*
         E&6#8*
                                                          G?(3"*H4&48*                     N*     FK1L)B*         F8"#$%&*              1OO*     H,LBB*    number of storage nodes expected in
                                                             E&6#8*
                                                                                                  FK1L)I*         F8"#$%&*               1O*     H,LI*     the store. In this example, our store
                                                                                                                                                           contains 25 storage nodes, so we
                               R2*"3405!678)49:;&4&)&<=*7>(?@A*
                                                                                                                                                           might have configured the system to
                      H,L1*
                         H,L1*
                                    S*      H,L1*
                                               H,L1*
                                                                   H,L1*
                                                                      H,L1*
                                                                                         H,L1*
                                                                                            H,L1*
                                                                                                               H,L1*
                                                                                                                  H,L1*
                                                                                                                                                           have 25,000 partitions. Each partition
                            H,L1*
                               H,L1*
                                                  H,L1*
                                                     H,L1*
                                                                         H,L1*
                                                                            H,L1*
                                                                                               H,L1*
                                                                                                  H,L1*
                                                                                                                     H,L1*
                                                                                                                        H,L1*
                                                                                                                                                           is assigned to a particular replication
                                    H,L1*                 H,LB*                  H,LI*                 H,LJ*                 H,LN*
                                                                                                                                                           group. The driver consults the
                                                                  H4(?&K8*,(@87*
                                                                                                                                                           partition table (step 3) to map the
                                                      T$K3?8*IU*F8V3874*/?(%877$)K*
                                                                                                                                                           partition number to a replication group.

A replication group consists of some (configurable) number of replication nodes. Every replication group
consists of the same number of replication nodes. The number of replication nodes in a replication
group dictates the number of failures from which the system is resilient; a system with three nodes per
group can withstand two failures while continuing to service read requests. Its ability to withstand
failures on writes is based upon the configured durability policy. If the application does not require a
majority of participants to acknowledge a write, then the system can also withstand up to two failures
for writes. A five-node group can withstand up to four failures for reads and up to two failures for
writes, even if the application demands a durability policy requiring a majority of sites to acknowledge a
write operation.

Given a replication group, the Client Driver next consults the Replication Group State Table (RGST)
(step 4). For each replication group, the RGST contains information about each replication node
comprising the group (step 5). Based upon information in the RGST, such as the identity of the master
and the load on the various nodes in a replication group, the Client Driver selects the node to which to
send the request and forwards the request to the appropriate node (step 6). In this case, since we are
issuing a write operation, the request must go to the master node.




1   Although the API takes byte[] as Values, for convenience, we are showing the values as Strings.




                                                                                                                                                                                               .7
Oracle NoSQL Database




The replication node then applies the operation. In the case of a putIfAbsent, if the key exists, the
operation has no effect and returns an error, indicating that the specified entry is already present in the
store. If the key does not exist, the replication node adds the key/value pair to the store and then
propagates the new key/value pair to the other nodes in the replication group (step 7).




Implementation
An Oracle NoSQL Database installation consists of two major pieces: a Client Driver and a collection of
Storage Nodes. As shown in Figure 3, the client driver implements the partition map and the RGST,
while storage nodes implement the replication nodes comprising replication groups. In this section,
we’ll take a closer look at each of these components.

Storage Nodes
A storage node (SN) is typically a physical machine with its own local persistent storage, either disk or
solid state, a CPU with one or more cores, memory, and an IP address. A system with more storage
nodes will provide greater aggregate throughput or storage capacity than one with fewer nodes, and
systems with a greater degree of replication in replication groups can provide decreased request latency
over installations with smaller degrees of replication.

A Storage Node Agent (SNA) runs on each storage node, monitoring that node’s behavior. The SNA
both receives configuration from and reports monitoring information to the Administration Service,
which interfaces to the Oracle NoSQL Database monitoring dashboard. The SNA collects operational
data from the storage node on an ongoing basis and then delivers it to the Administration Service
when asked for it.

A storage node serves one or more replication nodes. Each replication node belongs to a single
replication group. The nodes in a single replication group all serve the same data. Each group has a
designated master node that handles all data modification operations (create, update, and delete). The
other nodes are read-only replicas, but may assume the role of master should the master node fail. A
typical installation uses a replication factor of three in the replication groups, to ensure that the system
can survive at least two simultaneous faults and still continue to service read operations. Applications
requiring greater or lesser reliability can adjust this parameter accordingly.


                                      =(=(=( =(=(=(                  3'4(!5%6'(


                                                                  7%+8(9:;6<#;(
                                                                                                                                                         Figure 4 shows an installation with 30
                                                                                                                                                         replication groups (0-29). Each replication
                                                                                                                                                         group has a replication factor of 3 (one master
                                                                                                                                                         and two replicas) spread across two data centers.
                                                             I%$<<#;():JK'$+(
                                                 =(=(=(                   =(=(=(
                                                                          =(=(=(                  =(=(=(                                                 Note that we place two of the replication nodes
               >'5?@6%<#;(&$#:5(A(                                                                                    >'5?@6%<#;(&$#:5(.B(               in the larger of the two data centers and the last
                                                                                                                                                         replication node in the smaller one. This sort of
   >'5()#*'(
    C%+"'$(
                     >'5()#*'(
                      >'5?@6%(
                                           >'5()#*'(
                                            >'5?@6%(
                                                                                                        >'5()#*'(
                                                                                                         C%+"'$(
                                                                                                                              >'5()#*'(
                                                                                                                               >'5?@6%(
                                                                                                                                             >'5()#*'(
                                                                                                                                              >'5?@6%(   arrangement might be appropriate for an
                                                                                                                                                         application that uses the larger data center for its
                  !),-(
                     !),-(
                                      !),-(
                                         !),-(
                                                          !),-(
                                                             !),-(
                                                                              !),-(
                                                                                 !),-(
                                                                                                  !),-(
                                                                                                     !),-(
                                                                                                                      !),-(
                                                                                                                         !),-(
                                                                                                                                                         primary data access, maintaining the smaller data
                                                                                                                                                         center in case of catastrophic failure of the
                        !),-(               !),-(               !),-(               !),-(               !),-(               !),-(
                           !),-(               !),-(               !),-(               !),-(               !),-(               !),-(
                              !),-(               !),.(               !),/(               !),0(               !),1(               !),2(




                       D%"%(E';"'$(-(                                                                                    D%"%(E';"'$(.(                  primary data center. The 30 replication groups
                                                              !"#$%&'()#*'+(
                                                                                                                                                         are stored on 30 storage nodes, spread across
                                                  F@&:$'(0G(H$68@"'6":$'(                                                                                the two data centers.




                                                                                                                                                                                                        .8
Oracle NoSQL Database




Replication nodes support the Oracle NoSQL Database API via RMI calls from the client and obtain
data directly from or write data directly to the log-structured storage system, which provides
outstanding write performance, while maintaining index structures that provide low-latency read
performance as well. The Oracle NoSQL Database storage engine pioneered the use of log-structured
storage in key/value databases since its initial deployment in 2003 and has been proven in several
open-source NoSQL solutions, such as Dynamo, Voldemort, and GenieDB, as well as in Enterprise
deployments.

Oracle NoSQL Database uses replication to ensure data availability in the case of failure. Its single-
master architecture requires that writes are applied at the master node and then propagated to the
replicas. In the case of failure of the master node, the nodes in a replication group automatically hold a
reliable election (using the Paxos protocol), electing one of the remaining nodes to be the master. The
new master then assumes write responsibility.

Client Driver
The client driver is a Java jar file that exports the API to applications. In addition, the client driver
maintains a copy of the Topology and the Replication Group State Table (RGST). The Topology efficiently
maps keys to partitions and from partitions to replication groups. For each replication group, it
includes the host name of the storage node hosting each replication node in the group, the service
name associated with the replication nodes, and the data center in which each storage node resides.
The client then uses the RGST for two primary purposes: identifying the master node of a replication
group, so that it can send write requests to the master, and load balancing across all the nodes in a
replication group for reads. Since the RGST is a critical shared data structure, each client and
replication node maintains its own copy, thus avoiding any single point of failure. Both clients and
replication nodes run a RequestDispatcher that use the RGST to (re)direct write requests to the master
and read requests to the appropriate member of a replication group.

The Topology is loaded during client or replication node initialization and can subsequently be updated
by the administrator if there are Topology changes. The RGST is dynamic, requiring ongoing
maintenance. Each replication node runs a thread, called the Replication Node State Update thread, that is
responsible for ongoing maintenance of the RGST. The update thread, as well as the RequestDispatcher,
opportunistically collect information on remote replication nodes including the current state of the
node in its replication group, an indication of how up-to-date the node is, the time of the last
successful interaction with the node, the node’s trailing average response time, and the current length
of its outstanding request queue. In addition, the update thread maintains network connections and re-
establishes broken ones. This maintenance is done outside the RequestDispatcher’s request/response
cycle to minimize the impact of broken connections on latency.

Performance
We have experimented with various Oracle NoSQL Database configurations and present a few
performance results of the Yahoo! Cloud Serving Benchmark (YCSB), demonstrating how the system
scales with the number of nodes in the system. As with all performance measurements, your mileage
may vary.

We applied a constant YCSB load per storage node to configurations of varying sizes. Each storage
node was comprised of a 2.93ghz Westmere 5670 dual socket machine with 6 cores/socket and 24GB
of memory. Each machine had a 300GB local disk and ran RedHat 2.6.18-164.11.1.el5.crt1. At 300




                                                                                                       .9
Oracle NoSQL Database




GB, the disk size is the scale-limiting resource on each node, dictating the overall configuration, so we
configured each node to hold 100M records, with an average key size of 13 bytes and data size of 1108
bytes.

                                                   The graph to the left shows the raw insert
                                                   performance of Oracle NoSQL Database for
                                                   configurations ranging from a single replication
                                                   group system with three nodes storing 100 million
                                                   records to a system with 32 replication groups on 96
                                                   nodes storing 2.1 billion records (the YCSB
                                                   benchmark is limited to a maximum of 2.1 billion
                                                   records). The graph shows both the throughput in
                                                   operations per second (blue line and left axis) and
                                                   the response time in milliseconds (red line and right
                                                   axis). Throughput of the system scales almost
                                                   linearly as the database size and number of
                                                   replication groups grows, with only a modest
                                                   increase in response time.

                                                    The second graph shows the throughput and
                                                   response time for a workload of 50% reads and 50%
                                                   updates. As the system grows in size (both data size
                                                   and number of replication groups), we see both the
                                                   update and read latency decline, while throughput
                                                   scales almost linearly, delivering the scalability
                                                   needed for today’s demanding applications.




Conclusion
Oracle’s NoSQL Database brings enterprise quality storage and performance to the highly-available,
widely distributed NoSQL environment. Its commercially proven, write-optimized storage system
delivers outstanding performance as well as robustness and reliability, and its “No Single Point of
Failure” design ensures that the system continues to run and data remain available after any failure.




                                                                                                    .10
Oracle NoSQL Database      Copyright © 2011, Oracle and/or its affiliates. All rights reserved. This document is provided for information purposes only and the
September 2011             contents hereof are subject to change without notice. This document is not warranted to be error-free, nor subject to any other
                           warranties or conditions, whether expressed orally or implied in law, including implied warranties and conditions of merchantability or
                           fitness for a particular purpose. We specifically disclaim any liability with respect to this document and no contractual obligations are
Oracle Corporation
                           formed either directly or indirectly by this document. This document may not be reproduced or transmitted in any form or by any
World Headquarters
                           means, electronic or mechanical, for any purpose, without our prior written permission.
500 Oracle Parkway
Redwood Shores, CA 94065
                           Oracle and Java are registered trademarks of Oracle and/or its affiliates. Other names may be trademarks of their respective owners.
U.S.A.

Worldwide Inquiries:       AMD, Opteron, the AMD logo, and the AMD Opteron logo are trademarks or registered trademarks of Advanced Micro Devices.
Phone: +1.650.506.7000     Intel and Intel Xeon are trademarks or registered trademarks of Intel Corporation. All SPARC trademarks are used under license
Fax: +1.650.506.7200       and are trademarks or registered trademarks of SPARC International, Inc. UNIX is a registered trademark licensed through X/Open
                           Company, Ltd. 1010
oracle.com

More Related Content

What's hot

JSON and the Oracle Database
JSON and the Oracle DatabaseJSON and the Oracle Database
JSON and the Oracle DatabaseMaria Colgan
 
Oracle NoSQL Database release 3.0 overview
Oracle NoSQL Database release 3.0 overviewOracle NoSQL Database release 3.0 overview
Oracle NoSQL Database release 3.0 overviewDave Segleau
 
Overview of oracle database
Overview of oracle databaseOverview of oracle database
Overview of oracle databaseSamar Prasad
 
SQL Server 2008 Overview
SQL Server 2008 OverviewSQL Server 2008 Overview
SQL Server 2008 OverviewDavid Chou
 
Big SQL 3.0: Datawarehouse-grade Performance on Hadoop - At last!
Big SQL 3.0: Datawarehouse-grade Performance on Hadoop - At last!Big SQL 3.0: Datawarehouse-grade Performance on Hadoop - At last!
Big SQL 3.0: Datawarehouse-grade Performance on Hadoop - At last!Nicolas Morales
 
Solution Use Case Demo: The Power of Relationships in Your Big Data
Solution Use Case Demo: The Power of Relationships in Your Big DataSolution Use Case Demo: The Power of Relationships in Your Big Data
Solution Use Case Demo: The Power of Relationships in Your Big DataInfiniteGraph
 
Database@Home : The Future is Data Driven
Database@Home : The Future is Data DrivenDatabase@Home : The Future is Data Driven
Database@Home : The Future is Data DrivenTammy Bednar
 
Resume of Derek
Resume of DerekResume of Derek
Resume of DerekDerek Xu
 
How Oracle Single/Multitenant will change a DBA's life
How Oracle Single/Multitenant will change a DBA's lifeHow Oracle Single/Multitenant will change a DBA's life
How Oracle Single/Multitenant will change a DBA's lifeGuatemala User Group
 
Eng systems oracle_overview
Eng systems oracle_overviewEng systems oracle_overview
Eng systems oracle_overviewFran Navarro
 
38_Spotkanie_PLSSUGweWroclawiu_KilkaSłów_ApexSQL_FreeTools
38_Spotkanie_PLSSUGweWroclawiu_KilkaSłów_ApexSQL_FreeTools38_Spotkanie_PLSSUGweWroclawiu_KilkaSłów_ApexSQL_FreeTools
38_Spotkanie_PLSSUGweWroclawiu_KilkaSłów_ApexSQL_FreeToolsTobias Koprowski
 
MySQL Baics - Texas Linxufest beginners tutorial May 31st, 2019
MySQL Baics - Texas Linxufest beginners tutorial May 31st, 2019MySQL Baics - Texas Linxufest beginners tutorial May 31st, 2019
MySQL Baics - Texas Linxufest beginners tutorial May 31st, 2019Dave Stokes
 

What's hot (18)

JSON and the Oracle Database
JSON and the Oracle DatabaseJSON and the Oracle Database
JSON and the Oracle Database
 
Oracle archi ppt
Oracle archi pptOracle archi ppt
Oracle archi ppt
 
Oracle NoSQL Database release 3.0 overview
Oracle NoSQL Database release 3.0 overviewOracle NoSQL Database release 3.0 overview
Oracle NoSQL Database release 3.0 overview
 
Overview of oracle database
Overview of oracle databaseOverview of oracle database
Overview of oracle database
 
SQL Server 2008 Overview
SQL Server 2008 OverviewSQL Server 2008 Overview
SQL Server 2008 Overview
 
Big SQL 3.0: Datawarehouse-grade Performance on Hadoop - At last!
Big SQL 3.0: Datawarehouse-grade Performance on Hadoop - At last!Big SQL 3.0: Datawarehouse-grade Performance on Hadoop - At last!
Big SQL 3.0: Datawarehouse-grade Performance on Hadoop - At last!
 
Solution Use Case Demo: The Power of Relationships in Your Big Data
Solution Use Case Demo: The Power of Relationships in Your Big DataSolution Use Case Demo: The Power of Relationships in Your Big Data
Solution Use Case Demo: The Power of Relationships in Your Big Data
 
Introduction to Oracle
Introduction to OracleIntroduction to Oracle
Introduction to Oracle
 
Oracle
OracleOracle
Oracle
 
Database@Home : The Future is Data Driven
Database@Home : The Future is Data DrivenDatabase@Home : The Future is Data Driven
Database@Home : The Future is Data Driven
 
Resume of Derek
Resume of DerekResume of Derek
Resume of Derek
 
How Oracle Single/Multitenant will change a DBA's life
How Oracle Single/Multitenant will change a DBA's lifeHow Oracle Single/Multitenant will change a DBA's life
How Oracle Single/Multitenant will change a DBA's life
 
Eng systems oracle_overview
Eng systems oracle_overviewEng systems oracle_overview
Eng systems oracle_overview
 
38_Spotkanie_PLSSUGweWroclawiu_KilkaSłów_ApexSQL_FreeTools
38_Spotkanie_PLSSUGweWroclawiu_KilkaSłów_ApexSQL_FreeTools38_Spotkanie_PLSSUGweWroclawiu_KilkaSłów_ApexSQL_FreeTools
38_Spotkanie_PLSSUGweWroclawiu_KilkaSłów_ApexSQL_FreeTools
 
Mohamed Mahgoub_CV
Mohamed Mahgoub_CVMohamed Mahgoub_CV
Mohamed Mahgoub_CV
 
Exploring sql server 2016 bi
Exploring sql server 2016 biExploring sql server 2016 bi
Exploring sql server 2016 bi
 
Resume_of_sayeed
Resume_of_sayeedResume_of_sayeed
Resume_of_sayeed
 
MySQL Baics - Texas Linxufest beginners tutorial May 31st, 2019
MySQL Baics - Texas Linxufest beginners tutorial May 31st, 2019MySQL Baics - Texas Linxufest beginners tutorial May 31st, 2019
MySQL Baics - Texas Linxufest beginners tutorial May 31st, 2019
 

Viewers also liked

Mineração de Dados com RapidMiner - Um Estudo de caso sobre o Churn Rate em...
Mineração de Dados com RapidMiner - Um Estudo de caso sobre o Churn Rate em...Mineração de Dados com RapidMiner - Um Estudo de caso sobre o Churn Rate em...
Mineração de Dados com RapidMiner - Um Estudo de caso sobre o Churn Rate em...João Gabriel Lima
 
Segurança na Internet - Conceitos fundamentais
Segurança na Internet - Conceitos fundamentaisSegurança na Internet - Conceitos fundamentais
Segurança na Internet - Conceitos fundamentaisJoão Gabriel Lima
 
Segurança na Internet - Estudos de caso
Segurança na Internet - Estudos de casoSegurança na Internet - Estudos de caso
Segurança na Internet - Estudos de casoJoão Gabriel Lima
 
Segurança na Internet - Google Hacking
Segurança na Internet - Google  HackingSegurança na Internet - Google  Hacking
Segurança na Internet - Google HackingJoão Gabriel Lima
 
An Introduction To NoSQL & MongoDB
An Introduction To NoSQL & MongoDBAn Introduction To NoSQL & MongoDB
An Introduction To NoSQL & MongoDBLee Theobald
 
SQL, NoSQL, BigData in Data Architecture
SQL, NoSQL, BigData in Data ArchitectureSQL, NoSQL, BigData in Data Architecture
SQL, NoSQL, BigData in Data ArchitectureVenu Anuganti
 

Viewers also liked (7)

Mineração de Dados com RapidMiner - Um Estudo de caso sobre o Churn Rate em...
Mineração de Dados com RapidMiner - Um Estudo de caso sobre o Churn Rate em...Mineração de Dados com RapidMiner - Um Estudo de caso sobre o Churn Rate em...
Mineração de Dados com RapidMiner - Um Estudo de caso sobre o Churn Rate em...
 
Segurança na Internet - Conceitos fundamentais
Segurança na Internet - Conceitos fundamentaisSegurança na Internet - Conceitos fundamentais
Segurança na Internet - Conceitos fundamentais
 
Web Machine Learning
Web Machine LearningWeb Machine Learning
Web Machine Learning
 
Segurança na Internet - Estudos de caso
Segurança na Internet - Estudos de casoSegurança na Internet - Estudos de caso
Segurança na Internet - Estudos de caso
 
Segurança na Internet - Google Hacking
Segurança na Internet - Google  HackingSegurança na Internet - Google  Hacking
Segurança na Internet - Google Hacking
 
An Introduction To NoSQL & MongoDB
An Introduction To NoSQL & MongoDBAn Introduction To NoSQL & MongoDB
An Introduction To NoSQL & MongoDB
 
SQL, NoSQL, BigData in Data Architecture
SQL, NoSQL, BigData in Data ArchitectureSQL, NoSQL, BigData in Data Architecture
SQL, NoSQL, BigData in Data Architecture
 

Similar to Oracle no sql database bigdata

Polyglot Database - Linuxcon North America 2016
Polyglot Database - Linuxcon North America 2016Polyglot Database - Linuxcon North America 2016
Polyglot Database - Linuxcon North America 2016Dave Stokes
 
MySQL Document Store for Modern Applications
MySQL Document Store for Modern ApplicationsMySQL Document Store for Modern Applications
MySQL Document Store for Modern ApplicationsOlivier DASINI
 
Non relational databases-no sql
Non relational databases-no sqlNon relational databases-no sql
Non relational databases-no sqlRam kumar
 
SQL vs NoSQL deep dive
SQL vs NoSQL deep diveSQL vs NoSQL deep dive
SQL vs NoSQL deep diveAhmed Shaaban
 
No sql databases explained
No sql databases explainedNo sql databases explained
No sql databases explainedSalil Mehendale
 
1. introduction to no sql
1. introduction to no sql1. introduction to no sql
1. introduction to no sqlAnuja Gunale
 
Unit II -BIG DATA ANALYTICS.docx
Unit II -BIG DATA ANALYTICS.docxUnit II -BIG DATA ANALYTICS.docx
Unit II -BIG DATA ANALYTICS.docxvvpadhu
 
Sql vs NoSQL-Presentation
 Sql vs NoSQL-Presentation Sql vs NoSQL-Presentation
Sql vs NoSQL-PresentationShubham Tomar
 
Vskills Apache Cassandra sample material
Vskills Apache Cassandra sample materialVskills Apache Cassandra sample material
Vskills Apache Cassandra sample materialVskills
 
Brief introduction to NoSQL by fas mosleh
Brief introduction to NoSQL by fas moslehBrief introduction to NoSQL by fas mosleh
Brief introduction to NoSQL by fas moslehFas (Feisal) Mosleh
 
Introduction to NoSQL
Introduction to NoSQLIntroduction to NoSQL
Introduction to NoSQLbalwinders
 
CS828 P5 Individual Project v101
CS828 P5 Individual Project v101CS828 P5 Individual Project v101
CS828 P5 Individual Project v101ThienSi Le
 
Experimental evaluation of no sql databases
Experimental evaluation of no sql databasesExperimental evaluation of no sql databases
Experimental evaluation of no sql databasesijdms
 
Why no sql_ibm_cloudant
Why no sql_ibm_cloudantWhy no sql_ibm_cloudant
Why no sql_ibm_cloudantPeter Tutty
 
NoSQL Databases Introduction - UTN 2013
NoSQL Databases Introduction - UTN 2013NoSQL Databases Introduction - UTN 2013
NoSQL Databases Introduction - UTN 2013Facundo Farias
 

Similar to Oracle no sql database bigdata (20)

Polyglot Database - Linuxcon North America 2016
Polyglot Database - Linuxcon North America 2016Polyglot Database - Linuxcon North America 2016
Polyglot Database - Linuxcon North America 2016
 
No sql database
No sql databaseNo sql database
No sql database
 
MySQL Document Store for Modern Applications
MySQL Document Store for Modern ApplicationsMySQL Document Store for Modern Applications
MySQL Document Store for Modern Applications
 
Non relational databases-no sql
Non relational databases-no sqlNon relational databases-no sql
Non relational databases-no sql
 
SQL vs NoSQL deep dive
SQL vs NoSQL deep diveSQL vs NoSQL deep dive
SQL vs NoSQL deep dive
 
Unit 3 MongDB
Unit 3 MongDBUnit 3 MongDB
Unit 3 MongDB
 
No sql databases explained
No sql databases explainedNo sql databases explained
No sql databases explained
 
1. introduction to no sql
1. introduction to no sql1. introduction to no sql
1. introduction to no sql
 
NoSQL Basics and MongDB
NoSQL Basics and  MongDBNoSQL Basics and  MongDB
NoSQL Basics and MongDB
 
Unit II -BIG DATA ANALYTICS.docx
Unit II -BIG DATA ANALYTICS.docxUnit II -BIG DATA ANALYTICS.docx
Unit II -BIG DATA ANALYTICS.docx
 
Sql vs NoSQL-Presentation
 Sql vs NoSQL-Presentation Sql vs NoSQL-Presentation
Sql vs NoSQL-Presentation
 
Vskills Apache Cassandra sample material
Vskills Apache Cassandra sample materialVskills Apache Cassandra sample material
Vskills Apache Cassandra sample material
 
NoSQL
NoSQLNoSQL
NoSQL
 
Brief introduction to NoSQL by fas mosleh
Brief introduction to NoSQL by fas moslehBrief introduction to NoSQL by fas mosleh
Brief introduction to NoSQL by fas mosleh
 
Introduction to NoSQL
Introduction to NoSQLIntroduction to NoSQL
Introduction to NoSQL
 
CS828 P5 Individual Project v101
CS828 P5 Individual Project v101CS828 P5 Individual Project v101
CS828 P5 Individual Project v101
 
Experimental evaluation of no sql databases
Experimental evaluation of no sql databasesExperimental evaluation of no sql databases
Experimental evaluation of no sql databases
 
Why no sql_ibm_cloudant
Why no sql_ibm_cloudantWhy no sql_ibm_cloudant
Why no sql_ibm_cloudant
 
NoSQL Databases Introduction - UTN 2013
NoSQL Databases Introduction - UTN 2013NoSQL Databases Introduction - UTN 2013
NoSQL Databases Introduction - UTN 2013
 
SQL & NoSQL
SQL & NoSQLSQL & NoSQL
SQL & NoSQL
 

More from João Gabriel Lima

Deep marketing - Indoor Customer Segmentation
Deep marketing - Indoor Customer SegmentationDeep marketing - Indoor Customer Segmentation
Deep marketing - Indoor Customer SegmentationJoão Gabriel Lima
 
Aplicações de Alto Desempenho com JHipster Full Stack
Aplicações de Alto Desempenho com JHipster Full StackAplicações de Alto Desempenho com JHipster Full Stack
Aplicações de Alto Desempenho com JHipster Full StackJoão Gabriel Lima
 
Realidade aumentada com react native e ARKit
Realidade aumentada com react native e ARKitRealidade aumentada com react native e ARKit
Realidade aumentada com react native e ARKitJoão Gabriel Lima
 
Big data e Inteligência Artificial
Big data e Inteligência ArtificialBig data e Inteligência Artificial
Big data e Inteligência ArtificialJoão Gabriel Lima
 
Mineração de Dados no Weka - Regressão Linear
Mineração de Dados no Weka -  Regressão LinearMineração de Dados no Weka -  Regressão Linear
Mineração de Dados no Weka - Regressão LinearJoão Gabriel Lima
 
Mineração de dados com RapidMiner + WEKA - Clusterização
Mineração de dados com RapidMiner + WEKA - ClusterizaçãoMineração de dados com RapidMiner + WEKA - Clusterização
Mineração de dados com RapidMiner + WEKA - ClusterizaçãoJoão Gabriel Lima
 
Mineração de dados na prática com RapidMiner e Weka
Mineração de dados na prática com RapidMiner e WekaMineração de dados na prática com RapidMiner e Weka
Mineração de dados na prática com RapidMiner e WekaJoão Gabriel Lima
 
Visualizacao de dados - Come to the dark side
Visualizacao de dados - Come to the dark sideVisualizacao de dados - Come to the dark side
Visualizacao de dados - Come to the dark sideJoão Gabriel Lima
 
REST x SOAP : Qual abordagem escolher?
REST x SOAP : Qual abordagem escolher?REST x SOAP : Qual abordagem escolher?
REST x SOAP : Qual abordagem escolher?João Gabriel Lima
 
Game of data - Predição e Análise da série Game Of Thrones a partir do uso de...
Game of data - Predição e Análise da série Game Of Thrones a partir do uso de...Game of data - Predição e Análise da série Game Of Thrones a partir do uso de...
Game of data - Predição e Análise da série Game Of Thrones a partir do uso de...João Gabriel Lima
 
E-trânsito cidadão - IPVA em suas mãos
E-trânsito cidadão - IPVA em suas mãosE-trânsito cidadão - IPVA em suas mãos
E-trânsito cidadão - IPVA em suas mãosJoão Gabriel Lima
 
[Estácio - IESAM] Automatizando Tarefas com Gulp.js
[Estácio - IESAM] Automatizando Tarefas com Gulp.js[Estácio - IESAM] Automatizando Tarefas com Gulp.js
[Estácio - IESAM] Automatizando Tarefas com Gulp.jsJoão Gabriel Lima
 
Hackeando a Internet das Coisas com Javascript
Hackeando a Internet das Coisas com JavascriptHackeando a Internet das Coisas com Javascript
Hackeando a Internet das Coisas com JavascriptJoão Gabriel Lima
 
Mercado de Trabalho em Computação - Perfil Analista de Sistemas e Programador
Mercado de Trabalho em Computação - Perfil Analista de Sistemas e ProgramadorMercado de Trabalho em Computação - Perfil Analista de Sistemas e Programador
Mercado de Trabalho em Computação - Perfil Analista de Sistemas e ProgramadorJoão Gabriel Lima
 
[Fapan] criando aplicações mobile híbridas com ionic framework
[Fapan] criando aplicações mobile híbridas com ionic framework[Fapan] criando aplicações mobile híbridas com ionic framework
[Fapan] criando aplicações mobile híbridas com ionic frameworkJoão Gabriel Lima
 
Criando Aplicações Mobile Híbridas com Ionic Framework
Criando Aplicações Mobile Híbridas com Ionic FrameworkCriando Aplicações Mobile Híbridas com Ionic Framework
Criando Aplicações Mobile Híbridas com Ionic FrameworkJoão Gabriel Lima
 
Codifique uma vez, distribua em Android, iOS e Windows Phone
Codifique uma vez, distribua em Android, iOS e Windows PhoneCodifique uma vez, distribua em Android, iOS e Windows Phone
Codifique uma vez, distribua em Android, iOS e Windows PhoneJoão Gabriel Lima
 
Desbravando o mundo dos webcrawlers
Desbravando o mundo dos webcrawlersDesbravando o mundo dos webcrawlers
Desbravando o mundo dos webcrawlersJoão Gabriel Lima
 

More from João Gabriel Lima (20)

Cooking with data
Cooking with dataCooking with data
Cooking with data
 
Deep marketing - Indoor Customer Segmentation
Deep marketing - Indoor Customer SegmentationDeep marketing - Indoor Customer Segmentation
Deep marketing - Indoor Customer Segmentation
 
Aplicações de Alto Desempenho com JHipster Full Stack
Aplicações de Alto Desempenho com JHipster Full StackAplicações de Alto Desempenho com JHipster Full Stack
Aplicações de Alto Desempenho com JHipster Full Stack
 
Realidade aumentada com react native e ARKit
Realidade aumentada com react native e ARKitRealidade aumentada com react native e ARKit
Realidade aumentada com react native e ARKit
 
JS - IA
JS - IAJS - IA
JS - IA
 
Big data e Inteligência Artificial
Big data e Inteligência ArtificialBig data e Inteligência Artificial
Big data e Inteligência Artificial
 
Mineração de Dados no Weka - Regressão Linear
Mineração de Dados no Weka -  Regressão LinearMineração de Dados no Weka -  Regressão Linear
Mineração de Dados no Weka - Regressão Linear
 
Mineração de dados com RapidMiner + WEKA - Clusterização
Mineração de dados com RapidMiner + WEKA - ClusterizaçãoMineração de dados com RapidMiner + WEKA - Clusterização
Mineração de dados com RapidMiner + WEKA - Clusterização
 
Mineração de dados na prática com RapidMiner e Weka
Mineração de dados na prática com RapidMiner e WekaMineração de dados na prática com RapidMiner e Weka
Mineração de dados na prática com RapidMiner e Weka
 
Visualizacao de dados - Come to the dark side
Visualizacao de dados - Come to the dark sideVisualizacao de dados - Come to the dark side
Visualizacao de dados - Come to the dark side
 
REST x SOAP : Qual abordagem escolher?
REST x SOAP : Qual abordagem escolher?REST x SOAP : Qual abordagem escolher?
REST x SOAP : Qual abordagem escolher?
 
Game of data - Predição e Análise da série Game Of Thrones a partir do uso de...
Game of data - Predição e Análise da série Game Of Thrones a partir do uso de...Game of data - Predição e Análise da série Game Of Thrones a partir do uso de...
Game of data - Predição e Análise da série Game Of Thrones a partir do uso de...
 
E-trânsito cidadão - IPVA em suas mãos
E-trânsito cidadão - IPVA em suas mãosE-trânsito cidadão - IPVA em suas mãos
E-trânsito cidadão - IPVA em suas mãos
 
[Estácio - IESAM] Automatizando Tarefas com Gulp.js
[Estácio - IESAM] Automatizando Tarefas com Gulp.js[Estácio - IESAM] Automatizando Tarefas com Gulp.js
[Estácio - IESAM] Automatizando Tarefas com Gulp.js
 
Hackeando a Internet das Coisas com Javascript
Hackeando a Internet das Coisas com JavascriptHackeando a Internet das Coisas com Javascript
Hackeando a Internet das Coisas com Javascript
 
Mercado de Trabalho em Computação - Perfil Analista de Sistemas e Programador
Mercado de Trabalho em Computação - Perfil Analista de Sistemas e ProgramadorMercado de Trabalho em Computação - Perfil Analista de Sistemas e Programador
Mercado de Trabalho em Computação - Perfil Analista de Sistemas e Programador
 
[Fapan] criando aplicações mobile híbridas com ionic framework
[Fapan] criando aplicações mobile híbridas com ionic framework[Fapan] criando aplicações mobile híbridas com ionic framework
[Fapan] criando aplicações mobile híbridas com ionic framework
 
Criando Aplicações Mobile Híbridas com Ionic Framework
Criando Aplicações Mobile Híbridas com Ionic FrameworkCriando Aplicações Mobile Híbridas com Ionic Framework
Criando Aplicações Mobile Híbridas com Ionic Framework
 
Codifique uma vez, distribua em Android, iOS e Windows Phone
Codifique uma vez, distribua em Android, iOS e Windows PhoneCodifique uma vez, distribua em Android, iOS e Windows Phone
Codifique uma vez, distribua em Android, iOS e Windows Phone
 
Desbravando o mundo dos webcrawlers
Desbravando o mundo dos webcrawlersDesbravando o mundo dos webcrawlers
Desbravando o mundo dos webcrawlers
 

Recently uploaded

Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CVKhem
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...apidays
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Igalia
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Servicegiselly40
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfEnterprise Knowledge
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxMalak Abu Hammad
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking MenDelhi Call girls
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?Antenna Manufacturer Coco
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)wesley chun
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessPixlogix Infotech
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfsudhanshuwaghmare1
 

Recently uploaded (20)

Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your Business
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 

Oracle no sql database bigdata

  • 1. An Oracle White Paper September 2011 Oracle NoSQL Database
  • 2. Oracle NoSQL Database Introduction NoSQL databases represent a recent evolution in enterprise application architecture, continuing the evolution of the past twenty years. In the 1990’s, vertically integrated applications gave way to client-server architectures, and more recently, client-server architectures gave way to three-tier web application architectures. In parallel, the demands of web-scale data analysis added map-reduce processing into the mix and data architects started eschewing transactional consistency in exchange for incremental scalability and large-scale distribution. The NoSQL movement emerged out of this second ecosystem. NoSQL is often characterized by what it’s not – depending on whom you ask, it’s either not only a SQL-based relational database management system or it’s simply not a SQL-based RDBMS. While those definitions explain what NoSQL is not, they do little to explain what NoSQL is. Consider the fundamentals that have guided data management for the past forty years. RDBMS systems and large-scale data management have been characterized by the transactional ACID properties of Atomicity, Consistency, Isolation, and Durability. In contrast, NoSQL is sometimes characterized by the BASE acronym: Basically Available: Use replication to reduce the likelihood of data unavailability and use sharding, or partitioning the data among many different storage servers, to make any remaining failures partial. The result is a system that is always available, even if subsets of the data become unavailable for short periods of time. Soft state: While ACID systems assume that data consistency is a hard requirement, NoSQL systems allow data to be inconsistent and relegate designing around such inconsistencies to application developers. Eventually consistent: Although applications must deal with instantaneous consistency, NoSQL systems ensure that at some future point in time the data assumes a consistent state. In contrast to ACID systems that enforce consistency at transaction commit, NoSQL guarantees consistency only at some undefined future time. NoSQL emerged as companies, such as Amazon, Google, LinkedIn and Twitter struggled to deal with unprecedented data and operation volumes under tight latency constraints. Analyzing high-volume, real time data, such as web-site click streams, provides significant business advantage by harnessing unstructured and semi-structured data sources to create more business value. Traditional relational databases were not up to the task, so enterprises built upon a decade of research on distributed hash tables (DHTs) and either conventional relational database systems or embedded key/value stores, such as Oracle’s Berkeley DB, to develop highly available, distributed key-value stores. 1
  • 3. Oracle NoSQL Database Although some of the early NoSQL solutions built their systems atop existing relational database engines, they quickly realized that such systems were designed for SQL-based access patterns and latency demands that are quite different from those of NoSQL systems, so these same organizations began to develop brand new storage layers. In contrast, Oracle’s Berkeley DB product line was the original key/value store; Oracle Berkeley DB Java Edition has been in commercial use for over eight years. By using Oracle Berkeley DB Java Edition as the underlying storage engine beneath a NoSQL system, Oracle brings enterprise robustness and stability to the NoSQL landscape. Furthermore, until recently, integrating NoSQL solutions with an enterprise ",=! application architecture required manual =)?1! =$600>! :;@)3! integration and custom development; :;@)3! DBE! Oracle’s NoSQL Database provides all A$>B'6C%'! !!! !"#$%&'! ! the desirable features of NoSQL solutions /012,!)3! ! ! necessary for seamless integration into "#$%&'! ! :;@)3! an enterprise application architecture. )$7$!:;7'<#$70#! ! AF;F;<! Figure 1 shows a canonical acquire- !!! ! !"#$%&'! organize-analyze data cycle, !!"#$%&'! !!"#$%&'! !!"#$%&'!()*+! )$7$8$9'! (",-.+! 45$6$7$! demonstrating how Oracle’s NoSQL "3:!44! Database fits into such an ecosystem. Oracle-provided adapters allow the Oracle NoSQL Database to integrate with !"#$%&'( .&/*)%-'( !)*+,-'( a Hadoop MapReduce framework or with the Oracle Database in-database ?F<C#'!GH!!"#$%&'!/012,!)$7$8$9'!!F;7'<#$7'9!9'$I&'99&J!F;70!7K'!6$7$! MapReduce, Data Mining, R-based I$;$<'I';7!'%09J97'IL! analytics, or whatever business needs demand. The Oracle NoSQL Database, with its “No Single Point of Failure” architecture is the right solution when data access is “simple” in nature and application demands exceed the volume or latency capability of traditional data management solutions. For example, click-stream data from high volume web sites, high-throughput event processing, and social networking communications all represent application domains that produce extraordinary volumes of simple keyed data. Monitoring online retail behavior, accessing customer profiles, pulling up appropriate customer ads and storing and forwarding real-time communication are examples of domains requiring the ultimate in low-latency access. Highly distributed applications such as real-time sensor aggregation and scalable authentication also represent domains well-suited to Oracle NoSQL Database. 2
  • 4. Oracle NoSQL Database Technical Overview Oracle NoSQL Database leverages the Oracle Berkeley DB Java Edition High Availability storage engine to provide distributed, highly-available key/value storage for large-volume, latency-sensitive applications or web services. It can also provide fast, reliable, distributed storage to applications that need to integrate with ETL processing. Data Model In its simplest form, Oracle NoSQL Database implements a map from user-defined keys (Strings) to opaque data items. It records version numbers for key/data pairs, but maintains the single latest version in the store. Applications need never worry about reconciling incompatible versions because Oracle NoSQL Database uses single-master replication; the master node always has the most up-to- date value for a given key, while read-only replicas might have slightly older versions. Applications can use version numbers to ensure consistency for read-modify-write operations. Oracle NoSQL Database hashes keys to provide good distribution over a collection of computers that provide storage for the database. However, applications can take advantage of subkey capabilities to achieve data locality. A key is the concatenation of a Major Key Path and a Minor Key Path, both of which are specified by the application. All records sharing a Major Key Path are co-located to achieve data- locality. Within a co-located collection of Major Key Paths, the full key, comprised of both the Major and Minor Key Paths, provides fast, indexed lookup. For example, an application storing user profiles might use the profile-name as a Major Key Path and then have several Minor Key Paths for different components of that profile such as email address, name, phone number, etc. Because applications have complete control over the composition and interpretation of keys, different Major Key Paths can have entirely different Minor Key Path structures. Continuing our previous example, one might store user profiles and application profiles in the same store and maintain different Minor Key Paths for each. Prefix key compression makes storage of key groups efficient. While many NoSQL databases state that they provide eventual consistency, Oracle NoSQL Database provides several different consistency policies. At one end of the spectrum, applications can specify absolute consistency, which guarantees that all reads return the most recently written value for a designated key. At the other end of the spectrum, applications capable of tolerating inconsistent data can specify weak consistency, allowing the database to return a value efficiently even if it is not entirely up to date. In between these two extremes, applications can specify time-based consistency to constrain how old a record might be or version-based consistency to support both atomicity for read-modify-write operations and reads that are at least as recent as the *+,-,#&''$) C-='%&-@' 5-6-'"8'%&.&=6'-8' ,;&%-F=#'1='G=1H=' ,;&%-F=#'1='6+&' 86-/&'@-6-' 1D'-'#"E&='F<&' 1%'/-6&%'E&%8"1=' <186'%&.&=6'E&%8"1=' specified version. Figure 2 shows how the range of flexible consistency policies enables developers to !"#$%$&'#!() 01=&' *"<&>?-8&@' A&%8"1=>?-8&@' B781/$6&' easily create business solutions providing data !"#$%&'()'*+&',%-./&'01234'5-6-7-8&'!/&9"7"/"6:'2;&.6%$<' guarantees while meeting application latency and scalability requirements. Oracle NoSQL Database also provides a range of durability policies that specify what guarantees the system makes after a crash. At one extreme, applications can request that write requests block until the record has been written to stable storage on all copies. This has obvious performance and availability implications, but ensures that if the application successfully writes data, that data will persist and can be recovered even if all the copies become temporarily unavailable due to multiple simultaneous failures. At the other extreme, applications can request that write operations return as soon as the system has .3
  • 5. Oracle NoSQL Database recorded the existence of the write, even if the data is not persistent anywhere. Such a policy provides the best write performance, but provides no durability guarantees. By specifying when the database writes records to disk and what fraction of the copies of the record must be persistent (none, all, or a simple majority), applications can enforce a wide range of durability policies. API Incorporating Oracle NoSQL Database into applications is straightforward. APIs for basic Create, Read, Update and Delete (CRUD) operations and a collection of iterators are packaged in a single jar file. Applications can use the APIs from one or more client processes that access a stand-alone Oracle NoSQL Database server process, alleviating the need to set up multi-system configurations for initial development and testing. Create, Remove, Update, and Delete Data create and update operations are provided by several put methods. The putIfAbsent method implements creation while the putIfPresent method implements update. The put method does both, adding a new key/value pair if the key is not currently present in the database or updating the value if the key does exist. Updating a key/value pair generates a new version of the pair, so the API also includes a conditional putIfVersion method that allows applications to implement consistent read- modify-write semantics. The delete and deleteIfVersion methods unconditionally and conditionally remove key/value pairs from the database, respectively. Just as putIfVersion ensures read-modify-write semantics, deleteIfVersion provides deletion of a specific version. The get method retrieves items from the database. The code sample below demonstrates the use of the various CRUD APIs. All code samples assume that you have already opened an Oracle NoSQL Database, referenced by the variable store. CRUD Examples // Put a new key/value pair in the database, if key not already present. Key key = Key.createKey("Katana"); String valString = "sword"; store.putIfAbsent(key, Value.createValue(valString.getBytes())); // Read the value back from the database. ValueVersion retValue = store.get(key); // Update this item, only if the current version matches the version I read. // In conjunction with the previous get, this implements a read-modify-write String newvalString = "Really nice sword"; Value newval = Value.createValue(newvalString.getBytes()); store.putIfVersion(key, newval, retValue.getVersion()); // Finally, (unconditionally) delete this key/value pair from the database. store.delete(key); .4
  • 6. Oracle NoSQL Database Iteration In addition to basic CRUD operations, Oracle NoSQL Database supports two types of iteration: unordered iteration over records and ordered iteration within a Major Key set. In the case of unordered iteration over the entire store, the result is not transactional; the iteration runs at an isolation level of read-committed, which means that the result set will contain only key/value pairs that have been persistently written to the database, but there are no guarantees of semantic consistency across key/value pairs. The API supports both individual key/value returns using several storeIterator methods and bulk key/value returns within a Major Key Path via the various multiGetIterator methods. The example below demonstrates iterating over an entire store, returning each key/value pair individually. Note that although the iterator returns only a single key/value pair at a time, the storeIterator method takes a second parameter of batchSize, indicating how many key/value pairs to fetch per network round trip. This allows applications to simultaneously use network bandwidth efficiently, while maintaining the simplicity of key-at-a-time access in the API. Unordered Iteration Example // Create Iterator. Iterator<KeyValueVersion> iter = store.storeIterator(Direction.UNORDERED, 100); // Now, iterate over the store. while (iter.hasNext()) { KeyValueVersion keyVV = iter.next(); Value val = keyVV.getValue(); Key key = keyVV.getKey(); System.out.println(val.toString() + " " + key.toString() + "n"); } Bulk Operation API In addition to providing single-record operations, Oracle NoSQL Database supports the ability to bundle a collection of operations together using the execute method, providing transactional semantics across multiple updates on records with the same Major Key Path. For example, let’s assume that we have the Major Key Path “Katana” from the previous example, with several different Minor Key Paths, containing attributes of the Katana, such as length and year of construction. Imagine that we discover that we have an incorrect length and year of construction currently in our store. We can update multiple records atomically using a sequence of operations as shown below. .5
  • 7. Oracle NoSQL Database Example of Wrapping a Sequence of Operations in a Transaction // Create a sequence of operations. OperationFactory of = store.getOperationFactory(); List<Operation> opList = new ArrayList<Operation>(); // Create major and minor path components. List<String> majorComponents = new ArrayList<String>(); List<String> minorLength = new ArrayList<String>(); List<String> minorYear = new ArrayList<String>(); majorComponents.add(“Katana”); minorLength .add(“length”); minorYear.add(“year”); Key key1 = Key.createKey(majorComponents, minorLength); Key key2 = Key.createKey(majorComponents, minorYear); // Now put operations in an opList. String lenVal = “37”; String yearVal = “1454”; opList.add(of.createPut(key1, Value.createValue(lenVal.getBytes()))); opList.add(of.createPut(key2, Value.createValue(yearVal.getBytes()))); // Now execute the operation list. store.execute(opList); Administration Oracle NoSQL Database comes with an Administration Service, accessible from both a command line interface and a web console. Using the service, administrators can configure a database instance, start it, stop it, and monitor system performance, without manually editing configuration files, writing shell scripts, or performing explicit database operations. The Administration Service is itself a highly-available service, but consistent with the Oracle NoSQL Database “No Single Point of Failure” philosophy, the ongoing operation of an installation is not dependent upon the availability of the Administration Service. Thus, both the database and the Administration Service remain available during configuration changes. In addition to facilitating configuration changes, the Administration Service also collects and maintains performance statistics and logs important system events, providing online monitoring and input to performance tuning. .6
  • 8. Oracle NoSQL Database Architecture We present the Oracle NoSQL Database architecture by following the execution of an operation through the logical components of the system and then discussing how those components map to actual hardware and software operation. We will create the key/value pair “Katana” and “sword”. Figure 3 depicts the method invocation putIfAbsent(“Katana”, “sword”)1. The application issues the putIfAbsent method to the Client Driver (step 1). !""#$%&'()* The client driver hashes the key “Katana” to select one of a fixed +,-.*!/0* 12*"3405!678)49:;&4&)&<=*7>(?@A* number of partitions (step 2). The number of partitions is fixed and set B2*/&?''()*C*D&7D9:;&4&)&<A* P#$8)4*-?$Q8?* by an administrator at system I2* configuration time and is chosen to be !"#$% &'($% )$*("+*$% ,"*-% significantly larger than the maximum FK1L)1* M&748?* NO* H,LB1* J2*F8"#$%&'()*G?(3"* F8"#$%&'()* /&?''()* E&6#8* G?(3"*H4&48* N* FK1L)B* F8"#$%&* 1OO* H,LBB* number of storage nodes expected in E&6#8* FK1L)I* F8"#$%&* 1O* H,LI* the store. In this example, our store contains 25 storage nodes, so we R2*"3405!678)49:;&4&)&<=*7>(?@A* might have configured the system to H,L1* H,L1* S* H,L1* H,L1* H,L1* H,L1* H,L1* H,L1* H,L1* H,L1* have 25,000 partitions. Each partition H,L1* H,L1* H,L1* H,L1* H,L1* H,L1* H,L1* H,L1* H,L1* H,L1* is assigned to a particular replication H,L1* H,LB* H,LI* H,LJ* H,LN* group. The driver consults the H4(?&K8*,(@87* partition table (step 3) to map the T$K3?8*IU*F8V3874*/?(%877$)K* partition number to a replication group. A replication group consists of some (configurable) number of replication nodes. Every replication group consists of the same number of replication nodes. The number of replication nodes in a replication group dictates the number of failures from which the system is resilient; a system with three nodes per group can withstand two failures while continuing to service read requests. Its ability to withstand failures on writes is based upon the configured durability policy. If the application does not require a majority of participants to acknowledge a write, then the system can also withstand up to two failures for writes. A five-node group can withstand up to four failures for reads and up to two failures for writes, even if the application demands a durability policy requiring a majority of sites to acknowledge a write operation. Given a replication group, the Client Driver next consults the Replication Group State Table (RGST) (step 4). For each replication group, the RGST contains information about each replication node comprising the group (step 5). Based upon information in the RGST, such as the identity of the master and the load on the various nodes in a replication group, the Client Driver selects the node to which to send the request and forwards the request to the appropriate node (step 6). In this case, since we are issuing a write operation, the request must go to the master node. 1 Although the API takes byte[] as Values, for convenience, we are showing the values as Strings. .7
  • 9. Oracle NoSQL Database The replication node then applies the operation. In the case of a putIfAbsent, if the key exists, the operation has no effect and returns an error, indicating that the specified entry is already present in the store. If the key does not exist, the replication node adds the key/value pair to the store and then propagates the new key/value pair to the other nodes in the replication group (step 7). Implementation An Oracle NoSQL Database installation consists of two major pieces: a Client Driver and a collection of Storage Nodes. As shown in Figure 3, the client driver implements the partition map and the RGST, while storage nodes implement the replication nodes comprising replication groups. In this section, we’ll take a closer look at each of these components. Storage Nodes A storage node (SN) is typically a physical machine with its own local persistent storage, either disk or solid state, a CPU with one or more cores, memory, and an IP address. A system with more storage nodes will provide greater aggregate throughput or storage capacity than one with fewer nodes, and systems with a greater degree of replication in replication groups can provide decreased request latency over installations with smaller degrees of replication. A Storage Node Agent (SNA) runs on each storage node, monitoring that node’s behavior. The SNA both receives configuration from and reports monitoring information to the Administration Service, which interfaces to the Oracle NoSQL Database monitoring dashboard. The SNA collects operational data from the storage node on an ongoing basis and then delivers it to the Administration Service when asked for it. A storage node serves one or more replication nodes. Each replication node belongs to a single replication group. The nodes in a single replication group all serve the same data. Each group has a designated master node that handles all data modification operations (create, update, and delete). The other nodes are read-only replicas, but may assume the role of master should the master node fail. A typical installation uses a replication factor of three in the replication groups, to ensure that the system can survive at least two simultaneous faults and still continue to service read operations. Applications requiring greater or lesser reliability can adjust this parameter accordingly. =(=(=( =(=(=( 3'4(!5%6'( 7%+8(9:;6<#;( Figure 4 shows an installation with 30 replication groups (0-29). Each replication group has a replication factor of 3 (one master and two replicas) spread across two data centers. I%$<<#;():JK'$+( =(=(=( =(=(=( =(=(=( =(=(=( Note that we place two of the replication nodes >'5?@6%<#;(&$#:5(A( >'5?@6%<#;(&$#:5(.B( in the larger of the two data centers and the last replication node in the smaller one. This sort of >'5()#*'( C%+"'$( >'5()#*'( >'5?@6%( >'5()#*'( >'5?@6%( >'5()#*'( C%+"'$( >'5()#*'( >'5?@6%( >'5()#*'( >'5?@6%( arrangement might be appropriate for an application that uses the larger data center for its !),-( !),-( !),-( !),-( !),-( !),-( !),-( !),-( !),-( !),-( !),-( !),-( primary data access, maintaining the smaller data center in case of catastrophic failure of the !),-( !),-( !),-( !),-( !),-( !),-( !),-( !),-( !),-( !),-( !),-( !),-( !),-( !),.( !),/( !),0( !),1( !),2( D%"%(E';"'$(-( D%"%(E';"'$(.( primary data center. The 30 replication groups !"#$%&'()#*'+( are stored on 30 storage nodes, spread across F@&:$'(0G(H$68@"'6":$'( the two data centers. .8
  • 10. Oracle NoSQL Database Replication nodes support the Oracle NoSQL Database API via RMI calls from the client and obtain data directly from or write data directly to the log-structured storage system, which provides outstanding write performance, while maintaining index structures that provide low-latency read performance as well. The Oracle NoSQL Database storage engine pioneered the use of log-structured storage in key/value databases since its initial deployment in 2003 and has been proven in several open-source NoSQL solutions, such as Dynamo, Voldemort, and GenieDB, as well as in Enterprise deployments. Oracle NoSQL Database uses replication to ensure data availability in the case of failure. Its single- master architecture requires that writes are applied at the master node and then propagated to the replicas. In the case of failure of the master node, the nodes in a replication group automatically hold a reliable election (using the Paxos protocol), electing one of the remaining nodes to be the master. The new master then assumes write responsibility. Client Driver The client driver is a Java jar file that exports the API to applications. In addition, the client driver maintains a copy of the Topology and the Replication Group State Table (RGST). The Topology efficiently maps keys to partitions and from partitions to replication groups. For each replication group, it includes the host name of the storage node hosting each replication node in the group, the service name associated with the replication nodes, and the data center in which each storage node resides. The client then uses the RGST for two primary purposes: identifying the master node of a replication group, so that it can send write requests to the master, and load balancing across all the nodes in a replication group for reads. Since the RGST is a critical shared data structure, each client and replication node maintains its own copy, thus avoiding any single point of failure. Both clients and replication nodes run a RequestDispatcher that use the RGST to (re)direct write requests to the master and read requests to the appropriate member of a replication group. The Topology is loaded during client or replication node initialization and can subsequently be updated by the administrator if there are Topology changes. The RGST is dynamic, requiring ongoing maintenance. Each replication node runs a thread, called the Replication Node State Update thread, that is responsible for ongoing maintenance of the RGST. The update thread, as well as the RequestDispatcher, opportunistically collect information on remote replication nodes including the current state of the node in its replication group, an indication of how up-to-date the node is, the time of the last successful interaction with the node, the node’s trailing average response time, and the current length of its outstanding request queue. In addition, the update thread maintains network connections and re- establishes broken ones. This maintenance is done outside the RequestDispatcher’s request/response cycle to minimize the impact of broken connections on latency. Performance We have experimented with various Oracle NoSQL Database configurations and present a few performance results of the Yahoo! Cloud Serving Benchmark (YCSB), demonstrating how the system scales with the number of nodes in the system. As with all performance measurements, your mileage may vary. We applied a constant YCSB load per storage node to configurations of varying sizes. Each storage node was comprised of a 2.93ghz Westmere 5670 dual socket machine with 6 cores/socket and 24GB of memory. Each machine had a 300GB local disk and ran RedHat 2.6.18-164.11.1.el5.crt1. At 300 .9
  • 11. Oracle NoSQL Database GB, the disk size is the scale-limiting resource on each node, dictating the overall configuration, so we configured each node to hold 100M records, with an average key size of 13 bytes and data size of 1108 bytes. The graph to the left shows the raw insert performance of Oracle NoSQL Database for configurations ranging from a single replication group system with three nodes storing 100 million records to a system with 32 replication groups on 96 nodes storing 2.1 billion records (the YCSB benchmark is limited to a maximum of 2.1 billion records). The graph shows both the throughput in operations per second (blue line and left axis) and the response time in milliseconds (red line and right axis). Throughput of the system scales almost linearly as the database size and number of replication groups grows, with only a modest increase in response time. The second graph shows the throughput and response time for a workload of 50% reads and 50% updates. As the system grows in size (both data size and number of replication groups), we see both the update and read latency decline, while throughput scales almost linearly, delivering the scalability needed for today’s demanding applications. Conclusion Oracle’s NoSQL Database brings enterprise quality storage and performance to the highly-available, widely distributed NoSQL environment. Its commercially proven, write-optimized storage system delivers outstanding performance as well as robustness and reliability, and its “No Single Point of Failure” design ensures that the system continues to run and data remain available after any failure. .10
  • 12. Oracle NoSQL Database Copyright © 2011, Oracle and/or its affiliates. All rights reserved. This document is provided for information purposes only and the September 2011 contents hereof are subject to change without notice. This document is not warranted to be error-free, nor subject to any other warranties or conditions, whether expressed orally or implied in law, including implied warranties and conditions of merchantability or fitness for a particular purpose. We specifically disclaim any liability with respect to this document and no contractual obligations are Oracle Corporation formed either directly or indirectly by this document. This document may not be reproduced or transmitted in any form or by any World Headquarters means, electronic or mechanical, for any purpose, without our prior written permission. 500 Oracle Parkway Redwood Shores, CA 94065 Oracle and Java are registered trademarks of Oracle and/or its affiliates. Other names may be trademarks of their respective owners. U.S.A. Worldwide Inquiries: AMD, Opteron, the AMD logo, and the AMD Opteron logo are trademarks or registered trademarks of Advanced Micro Devices. Phone: +1.650.506.7000 Intel and Intel Xeon are trademarks or registered trademarks of Intel Corporation. All SPARC trademarks are used under license Fax: +1.650.506.7200 and are trademarks or registered trademarks of SPARC International, Inc. UNIX is a registered trademark licensed through X/Open Company, Ltd. 1010 oracle.com