With technological development, many institutions and companies have been producing large volumes of structured and unstructured data. Special databases are needed to deal with these data, and thus NoSQL databases emerged. They are widely used in cloud databases and distributed systems. In the era of big data, these databases provide a scalable, highly available solution, and new architectures are needed to store ever more kinds of data. To arrive at a good structure for large and diverse data, the structure must be tested and analyzed in depth using different benchmark tools. In this paper, we experiment with the Riak key-value database to measure its performance in terms of throughput and latency, where huge amounts of data of different sizes are stored and retrieved in a distributed database environment. The throughput and latency of the NoSQL database are compared over different types of experiments and different data sizes, and the results are discussed.
Analysis and evaluation of Riak KV cluster environment using Basho Bench (StevenChike)
• To monitor the performance of Riak KV
(throughput and latency) during read,
write, and update operations.
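As a rough illustration of what measuring throughput and latency for read, write, and update operations involves, the following minimal sketch times each operation against a plain in-memory dictionary. The dictionary is only a stand-in for a key-value store; it is not Riak KV and not Basho Bench, and the names `run_benchmark`, `store`, and `keys` as well as the 100-byte value size are assumptions made for this example.

```python
import statistics
import time


def run_benchmark(store, operation, keys, value=b"x" * 100):
    """Time each operation; return throughput (ops/s) and latency stats (ms)."""
    latencies = []
    start = time.perf_counter()
    for key in keys:
        t0 = time.perf_counter()
        if operation == "write":
            store[key] = value
        elif operation == "read":
            store.get(key)
        elif operation == "update":
            # Read-modify-write, the usual shape of an update in a KV store.
            store[key] = store.get(key, b"") + b"!"
        else:
            raise ValueError(f"unknown operation: {operation}")
        latencies.append((time.perf_counter() - t0) * 1000.0)
    elapsed = time.perf_counter() - start
    return {
        "operation": operation,
        "ops": len(latencies),
        "throughput_ops_per_s": len(latencies) / elapsed,
        "mean_latency_ms": statistics.mean(latencies),
        "max_latency_ms": max(latencies),
    }


# Hypothetical stand-in for a key-value store (assumption: a plain dict).
store = {}
keys = [f"key-{i}" for i in range(1000)]
results = [run_benchmark(store, op, keys) for op in ("write", "read", "update")]
for r in results:
    print(r)
```

Throughput here is the number of completed operations divided by wall-clock time, while latency is recorded per operation, which is why the two metrics can move independently under load.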
The rest of this paper is organized as follows:
Section 2 presents background and basic concepts.
Section 3 takes a deeper look at related works. Section
4 provides an overview of the Riak KV NoSQL database
system and its infrastructure. Section 5 is about
Basho Bench benchmarking of Riak KV. Section 6
presents the experiment environment for testing the
Riak KV NoSQL database with Basho Bench and
provides our experimental results and
discussion. Section 7 concludes the paper.
2 BACKGROUND AND BASIC CONCEPTS
In this section, the basic concepts related to big
data and NoSQL database properties are introduced.
The challenges associated with big data and NoSQL are
also introduced.
2.1 Big data
In this part, we describe the term big data, which is
closely related to NoSQL database systems. Big data
can be defined as the capability of managing a huge
volume of data within the right time and at the proper speed.
Big data is an evolving term that describes any
voluminous amount of structured, semi-structured, and
unstructured data that has the potential to be mined for
information and that cannot be managed using relational
database management systems (RDBMSs). [6,8]
Every day, new data is created from a variety of
sources, including social networks, photos, videos, and
more. Due to the rapid growth of data, it has become
very difficult to process this data with the available
database management systems. One of the solutions
proposed to cope with the fast growth of data
has been to apply better hardware; however, this
approach has not been sufficient, as hardware
enhancement has reached a point where the growth of data
volume outpaces computer resources [5]. Big data
can now be found in three forms:
Structured data- Any data that can be stored, accessed, and processed in a fixed format is termed 'structured' data. Over time, computer scientists have had increasing success in developing techniques for working with this kind of data (where the format is well known in advance) and for deriving value from it. There are two sources of structured data. The first is data generated by human intervention, such as gaming data and input data. The second is data generated by machines, such as sensor data, web log data, and financial data [8,9].
Unstructured data- Before online and mobile applications became ubiquitous, databases processed direct, structured data. The data forms were fairly simple and described a set of relationships between various data types in the database. In contrast, unstructured data refers to data that does not fit well into the traditional column and row structure of a relational database. In today's big data world, most of the data created is unstructured, and some estimates put it at more than 95% of all data generated [29].
Semi-structured data- This data combines structured and unstructured data. Dealing with this level of data complexity is not easy. Big data and extensive records lead to long-running queries, so new methods and techniques are needed to overcome this challenge and manage large amounts of data [14].
2.2 NoSQL
The term NoSQL ("Not only SQL") describes the entire class of databases that do not have the characteristics of traditional relational databases and for which the standard SQL query language is not generally used. NoSQL databases are considered to be the next generation of databases. They support huge data storage, horizontal scalability, open-source and distributed deployment, and massively parallel data processing. They are characterized by a less strict static data structure, simple support for replication, and a simple application programming interface. They are often related to large data sets that need to be quickly and efficiently accessed and changed on the Web [11,10].
NoSQL databases can be classified into four categories.
Key-Value (KV)- In general, NoSQL databases allow the use of various types of relational data tools. These are becoming common in new business plans and big data analysis, in which classified data should be stored in a practical and efficient manner [16]. Within this context, key-value store databases are the simplest NoSQL databases. They can help developers when there is no predefined schema. Different kinds of objects, data types, and data containers are used to accommodate this [17,15]. With KV as the data model, the simple structure gives high query speed and supports benefits such as high concurrency and mass storage. Data modification and query operations are well supported through primary keys. Examples are Riak KV [18] and Redis [19].
Column-oriented- A table in a column-oriented database can be used as the data model; however, this model stores tables of extensible records. A table includes columns and rows, which may be split and distributed over nodes. In general, the benefit of this data model is that it is better suited to aggregation and data warehouses. HBase [20] and Cassandra [21] are examples of this kind of data store.
INTELLIGENT AUTOMATION AND SOFT COMPUTING
Documents data stores- Also known as a document-oriented database, this kind of program is used to retrieve, store, and manage information. The data is semi-structured. A document database can usually use a secondary index to support the upper application. Although the key-value and document database structures are very similar, they differ in how they process data. The name comes from the manner of storage: the data is stored as documents in XML or JSON format [22,23]. CouchDB and MongoDB [24] are examples of document data stores.
Graph databases- A graph database comprises nodes that are connected by edges. Data can be stored in both edges and nodes. One advantage of a graph database is that it can traverse relationships very quickly. Unlike the other three types of NoSQL databases mentioned above, graph databases have some problems with horizontal scaling, because every node can connect to any other node, and traversing nodes on different physical machines can have a negative effect on performance. Another difference from the above three is that most graph databases support ACID (atomicity, consistency, isolation, and durability) transactions. Graph databases are often used to deal with complex problems such as social networks or path-finding problems [25]. An example is Neo4j [26].
3 RELATED WORK
THERE have been numerous papers, research studies, and blogs that test and evaluate NoSQL databases to discuss features such as their benefits and to find the most suitable NoSQL database. One example is the work of Ali Hammood et al. [9], which examines the more recent versions of the systems. For this purpose, a testing environment was set up for each workload, and the responses of the Cassandra, HBase, and MongoDB database systems were monitored. According to the results obtained, HBase and Cassandra worked very well under heavy loads. MongoDB worked very well with low throughput, but not as well with high throughput. In the read operation, HBase had lower performance. The latency of all three was lower than before for all operations, particularly in MongoDB. Lazar J. Krstić et al. [11] used the YCSB tool to test the performance of five NoSQL databases: BrightstarDB, LevelDB, HamsterDB, RavenDB, and STSdb 4.0. Database Benchmark, the tool used to perform the measurement itself, was selected to manage the NoSQL databases running in various ways at approximately the same level, so that the obtained measurement results could be compared almost realistically. The authors reported that HamsterDB has the best performance, while BrightstarDB has the worst. This conclusion was expected before the start of the actual performance measurement. The study by Kuldeep Singh [12] compared Riak, HBase, Cassandra, and MongoDB from different points of view. The experiments, which compared and evaluated the performance of different NoSQL data stores on a distributed cluster, used YCSB to test the performance of these four systems in the same test environment while applying different workloads. That thesis concluded that each system responds differently to a workload because of differences in design.
Abramova et al. [13] tested the performance of Cassandra based on a number of factors, including the number of nodes, workload characteristics, number of threads, and data size, and analyzed whether it provides the desired speedup and scalability. Scaling the number of nodes and the size of the data sets does not guarantee better performance. However, Cassandra handles concurrent request threads well and scales well with concurrent threads. That paper concluded that when the number of nodes in a cluster increases from 1 or 3 to 6, even for relatively large data sets, an improvement in performance cannot be guaranteed.
The authors of [29] presented the method and results of a study that selected among three NoSQL database systems for a large, distributed healthcare organization. The performance assessment methods and results are presented for the following databases: MongoDB, Cassandra, and Riak. The test was based on the YCSB benchmark for evaluating NoSQL databases. The paper concludes that the Cassandra database provides the best throughput performance, but with the highest latency.
4 RIAK KEY-VALUE (KV)
RIAK is the open-source version of Riak Enterprise DS. It is a KV database developed by Basho in 2007 and written in Erlang and C. The enterprise version adds multi-data center replication, monitoring, and additional support [22].
Riak KV is a distributed NoSQL database that is extremely scalable, available, and straightforward to work with. It automatically distributes data across the cluster to ensure fast performance and fault tolerance. Riak Enterprise includes multi-cluster replication that provides low latency and strong business continuity. Riak KV is a distributed NoSQL KV database that keeps read and write functions available even in cases of hardware failure or network partitions by supporting both local and multi-cluster replication. Riak KV is designed to deal with an assortment of difficulties confronting big data applications, which include tracking client or session data, storing data from connected devices, and replicating data around the world. It is built on the KV model to provide a powerful, simple data model for storing large amounts of unstructured data [22,18].
Riak KV achieves fast performance and robust business continuity by automating data distribution across the cluster. Capacity can be added easily and without a large operational burden, thanks to a masterless architecture that provides high availability and scales nearly linearly on commodity hardware [18]. Nodes in Riak form a cluster. The cluster is divided into partitions and virtual nodes (Vnodes) that form a ring, in order to obtain all the benefits of Riak. The ring is a 160-bit integer space divided into equally sized partitions, as shown in Figure 1.
Figure 1. Architecture of the Riak cluster.
Each node (also called a physical node) in the ring runs a certain number of virtual nodes (Vnodes). Each Vnode occupies one partition in the ring. The partition size of the ring is defined when Riak is configured or when the cluster is initialized [27].
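The ring layout described above can be illustrated with a small Python sketch. It is not Riak's implementation: the way the bucket and key are joined into a string before hashing, the ring size of 64, the node names, and the round-robin assignment of partitions to physical nodes are all assumptions made for the example. Only the 160-bit hash space (SHA-1 produces 160-bit digests) and the equally sized partitions come from the description.

```python
import hashlib

RING_TOP = 2 ** 160  # size of the 160-bit integer space (SHA-1 digest range)


def partition_for(bucket: str, key: str, ring_size: int) -> int:
    """Return the index of the equally sized partition that owns bucket/key.

    Assumption: the object is identified by the string "bucket/key".
    """
    digest = hashlib.sha1(f"{bucket}/{key}".encode("utf-8")).digest()
    position = int.from_bytes(digest, "big")  # integer in [0, 2**160)
    # Scale the ring position down to a partition index in [0, ring_size).
    return position * ring_size // RING_TOP


def assign_vnodes(nodes: list[str], ring_size: int) -> dict[int, str]:
    """Give each partition (one vnode) to a physical node, round-robin."""
    return {p: nodes[p % len(nodes)] for p in range(ring_size)}


# Hypothetical 5-node cluster with a ring of 64 partitions.
NODES = ["node1", "node2", "node3", "node4", "node5"]
RING = assign_vnodes(NODES, 64)
OWNER = RING[partition_for("users", "alice", 64)]
```

With 64 partitions and 5 physical nodes, each node in this sketch runs 12 or 13 vnodes, and every key maps deterministically to exactly one of them.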
5 BENCHMARKING OF NOSQL
THE Basho-bench is a benchmarking tool created to conduct accurate and repeatable performance tests and stress tests and to produce performance graphs. Originally developed to benchmark Riak, it exposes a pluggable driver interface and has been extended to serve as a benchmarking tool across a variety of projects. Basho-bench focuses on two performance metrics: throughput and latency [28].
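To make the two metrics concrete, the following Python sketch times a sequence of operations and derives throughput (operations per second) and mean latency (milliseconds per operation). It is a simplified, single-threaded illustration and not how Basho-bench is implemented; the function name `run_benchmark` and the in-memory dictionary standing in for the database are hypothetical.

```python
import time
from statistics import mean


def run_benchmark(operation, count):
    """Run operation(i) count times and report throughput and mean latency."""
    latencies_ms = []
    start = time.perf_counter()
    for i in range(count):
        t0 = time.perf_counter()
        operation(i)
        latencies_ms.append((time.perf_counter() - t0) * 1000.0)
    # Guard against a zero elapsed time on very coarse clocks.
    elapsed = max(time.perf_counter() - start, 1e-9)
    return {
        "throughput_ops_per_sec": count / elapsed,
        "mean_latency_ms": mean(latencies_ms),
    }


# Stand-in for a key-value store: a plain dictionary.
store = {}
result = run_benchmark(lambda i: store.__setitem__(f"key{i}", b"x" * 100), 1000)
```

Throughput describes the system as a whole over the test window, while latency describes each individual request, which is why the two are reported separately.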
How Does the Benchmark Work?
Each node can be either a traffic generator or a Riak node. A traffic generator runs one copy of Basho-bench that generates and sends commands to Riak nodes. A Riak node contains a complete and independent copy of the Riak package and is identified by an Internet Protocol (IP) address and a port number. Figure 2 shows how traffic generators and Riak nodes are organized inside a cluster. There is one traffic generator for every three Riak nodes [4].
Figure 2. Riak nodes and traffic generators in Basho-bench.
6 EXPERIMENT ENVIRONMENT
6.1 Experimental Setup
IN this part, we introduce the experiments carried out by testing the Riak KV NoSQL database with Basho-bench. The benchmark is specifically designed for Riak performance testing and analysis. The Riak benchmark is run using Basho's measurement software, which reports the number of transactions executed per second. The benchmark needs a configuration file that contains the parameters required to begin the benchmark. It starts the given number of workers, which together perform the given task. The test was done with different numbers of keys (10 K, 100 K, 1000 K, 10,000 K, and 200,000 K) and a fixed size of 10000 KB for every key.
The experiments were performed on a cluster of 5 nodes, each with 16 GB RAM, an Intel Xeon CPU E3-1241 v3 @ 3.50 GHz × 8 processor, and 1 TB of ephemeral storage. Ubuntu 14.04 LTS (64-bit) was installed on each unit. Figure 3 illustrates the experimental structure and the details of its primary components.
Figure 3. Experimental structure.
6.2 Performance Configuration
THE Basho-bench is a test tool that performs reads, updates, and writes according to a workload and measures performance. The possible operations that the driver will run are given as a weighted list such as [{get,4},{put,4},{delete, 1}], which means that out of every 9 operations, get will be called four times, put will be called four times, and delete will be called once, on average. The benchmark package provides a set of predetermined experiments that can be executed as follows:
Experiment #A- Updates are heavy. It consists of a 1/1 proportion of reads and updates.
Experiment #B- Reads mostly. It consists of a 9/1 proportion of reads and updates.
Experiment #C- Reads only. The workload is 100% reads.
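The weighted operation lists can be read as probabilities. The Python sketch below shows the arithmetic for the [{get,4},{put,4},{delete, 1}] example and for the three experiments. The operation names `get` and `update` used for experiments A, B, and C, and the helper names, are assumptions chosen for illustration; they are not taken from the benchmark's configuration files.

```python
import random

# Read/update proportions of the three experiments, written as weights.
EXPERIMENTS = {
    "A": [("get", 1), ("update", 1)],  # 1/1 reads and updates
    "B": [("get", 9), ("update", 1)],  # 9/1 reads and updates
    "C": [("get", 1)],                 # reads only
}


def operation_shares(weights):
    """Turn a list of (operation, weight) pairs into fractions summing to 1."""
    total = sum(weight for _, weight in weights)
    return {name: weight / total for name, weight in weights}


def pick_operation(weights, rng):
    """Draw one operation at random according to its weight."""
    names = [name for name, _ in weights]
    values = [weight for _, weight in weights]
    return rng.choices(names, weights=values, k=1)[0]


# The example from the text: get 4/9, put 4/9, delete 1/9 of all operations.
example = operation_shares([("get", 4), ("put", 4), ("delete", 1)])
```

Because operations are drawn at random, the stated proportions hold on average over a long run rather than within every group of nine operations.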
To evaluate the loading time, we generated different numbers of keys (10 K, 100 K, 1000 K, 10,000 K, and 200,000 K) and used a varying number of threads (4, 8, and 12).
7 EXPERIMENTAL RESULTS AND DISCUSSION
In the following, we assign a section to each experiment, describing the different read and update scenarios and illustrating the results.
7.1 Experiment #A: Updates are heavy. It consists of a 1/1 proportion of reads and updates. Figure 4 shows the results.
• Throughput Result
Figure 4. Throughput performance for experiment (A) (1/1 read/update), in operations/sec.
             10 K   100 K   1000 K   10,000 K   200,000 K
Thread 4      340     320      200         30          20
Thread 8      250     190      190         60          10
Thread 12     590     400      420         80         100
In Figure 4, with 8 threads, when the number of keys in the cluster increased from 100 K to 1000 K, the throughput stayed the same (190 operations/sec). However, when the number of keys was 10 K, the throughput was higher (250 operations/sec). Overall, with 12 threads the throughput was higher than with the other thread counts for all numbers of keys.
• Latency Results
Latency is the delay between an input to a system and the desired result; the term is understood slightly differently in each case, and latency problems vary from one system to another. Latency greatly affects how enjoyable and usable electronic and mechanical equipment and communications are. Figure 5 shows that all three cases have a high latency for the data update operation.
This is expected, because the read operation usually does not have as high a latency as the other operations. With 4 threads, the highest latency reached 66 ms, and the latency with 8 and 12 threads was almost equal to it.
Figure 5. Latency for experiment (A) (1/1 read/update).
7.2 Experiment #B: Our second experiment is reads mostly. It consists of a 9/1 proportion of reads and updates. Figure 6 shows the results.
• Throughput Result
The experimental results are illustrated in Figure 6.
Figure 6. Throughput performance for experiment (B) (9/1 read/update).
The throughput behavior in experiment (B) (9 read operations to 1 update operation) differed from that in experiment (A). Moreover, the throughput in experiment (B) was higher than in experiment (A) for all thread counts. Furthermore, throughput can decrease when the number of threads increases; for example, for 10,000 K to 200,000 K keys with 4 and 8 threads, the difference in the number of operations was not as expected.
As can be seen from the figure, the number of keys has a significant effect on the performance of Riak KV. For example, in Figure 6, the throughput with 12 threads is 710 operations/sec for 10 K keys and 20 operations/sec for 200,000 K keys, a drop by a factor of about 35.
• Latency Result
In Figure 7, the results were different from those of experiment (C). The latency was high in Figure 7 (a), where it reached about 44 ms with 10,000 K keys and 70 ms with 200,000 K keys.
[Figure 5, panels (a), (b) and (c): mean-get and mean-update latency (ms) versus number of keys, from 10 K to 200,000 K.]
Data for Figure 6 (operations/sec):
             10 K   100 K   1000 K   10,000 K   200,000 K
Thread 4      580     400      300        280         230
Thread 8      385     390      300        277         210
Thread 12     710     470      410         70          20
Figure 7. Latency for experiment (B) (9/1 read/update).
7.3 Experiment #C: Read-only. The read ratio in this experiment is 100%. The results are shown in Figure 8.
• Throughput Result
The throughput of experiment (C), with 100% reads, is shown in Figure 8. Increasing the number of keys read decreases the throughput, which again confirms that the number of keys has a significant effect. For example, Figure 8 shows that with 4 threads the throughput is 625 operations/sec with 10 K keys, and when the number of keys reaches 200,000 K it decreases to 475 operations/sec.
Figure 8 shows 320 operations/sec for 200,000 K keys with 8 threads, which makes 8 threads the least efficient configuration, with lower throughput than the other thread counts. In general, in the read-only experiment the throughput was high and stable across all numbers of keys compared to the other experiments (A and B).
Figure 8. Throughput performance for experiment (C) (100% read).
• Latency Result
Figure 9 shows the latency for experiment (C), with read operations only. Latency decreased as the number of threads increased: with 4 threads the latency was highest, reaching 12 ms with 200,000 K keys. The results show that the latency was almost equal across all numbers of keys from 10 K to 200,000 K while the threads were performing read-only operations.
Figure 9. Latency for experiment (C) (100% read).
In summary of the previous three experiments A, B, and C, the number of threads had a significant effect on the performance of the Riak KV NoSQL database: increasing the number of threads increased performance. However, the performance measurements vary from one experiment to another. When update and read operations were equal in number, as in experiment (A), the throughput was low compared to the other experiments. Figure 10 shows the throughput comparison for the previous three experiments.
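The comparison across experiments can be reproduced from the throughput values reported with Figures 4, 6, and 8. The Python sketch below copies those values (operations/sec, for 10 K, 100 K, 1000 K, 10,000 K, and 200,000 K keys) and computes two simple summaries. The helper names and the choice of an unweighted mean over all measurements are assumptions made for illustration, not part of the original analysis.

```python
# Throughput in operations/sec, by experiment and thread count, for
# 10 K, 100 K, 1000 K, 10,000 K and 200,000 K keys (Figures 4, 6 and 8).
THROUGHPUT = {
    "A": {4: [340, 320, 200, 30, 20],
          8: [250, 190, 190, 60, 10],
          12: [590, 400, 420, 80, 100]},
    "B": {4: [580, 400, 300, 280, 230],
          8: [385, 390, 300, 277, 210],
          12: [710, 470, 410, 70, 20]},
    "C": {4: [625, 550, 510, 500, 475],
          8: [420, 400, 380, 360, 320],
          12: [680, 650, 610, 475, 470]},
}


def mean_throughput(experiment):
    """Unweighted mean over all thread counts and key counts."""
    values = [v for row in THROUGHPUT[experiment].values() for v in row]
    return sum(values) / len(values)


def drop_factor(experiment, threads):
    """Throughput at 10 K keys divided by throughput at 200,000 K keys."""
    row = THROUGHPUT[experiment][threads]
    return row[0] / row[-1]


means = {name: mean_throughput(name) for name in THROUGHPUT}
```

On these values the mean throughput rises from experiment A to B to C, and the largest drop with growing key count is in experiment B with 12 threads (710 to 20 operations/sec).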
Data for Figure 8 (operations/sec):
             10 K   100 K   1000 K   10,000 K   200,000 K
Thread 4      625     550      510        500         475
Thread 8      420     400      380        360         320
Thread 12     680     650      610        475         470
Data for Figure 9 (latency in ms, key sets 1 to 5):
               1     2     3     4     5
Thread 4       3     3     7     9    12
Thread 8       2     3     3     5     9
Thread 12      2     3     4     5     8
[Figure 7, panels (a), (b) and (c): mean-get and mean-update latency (ms) versus number of keys, from 10 K to 200,000 K.]
Figure 10. Comparing the throughput of the three experiments A, B and C (operations/sec by number of keys, for 4, 8 and 12 threads).
8 CONCLUSION
IN this paper, we analyze and evaluate the read/update throughput as well as the latency of the Riak KV NoSQL database management system in a cluster environment. To achieve this goal, Basho-bench is used. Benchmarking NoSQL data stores in a cluster environment and monitoring factors such as throughput and latency are important requirements, as NoSQL databases differ from one another and their utility differs from one application to another. In addition, system performance is still an important factor when processing large amounts of data. We made measurements in three experiments with different mixes of operations: experiments A, B, and C. We measured the read throughput and latency and the update throughput and latency in each of the experiments. We found that performance is affected significantly by increased data size. We also found that as the number of threads increases, throughput improves and latency decreases.
REFERENCES
[1] Rakesh Kumar, Shilpi Charu, Somya Bansal. "Effective Way to Handling Big Data Problems using NoSQL Database (MongoDB)". Journal of Advanced Database Management & Systems, ISSN: 2393-8730 (online), Volume 2, Issue 2, 2015.
[2] Rakesh K. Lenka et al. "Comparative Analysis of SpatialHadoop and GeoSpark for Geospatial Big Data Analytics". Published in: 2016 2nd International Conference on Contemporary Computing and Informatics (IC3I). Date of Conference: 14-17 Dec. 2016.
[3] Anasuya N Jadagerimath and Dr. Prakash S. "Efficient IoT Data Management for Cloud Environment using MongoDB". Proc. of Int. Conf. on Current Trends in Eng., Science and Technology, ICCTEST, 2017.
[4] Amir Ghaffari, Natalia Chechina, Phil Trinder, Jon Meredith (Sep 2013). Scalable Persistent Storage for Erlang: Theory and Practice. Twelfth ACM SIGPLAN Workshop on Erlang, Boston, MA, USA.
[5] "Challenges and Opportunities with Big Data". CRA.org. Retrieved Jan 2016.
[6] "Big Data for Dummies", Dr. Fern Halper, Marcia Kaufman, Judith Hurwitz, Alan Nugent, 2013.
[7] Raj R. Parmar and Sudipta Roy. "MongoDB as an Efficient Graph Database: An Application of Document Oriented NOSQL Database". Data Intensive Computing Applications for Big Data, 2018.
[8] https://www.webopedia.com/TERM/B/big_data.html
[9] A Comparison of NoSQL Database Systems: A Study on MongoDB, Apache Hbase, and Apache Cassandra
[10] NoSQL Databases: Critical Analysis and Comparison
[11] Testing the Performance of NoSQL Databases via the Database Benchmark Tool
[12] Survey of NoSQL Database Engines for Big Data
[13] V. Abramova, J. Bernardino, P. Furtado. (2014). Which NOSQL database? A performance overview. In Open Journal of Databases, Volume 1, Issue 2, pp. 17-24.
[14] https://www.techopedia.com/definition/28802/semi-structured-data
[15] Jing Han, Haihong E, Guan Le, Jian Du. Survey on NoSQL Database. (2011). In IEEE 6th International Conference on Pervasive Computing and Applications (ICPCA).
[16] Asadulla Khan Zaki. (2014). NoSQL databases: new millennium database for big data, big users, cloud computing and its security challenges. IJRET: International Journal of Research in Engineering and Technology. Volume: 03 Special Issue.
[17] Techopedia [Online]. 2018, Retrieved from: https://www.techopedia.com/definition/26284/key-value-store.
[18] Riak-kv database [Online]. 2018, Retrieved from: http://basho.com/products/riak-kv/
[19] Redis database [Online]. 2018, Retrieved from: https://redis.io/.
[20] Hbase database [Online]. 2018, Retrieved from: http://hbase.apache.org/.
[21] Cassandra database [Online]. 2018, Retrieved from: http://cassandra.apache.org/.
[22] Man Qi. Digital Forensics and NoSQL Databases. (2014). In IEEE 11th International Conference on Fuzzy Systems and Knowledge Discovery.
[23] Jing Han, Haihong E, Guan Le, Jian Du. Survey on NoSQL Database. (2011). In IEEE 6th International Conference on Pervasive Computing and Applications (ICPCA).
[24] MongoDB database [Online]. 2018, Retrieved from: https://www.mongodb.com/.
[25] Man Qi. Digital Forensics and NoSQL Databases. (2014). In IEEE 11th International Conference on Fuzzy Systems and Knowledge Discovery.
[26] Neo4j database [Online]. 2018, Retrieved from: https://neo4j.com/
[27] Yousaf Muhammad. (2011). Evaluation and Implementation of Distributed NoSQL Database for MMO Gaming Environment. Uppsala University, Retrieved from: http://uu.divaportal.org/smash/get/diva2:447210/FULLTEXT01.pdf.
[28] https://github.com/basho/basho_bench.
[29] John Klein, Ian Gorton, Neil Ernst, Patrick Donohoe, Kim Pham, and Chrisjan Matser. (2015). Performance Evaluation of NoSQL Databases: A Case Study. In Proceedings of the 1st Workshop on Performance Analysis of Big Data Systems (PABS '15). ACM, New York, NY, USA, pp. 5-10.