SlideShare a Scribd company logo
SQL & No-SQL (Hadoop?)
- Big Data Adoption & Success in the
Enterprise
- By Anita Luthra 2016
MYTHS ABOUT HADOOP
Myth Actual
Hadoop Is A Single Product Hadoop is an ecosystem, not a single product
Apache Hadoop open
source
Multiple vendors have implementations of
Hadoop as well to add usability tools, ease of
use functionality
Hadoop is a Data
management system
Hadoop is actually file-system based
Hive resembles SQL Is not standard SQL. Instead, Hadoop uses
Apache Hive and HiveQL, a SQL-like language
Hadoop requires
MapReduce
Hadoop can survive by itself. They complement
each but are not required
MYTHS ABOUT HADOOP - II
Myth Actual
Hadoop and Big Data are
synonymous
Hadoop isn’t the only answer. products
from Teradata, Sybase IQ (now owned by
SAP) and Vertica (now owned by
Hewlett-Packard).
Hadoop is used for Data
Volume
Hadoop is for data diversity, not just data
volume. Go back to the V3 or V4 (Volume,
Variety, Velocity, Veracity)
Hadoop replaces the Data
Warehouse
It is a complement to the data
warehouse, another tool in the arsenal,
not a replacement
Hadoop is for Web
Analytics
Hadoop enables many types of analytics,
not just Web analytics
Big Data Requires Hadoop Big data does not require Hadoop, there
SQL DATABASES & NOSQL
Traditional OLAP/OLTP Limitations:
1. E.g., A structured database needs to know what is
being stored in advance.
2. The Agile development approach does not work well.
Each time new features are added, the schema of the
database requires changes.
3. If the database is large, the process is slow.
4. Rapid iterations and frequent data changes result in
frequent downtime.
SQL VS. NOSQL
5. Relational databases are not designed to cope
with the scale and agility challenges of modern
applications. They are not built to take advantage
of today’s available cheap storage and
processing power.
NOSQL ADVANTAGES
1. NoSQL databases allow insertion of data without a
predefined schema.
2. Application changes in real-time are easier, resulting in
faster development.
3. Code integration is more reliable, and less database
administration is needed.
4. NoSQL provides the ability to handle a variety of
database technologies. It was developed in response
to handling volume of data, frequency in which this
data is accessed, performance and processing needs.
NO-SQL ADVANTAGES - II
5. NoSQL databases are more scalable and provide superior
performance. The data model addresses issues not addressed
by traditional relational model databases such as:
Ability to handle large volumes of structured, semi-
structured, and unstructured data
Ability to handle Agile sprints, quick iterations, and
frequent code pushes
Object-oriented programming that is easy to use and
flexible
Efficient, scale-out architecture instead of expensive,
monolithic architecture.
WHEN TO USE BIG DATA TOOLING
Users want to interact with their data: totality, exploration,
and frequency. Totality refers to the increased desire to
process and analyze all available data, rather than
analyzing a sample of data and extrapolating the results.
However:
Apache Hadoop does not replace the data warehouse
and NoSQL databases do not replace transactional
relational databases.
Neither do MapReduce, nor streaming analytics Hive—
Apache’s data warehousing application is used to query
Hadoop data stores
WHEN NOT TO USE HADOOP
SAMPLE TOOLING USAGE -- MONGODB
Boxed Ice is an example of a company working with big data
technology— a NoSQL database called MongoDB.
Headquartered in London, Boxed Ice offers a hosted software
product called Server Density that monitors the health of cloud
computing deployments, servers and websites for about 1,000
clients around the globe—a task that requires copious
amounts of data processing.
“We monitor quite a few websites and servers for customers
like EA, Intel and The New York Times,” said David Mytton,
Boxed Ice’s CEO and founder. “We are processing about 12
terabytes of data every month with MongoDB, and that
equates to about 1 billion documents each month.”
ZION’S BANCORPORATION SOLUTION
 Zion’s Bancorporation (Zion’s), launched a fraud analytics
program nine years ago. Cracking the big data code is a moving
target requiring advanced technology and keen intellect.
Problem Statement
1. With exponential growth of data volume over the last decade, finding
needles of useful information in haystacks of data was a formidable task.
2. Data needed to be captured while in motion, in order to be truly effective.
Solution Approach
 Used to handle “big data” challenge for fraud analytics: Zion used non-
traditional data tools such as Hadoop, NoSQL Database, MapReduce for
Analytics, for defining big data computing.
Outcome
 Zion’s bank fraud and security analytics team has continually built and refined
statistical models that have helped bank executivespredict, identify,evaluate
and, when necessary, react to suspicious activity
WHY DIFFERENT TYPES OF NOSQL DATABASES?
1. CAP theorem, aka Brewer's Theorem.
You can provide only two out of the following three
characteristics: consistency, availability, and partition
tolerance.
 Different datasets and different runtime rules require trade-offs.
 Different database technologies focus on different trade-offs.
 The complexity of the data and the scalability of the system also
come into play.
2. Basic computer science or even more basic mathematics.
 Some datasets can be mapped easily to key-value pairs.
I.e., Flattening the data does not make it less meaningful. No
reconstruction of relationships is necessary.
3. There are datasets where the relationship to other items of
data is as important as the items of data themselves.
SAMPLE NO-SQL DATABASES BY DB TYPE
NO-SQL (AGGREGATE) DATA MODEL: KEY VALUE
Key Value are the simplest NoSQL databases. Every item in the
database is stored as an attribute name (or "key"), together with
its value. Some key-value stores, such as Redis, allow each value
to have a type, such as "integer", which adds functionality.
NO-SQL (AGGREGATE) DATA MODEL: DOCUMENT
Pairs each key with a complex data structure known as a
document. Documents can contain many different key-value
pairs, key-array pairs, or even nested documents.
NO-SQL (AGGREGATE) DATA MODEL: COLUMN N FAMILY
Wide-Column stores are optimized for queries over large datasets, and store
columns of data together, instead of rows. A hash map crossed with a
multidimensional array. Each column contains a row of data ...
Eg., Row
Key,
Grouped by
a row key, a
customer
profile, and
orders for
the
customer
NO-SQL DATA MODEL: GRAPH STORES
 Complex inter-relationships
 Not easily handled in multi-
table joins
 This model is very good for
managing across relationships.
Very good at moving across
relationships between things.
 Just like SQL databases, Graph
databases do ACID.
Other noSQL data models do
not use the ACID concept,
however, Graph databases do.
Used to store information about networks, such as social
connections. Graph stores include Neo4J and HyperGraphDB.
Outlier.
GRAPH DATABASES - EXPLAINED
 Graph databases are based on graph
theory. Graph databases employ
nodes, properties, and edges.
 Nodes represent entities such as people,
businesses, accounts, or any other item
you might want to keep track of.
 Properties -- relate to nodes. E.g.,
"Wikipedia" as one of the nodes, might
tie to properties such as "website",
"reference material", or "word that starts
with the letter 'w'", depending on
aspects of "Wikipedia" pertinent to the
particular database.25
GRAPH DATABASES
 Able to do some interesting queries with Graph databases.
Enterprise Roadmap &
Defining Success
THE ROADMAP: TYPICAL STAGES & MILESTONES TO BIG
DATA ADOPTION IN THE ENTERPRISE
DEFINING SUCCESS IN A BIG DATA PROGRAM
According to Zions and others who use
Hadoop, NoSQL databases and similar tools
that have come to define the era of big data
computing.
Achieving a great return on investment (ROI)
is about
1. Creating the right team
2. Putting a solid business strategy in place
3. Being agile and testing—lots and lots of testing
BIG DATA – WHAT DEFINES SUCCESS II
1. Organizations evaluating big data technologies should
remember to test them in conjunction with their own
data sets or applications, as opposed to using some
arbitrary data set or app a salesperson has on hand.
2. Getting a Big Data environment to work right requires
testing, tuning, keeping up with documentation and
paying attention to what the open source community
has to say.
3. With the right team and strategy in place, big data
technologies like Hadoop, Hive, Pig, Cassandra,
Mahout and others can open up a world of predictive
possibilities

More Related Content

What's hot

What is NoSQL and CAP Theorem
What is NoSQL and CAP TheoremWhat is NoSQL and CAP Theorem
What is NoSQL and CAP Theorem
Rahul Jain
 
Relational and non relational database 7
Relational and non relational database 7Relational and non relational database 7
Relational and non relational database 7
abdulrahmanhelan
 
Data Modeling for NoSQL
Data Modeling for NoSQLData Modeling for NoSQL
Data Modeling for NoSQL
Tony Tam
 
Sql vs nosql
Sql vs nosqlSql vs nosql
Sql vs nosql
Nick Verschueren
 
NoSQL Databases
NoSQL DatabasesNoSQL Databases
NoSQL Databases
BADR
 
NoSQL vs SQL (by Dmitriy Beseda, JS developer and coach Binary Studio Academy)
NoSQL vs SQL (by Dmitriy Beseda, JS developer and coach Binary Studio Academy)NoSQL vs SQL (by Dmitriy Beseda, JS developer and coach Binary Studio Academy)
NoSQL vs SQL (by Dmitriy Beseda, JS developer and coach Binary Studio Academy)
Binary Studio
 
NoSQL Slideshare Presentation
NoSQL Slideshare Presentation NoSQL Slideshare Presentation
NoSQL Slideshare Presentation
Ericsson Labs
 
NoSQL Architecture Overview
NoSQL Architecture OverviewNoSQL Architecture Overview
NoSQL Architecture Overview
Christopher Foot
 
Introduction to NoSQL
Introduction to NoSQLIntroduction to NoSQL
Introduction to NoSQL
Dimitar Danailov
 
Big Challenges in Data Modeling: NoSQL and Data Modeling
Big Challenges in Data Modeling: NoSQL and Data ModelingBig Challenges in Data Modeling: NoSQL and Data Modeling
Big Challenges in Data Modeling: NoSQL and Data Modeling
DATAVERSITY
 
Schemaless Databases
Schemaless DatabasesSchemaless Databases
Schemaless Databases
Dan Gunter
 
NoSQL databases and managing big data
NoSQL databases and managing big dataNoSQL databases and managing big data
NoSQL databases and managing big data
Steven Francia
 
Introduction to NoSQL Databases
Introduction to NoSQL DatabasesIntroduction to NoSQL Databases
Introduction to NoSQL DatabasesDerek Stainer
 
Nosql data models
Nosql data modelsNosql data models
Nosql data models
Viet-Trung TRAN
 
Introduction to NOSQL databases
Introduction to NOSQL databasesIntroduction to NOSQL databases
Introduction to NOSQL databases
Ashwani Kumar
 
SQL/NoSQL How to choose ?
SQL/NoSQL How to choose ?SQL/NoSQL How to choose ?
SQL/NoSQL How to choose ?
Venu Anuganti
 
SQL or NoSQL, that is the question!
SQL or NoSQL, that is the question!SQL or NoSQL, that is the question!
SQL or NoSQL, that is the question!
Andraz Tori
 
SQL, NoSQL, BigData in Data Architecture
SQL, NoSQL, BigData in Data ArchitectureSQL, NoSQL, BigData in Data Architecture
SQL, NoSQL, BigData in Data Architecture
Venu Anuganti
 
My sql vs mongo
My sql vs mongoMy sql vs mongo
My sql vs mongo
krishnapriya Tadepalli
 

What's hot (20)

What is NoSQL and CAP Theorem
What is NoSQL and CAP TheoremWhat is NoSQL and CAP Theorem
What is NoSQL and CAP Theorem
 
Relational and non relational database 7
Relational and non relational database 7Relational and non relational database 7
Relational and non relational database 7
 
Data Modeling for NoSQL
Data Modeling for NoSQLData Modeling for NoSQL
Data Modeling for NoSQL
 
Sql vs nosql
Sql vs nosqlSql vs nosql
Sql vs nosql
 
NoSQL Databases
NoSQL DatabasesNoSQL Databases
NoSQL Databases
 
SQL vs NoSQL
SQL vs NoSQLSQL vs NoSQL
SQL vs NoSQL
 
NoSQL vs SQL (by Dmitriy Beseda, JS developer and coach Binary Studio Academy)
NoSQL vs SQL (by Dmitriy Beseda, JS developer and coach Binary Studio Academy)NoSQL vs SQL (by Dmitriy Beseda, JS developer and coach Binary Studio Academy)
NoSQL vs SQL (by Dmitriy Beseda, JS developer and coach Binary Studio Academy)
 
NoSQL Slideshare Presentation
NoSQL Slideshare Presentation NoSQL Slideshare Presentation
NoSQL Slideshare Presentation
 
NoSQL Architecture Overview
NoSQL Architecture OverviewNoSQL Architecture Overview
NoSQL Architecture Overview
 
Introduction to NoSQL
Introduction to NoSQLIntroduction to NoSQL
Introduction to NoSQL
 
Big Challenges in Data Modeling: NoSQL and Data Modeling
Big Challenges in Data Modeling: NoSQL and Data ModelingBig Challenges in Data Modeling: NoSQL and Data Modeling
Big Challenges in Data Modeling: NoSQL and Data Modeling
 
Schemaless Databases
Schemaless DatabasesSchemaless Databases
Schemaless Databases
 
NoSQL databases and managing big data
NoSQL databases and managing big dataNoSQL databases and managing big data
NoSQL databases and managing big data
 
Introduction to NoSQL Databases
Introduction to NoSQL DatabasesIntroduction to NoSQL Databases
Introduction to NoSQL Databases
 
Nosql data models
Nosql data modelsNosql data models
Nosql data models
 
Introduction to NOSQL databases
Introduction to NOSQL databasesIntroduction to NOSQL databases
Introduction to NOSQL databases
 
SQL/NoSQL How to choose ?
SQL/NoSQL How to choose ?SQL/NoSQL How to choose ?
SQL/NoSQL How to choose ?
 
SQL or NoSQL, that is the question!
SQL or NoSQL, that is the question!SQL or NoSQL, that is the question!
SQL or NoSQL, that is the question!
 
SQL, NoSQL, BigData in Data Architecture
SQL, NoSQL, BigData in Data ArchitectureSQL, NoSQL, BigData in Data Architecture
SQL, NoSQL, BigData in Data Architecture
 
My sql vs mongo
My sql vs mongoMy sql vs mongo
My sql vs mongo
 

Similar to SQL vs NoSQL: Big Data Adoption & Success in the Enterprise

Introduction to Bigdata and NoSQL
Introduction to Bigdata and NoSQLIntroduction to Bigdata and NoSQL
Introduction to Bigdata and NoSQL
Tushar Shende
 
Relational Databases For An Efficient Data Management And...
Relational Databases For An Efficient Data Management And...Relational Databases For An Efficient Data Management And...
Relational Databases For An Efficient Data Management And...
Sheena Crouch
 
Big Data
Big DataBig Data
Big Data
Kirubaburi R
 
Big Data and Enterprise Data - Oracle -1663869
Big Data and Enterprise Data - Oracle -1663869Big Data and Enterprise Data - Oracle -1663869
Big Data and Enterprise Data - Oracle -1663869
Edgar Alejandro Villegas
 
Big data and Hadoop overview
Big data and Hadoop overviewBig data and Hadoop overview
Big data and Hadoop overview
Nitesh Ghosh
 
Modern data warehouse
Modern data warehouseModern data warehouse
Modern data warehouse
Stephen Alex
 
Modern data warehouse
Modern data warehouseModern data warehouse
Modern data warehouse
Stephen Alex
 
tecFinal 451 webinar deck
tecFinal 451 webinar decktecFinal 451 webinar deck
tecFinal 451 webinar deck
Basho Technologies
 
The Recent Pronouncement Of The World Wide Web (Www) Had
The Recent Pronouncement Of The World Wide Web (Www) HadThe Recent Pronouncement Of The World Wide Web (Www) Had
The Recent Pronouncement Of The World Wide Web (Www) Had
Deborah Gastineau
 
NoSQL Databases Introduction - UTN 2013
NoSQL Databases Introduction - UTN 2013NoSQL Databases Introduction - UTN 2013
NoSQL Databases Introduction - UTN 2013
Facundo Farias
 
MongoDB
MongoDBMongoDB
Analysis and evaluation of riak kv cluster environment using basho bench
Analysis and evaluation of riak kv cluster environment using basho benchAnalysis and evaluation of riak kv cluster environment using basho bench
Analysis and evaluation of riak kv cluster environment using basho bench
StevenChike
 
Unstructured Datasets Analysis: Thesaurus Model
Unstructured Datasets Analysis: Thesaurus ModelUnstructured Datasets Analysis: Thesaurus Model
Unstructured Datasets Analysis: Thesaurus Model
Editor IJCATR
 
Oh! Session on Introduction to BIG Data
Oh! Session on Introduction to BIG DataOh! Session on Introduction to BIG Data
Oh! Session on Introduction to BIG Data
Prakalp Agarwal
 
Big Data Practice_Planning_steps_RK
Big Data Practice_Planning_steps_RKBig Data Practice_Planning_steps_RK
Big Data Practice_Planning_steps_RKRajesh Jayarman
 
Big data and apache hadoop adoption
Big data and apache hadoop adoptionBig data and apache hadoop adoption
Big data and apache hadoop adoption
faizrashid1995
 
Big data analytics: Technology's bleeding edge
Big data analytics: Technology's bleeding edgeBig data analytics: Technology's bleeding edge
Big data analytics: Technology's bleeding edge
Bhavya Gulati
 
Hadoop Overview
Hadoop OverviewHadoop Overview
Hadoop Overview
Gregg Barrett
 

Similar to SQL vs NoSQL: Big Data Adoption & Success in the Enterprise (20)

Introduction to Bigdata and NoSQL
Introduction to Bigdata and NoSQLIntroduction to Bigdata and NoSQL
Introduction to Bigdata and NoSQL
 
Relational Databases For An Efficient Data Management And...
Relational Databases For An Efficient Data Management And...Relational Databases For An Efficient Data Management And...
Relational Databases For An Efficient Data Management And...
 
Big Data
Big DataBig Data
Big Data
 
Big Data and Enterprise Data - Oracle -1663869
Big Data and Enterprise Data - Oracle -1663869Big Data and Enterprise Data - Oracle -1663869
Big Data and Enterprise Data - Oracle -1663869
 
Big data and Hadoop overview
Big data and Hadoop overviewBig data and Hadoop overview
Big data and Hadoop overview
 
Modern data warehouse
Modern data warehouseModern data warehouse
Modern data warehouse
 
Modern data warehouse
Modern data warehouseModern data warehouse
Modern data warehouse
 
tecFinal 451 webinar deck
tecFinal 451 webinar decktecFinal 451 webinar deck
tecFinal 451 webinar deck
 
The Recent Pronouncement Of The World Wide Web (Www) Had
The Recent Pronouncement Of The World Wide Web (Www) HadThe Recent Pronouncement Of The World Wide Web (Www) Had
The Recent Pronouncement Of The World Wide Web (Www) Had
 
NoSQL Databases Introduction - UTN 2013
NoSQL Databases Introduction - UTN 2013NoSQL Databases Introduction - UTN 2013
NoSQL Databases Introduction - UTN 2013
 
No sql database
No sql databaseNo sql database
No sql database
 
MongoDB
MongoDBMongoDB
MongoDB
 
Analysis and evaluation of riak kv cluster environment using basho bench
Analysis and evaluation of riak kv cluster environment using basho benchAnalysis and evaluation of riak kv cluster environment using basho bench
Analysis and evaluation of riak kv cluster environment using basho bench
 
Unstructured Datasets Analysis: Thesaurus Model
Unstructured Datasets Analysis: Thesaurus ModelUnstructured Datasets Analysis: Thesaurus Model
Unstructured Datasets Analysis: Thesaurus Model
 
Oh! Session on Introduction to BIG Data
Oh! Session on Introduction to BIG DataOh! Session on Introduction to BIG Data
Oh! Session on Introduction to BIG Data
 
Big Data Practice_Planning_steps_RK
Big Data Practice_Planning_steps_RKBig Data Practice_Planning_steps_RK
Big Data Practice_Planning_steps_RK
 
Big data and apache hadoop adoption
Big data and apache hadoop adoptionBig data and apache hadoop adoption
Big data and apache hadoop adoption
 
Big data analytics: Technology's bleeding edge
Big data analytics: Technology's bleeding edgeBig data analytics: Technology's bleeding edge
Big data analytics: Technology's bleeding edge
 
Hadoop
HadoopHadoop
Hadoop
 
Hadoop Overview
Hadoop OverviewHadoop Overview
Hadoop Overview
 

Recently uploaded

一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单
ewymefz
 
一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理
一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理
一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理
oz8q3jxlp
 
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...
Subhajit Sahu
 
一比一原版(TWU毕业证)西三一大学毕业证成绩单
一比一原版(TWU毕业证)西三一大学毕业证成绩单一比一原版(TWU毕业证)西三一大学毕业证成绩单
一比一原版(TWU毕业证)西三一大学毕业证成绩单
ocavb
 
Criminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdfCriminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdf
Criminal IP
 
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
Timothy Spann
 
Adjusting primitives for graph : SHORT REPORT / NOTES
Adjusting primitives for graph : SHORT REPORT / NOTESAdjusting primitives for graph : SHORT REPORT / NOTES
Adjusting primitives for graph : SHORT REPORT / NOTES
Subhajit Sahu
 
Q1’2024 Update: MYCI’s Leap Year Rebound
Q1’2024 Update: MYCI’s Leap Year ReboundQ1’2024 Update: MYCI’s Leap Year Rebound
Q1’2024 Update: MYCI’s Leap Year Rebound
Oppotus
 
Empowering Data Analytics Ecosystem.pptx
Empowering Data Analytics Ecosystem.pptxEmpowering Data Analytics Ecosystem.pptx
Empowering Data Analytics Ecosystem.pptx
benishzehra469
 
standardisation of garbhpala offhgfffghh
standardisation of garbhpala offhgfffghhstandardisation of garbhpala offhgfffghh
standardisation of garbhpala offhgfffghh
ArpitMalhotra16
 
一比一原版(Coventry毕业证书)考文垂大学毕业证如何办理
一比一原版(Coventry毕业证书)考文垂大学毕业证如何办理一比一原版(Coventry毕业证书)考文垂大学毕业证如何办理
一比一原版(Coventry毕业证书)考文垂大学毕业证如何办理
74nqk8xf
 
Machine learning and optimization techniques for electrical drives.pptx
Machine learning and optimization techniques for electrical drives.pptxMachine learning and optimization techniques for electrical drives.pptx
Machine learning and optimization techniques for electrical drives.pptx
balafet
 
一比一原版(YU毕业证)约克大学毕业证成绩单
一比一原版(YU毕业证)约克大学毕业证成绩单一比一原版(YU毕业证)约克大学毕业证成绩单
一比一原版(YU毕业证)约克大学毕业证成绩单
enxupq
 
一比一原版(爱大毕业证书)爱丁堡大学毕业证如何办理
一比一原版(爱大毕业证书)爱丁堡大学毕业证如何办理一比一原版(爱大毕业证书)爱丁堡大学毕业证如何办理
一比一原版(爱大毕业证书)爱丁堡大学毕业证如何办理
g4dpvqap0
 
一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单
一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单
一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单
ewymefz
 
一比一原版(CBU毕业证)不列颠海角大学毕业证成绩单
一比一原版(CBU毕业证)不列颠海角大学毕业证成绩单一比一原版(CBU毕业证)不列颠海角大学毕业证成绩单
一比一原版(CBU毕业证)不列颠海角大学毕业证成绩单
nscud
 
一比一原版(QU毕业证)皇后大学毕业证成绩单
一比一原版(QU毕业证)皇后大学毕业证成绩单一比一原版(QU毕业证)皇后大学毕业证成绩单
一比一原版(QU毕业证)皇后大学毕业证成绩单
enxupq
 
一比一原版(NYU毕业证)纽约大学毕业证成绩单
一比一原版(NYU毕业证)纽约大学毕业证成绩单一比一原版(NYU毕业证)纽约大学毕业证成绩单
一比一原版(NYU毕业证)纽约大学毕业证成绩单
ewymefz
 
Data Centers - Striving Within A Narrow Range - Research Report - MCG - May 2...
Data Centers - Striving Within A Narrow Range - Research Report - MCG - May 2...Data Centers - Striving Within A Narrow Range - Research Report - MCG - May 2...
Data Centers - Striving Within A Narrow Range - Research Report - MCG - May 2...
pchutichetpong
 
Malana- Gimlet Market Analysis (Portfolio 2)
Malana- Gimlet Market Analysis (Portfolio 2)Malana- Gimlet Market Analysis (Portfolio 2)
Malana- Gimlet Market Analysis (Portfolio 2)
TravisMalana
 

Recently uploaded (20)

一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单
 
一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理
一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理
一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理
 
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...
 
一比一原版(TWU毕业证)西三一大学毕业证成绩单
一比一原版(TWU毕业证)西三一大学毕业证成绩单一比一原版(TWU毕业证)西三一大学毕业证成绩单
一比一原版(TWU毕业证)西三一大学毕业证成绩单
 
Criminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdfCriminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdf
 
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
 
Adjusting primitives for graph : SHORT REPORT / NOTES
Adjusting primitives for graph : SHORT REPORT / NOTESAdjusting primitives for graph : SHORT REPORT / NOTES
Adjusting primitives for graph : SHORT REPORT / NOTES
 
Q1’2024 Update: MYCI’s Leap Year Rebound
Q1’2024 Update: MYCI’s Leap Year ReboundQ1’2024 Update: MYCI’s Leap Year Rebound
Q1’2024 Update: MYCI’s Leap Year Rebound
 
Empowering Data Analytics Ecosystem.pptx
Empowering Data Analytics Ecosystem.pptxEmpowering Data Analytics Ecosystem.pptx
Empowering Data Analytics Ecosystem.pptx
 
standardisation of garbhpala offhgfffghh
standardisation of garbhpala offhgfffghhstandardisation of garbhpala offhgfffghh
standardisation of garbhpala offhgfffghh
 
一比一原版(Coventry毕业证书)考文垂大学毕业证如何办理
一比一原版(Coventry毕业证书)考文垂大学毕业证如何办理一比一原版(Coventry毕业证书)考文垂大学毕业证如何办理
一比一原版(Coventry毕业证书)考文垂大学毕业证如何办理
 
Machine learning and optimization techniques for electrical drives.pptx
Machine learning and optimization techniques for electrical drives.pptxMachine learning and optimization techniques for electrical drives.pptx
Machine learning and optimization techniques for electrical drives.pptx
 
一比一原版(YU毕业证)约克大学毕业证成绩单
一比一原版(YU毕业证)约克大学毕业证成绩单一比一原版(YU毕业证)约克大学毕业证成绩单
一比一原版(YU毕业证)约克大学毕业证成绩单
 
一比一原版(爱大毕业证书)爱丁堡大学毕业证如何办理
一比一原版(爱大毕业证书)爱丁堡大学毕业证如何办理一比一原版(爱大毕业证书)爱丁堡大学毕业证如何办理
一比一原版(爱大毕业证书)爱丁堡大学毕业证如何办理
 
一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单
一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单
一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单
 
一比一原版(CBU毕业证)不列颠海角大学毕业证成绩单
一比一原版(CBU毕业证)不列颠海角大学毕业证成绩单一比一原版(CBU毕业证)不列颠海角大学毕业证成绩单
一比一原版(CBU毕业证)不列颠海角大学毕业证成绩单
 
一比一原版(QU毕业证)皇后大学毕业证成绩单
一比一原版(QU毕业证)皇后大学毕业证成绩单一比一原版(QU毕业证)皇后大学毕业证成绩单
一比一原版(QU毕业证)皇后大学毕业证成绩单
 
一比一原版(NYU毕业证)纽约大学毕业证成绩单
一比一原版(NYU毕业证)纽约大学毕业证成绩单一比一原版(NYU毕业证)纽约大学毕业证成绩单
一比一原版(NYU毕业证)纽约大学毕业证成绩单
 
Data Centers - Striving Within A Narrow Range - Research Report - MCG - May 2...
Data Centers - Striving Within A Narrow Range - Research Report - MCG - May 2...Data Centers - Striving Within A Narrow Range - Research Report - MCG - May 2...
Data Centers - Striving Within A Narrow Range - Research Report - MCG - May 2...
 
Malana- Gimlet Market Analysis (Portfolio 2)
Malana- Gimlet Market Analysis (Portfolio 2)Malana- Gimlet Market Analysis (Portfolio 2)
Malana- Gimlet Market Analysis (Portfolio 2)
 

SQL vs NoSQL: Big Data Adoption & Success in the Enterprise

  • 1. SQL & No-SQL (Hadoop?) - Big Data Adoption & Success in the Enterprise - By Anita Luthra 2016
  • 2. MYTHS ABOUT HADOOP Myth Actual Hadoop Is A Single Product Hadoop is an ecosystem, not a single product Apache Hadoop open source Multiple vendors have implementations of Hadoop as well to add usability tools, ease of use functionality Hadoop is a Data management system Hadoop is actually file-system based Hive resembles SQL Is not standard SQL. Instead, Hadoop uses Apache Hive and HiveQL, a SQL-like language Hadoop requires MapReduce Hadoop can survive by itself. They complement each but are not required
  • 3. MYTHS ABOUT HADOOP - II Myth Actual Hadoop and Big Data are synonymous Hadoop isn’t the only answer. products from Teradata, Sybase IQ (now owned by SAP) and Vertica (now owned by Hewlett-Packard). Hadoop is used for Data Volume Hadoop is for data diversity, not just data volume. Go back to the V3 or V4 (Volume, Variety, Velocity, Veracity) Hadoop replaces the Data Warehouse It is a complement to the data warehouse, another tool in the arsenal, not a replacement Hadoop is for Web Analytics Hadoop enables many types of analytics, not just Web analytics Big Data Requires Hadoop Big data does not require Hadoop, there
  • 4. SQL DATABASES & NOSQL Traditional OLAP/OLTP Limitations: 1. E.g., A structured database needs to know what is being stored in advance. 2. The Agile development approach does not work well. Each time new features are added, the schema of the database requires changes. 3. If the database is large, the process is slow. 4. Rapid iterations and frequent data changes result in frequent downtime.
  • 5. SQL VS. NOSQL 5. Relational databases are not designed to cope with the scale and agility challenges of modern applications. They are not built to take advantage of today’s available cheap storage and processing power.
  • 6. NOSQL ADVANTAGES 1. NoSQL databases allow insertion of data without a predefined schema. 2. Application changes in real-time are easier, resulting in faster development. 3. Code integration is more reliable, and less database administration is needed. 4. NoSQL provides the ability to handle a variety of database technologies. It was developed in response to handling volume of data, frequency in which this data is accessed, performance and processing needs.
  • 7. NO-SQL ADVANTAGES - II 5. NoSQL databases are more scalable and provide superior performance. The data model addresses issues not addressed by traditional relational model databases such as: Ability to handle large volumes of structured, semi- structured, and unstructured data Ability to handle Agile sprints, quick iterations, and frequent code pushes Object-oriented programming that is easy to use and flexible Efficient, scale-out architecture instead of expensive, monolithic architecture.
  • 8. WHEN TO USE BIG DATA TOOLING Users want to interact with their data: totality, exploration, and frequency. Totality refers to the increased desire to process and analyze all available data, rather than analyzing a sample of data and extrapolating the results. However: Apache Hadoop does not replace the data warehouse and NoSQL databases do not replace transactional relational databases. Neither do MapReduce, nor streaming analytics Hive— Apache’s data warehousing application is used to query Hadoop data stores
  • 9. WHEN NOT TO USE HADOOP
  • 10. SAMPLE TOOLING USAGE -- MONGODB Boxed Ice is an example of a company working with big data technology— a NoSQL database called MongoDB. Headquartered in London, Boxed Ice offers a hosted software product called Server Density that monitors the health of cloud computing deployments, servers and websites for about 1,000 clients around the globe—a task that requires copious amounts of data processing. “We monitor quite a few websites and servers for customers like EA, Intel and The New York Times,” said David Mytton, Boxed Ice’s CEO and founder. “We are processing about 12 terabytes of data every month with MongoDB, and that equates to about 1 billion documents each month.”
  • 11. ZION’S BANCORPORATION SOLUTION  Zion’s Bancorporation (Zion’s), launched a fraud analytics program nine years ago. Cracking the big data code is a moving target requiring advanced technology and keen intellect. Problem Statement 1. With exponential growth of data volume over the last decade, finding needles of useful information in haystacks of data was a formidable task. 2. Data needed to be captured while in motion, in order to be truly effective. Solution Approach  Used to handle “big data” challenge for fraud analytics: Zion used non- traditional data tools such as Hadoop, NoSQL Database, MapReduce for Analytics, for defining big data computing. Outcome  Zion’s bank fraud and security analytics team has continually built and refined statistical models that have helped bank executivespredict, identify,evaluate and, when necessary, react to suspicious activity
  • 12.
  • 13. WHY DIFFERENT TYPES OF NOSQL DATABASES? 1. CAP theorem, aka Brewer's Theorem. You can provide only two out of the following three characteristics: consistency, availability, and partition tolerance.  Different datasets and different runtime rules require trade-offs.  Different database technologies focus on different trade-offs.  The complexity of the data and the scalability of the system also come into play. 2. Basic computer science or even more basic mathematics.  Some datasets can be mapped easily to key-value pairs. I.e., Flattening the data does not make it less meaningful. No reconstruction of relationships is necessary. 3. There are datasets where the relationship to other items of data is as important as the items of data themselves.
  • 15. NO-SQL (AGGREGATE) DATA MODEL: KEY VALUE Key Value are the simplest NoSQL databases. Every item in the database is stored as an attribute name (or "key"), together with its value. Some key-value stores, such as Redis, allow each value to have a type, such as "integer", which adds functionality.
  • 16. NO-SQL (AGGREGATE) DATA MODEL: DOCUMENT Pairs each key with a complex data structure known as a document. Documents can contain many different key-value pairs, key-array pairs, or even nested documents.
  • 17. NO-SQL (AGGREGATE) DATA MODEL: COLUMN N FAMILY Wide-Column stores are optimized for queries over large datasets, and store columns of data together, instead of rows. A hash map crossed with a multidimensional array. Each column contains a row of data ... Eg., Row Key, Grouped by a row key, a customer profile, and orders for the customer
  • 18. NO-SQL DATA MODEL: GRAPH STORES  Complex inter-relationships  Not easily handled in multi- table joins  This model is very good for managing across relationships. Very good at moving across relationships between things.  Just like SQL databases, Graph databases do ACID. Other noSQL data models do not use the ACID concept, however, Graph databases do. Used to store information about networks, such as social connections. Graph stores include Neo4J and HyperGraphDB. Outlier.
  • 19. GRAPH DATABASES - EXPLAINED  Graph databases are based on graph theory. Graph databases employ nodes, properties, and edges.  Nodes represent entities such as people, businesses, accounts, or any other item you might want to keep track of.  Properties -- relate to nodes. E.g., "Wikipedia" as one of the nodes, might tie to properties such as "website", "reference material", or "word that starts with the letter 'w'", depending on aspects of "Wikipedia" pertinent to the particular database.25
  • 20. GRAPH DATABASES  Able to do some interesting queries with Graph databases.
  • 22. THE ROADMAP: TYPICAL STAGES & MILESTONES TO BIG DATA ADOPTION IN THE ENTERPRISE
  • 23. DEFINING SUCCESS IN A BIG DATA PROGRAM According to Zions and others who use Hadoop, NoSQL databases and similar tools that have come to define the era of big data computing. Achieving a great return on investment (ROI) is about 1. Creating the right team 2. Putting a solid business strategy in place 3. Being agile and testing—lots and lots of testing
  • 24. BIG DATA – WHAT DEFINES SUCCESS II 1. Organizations evaluating big data technologies should remember to test them in conjunction with their own data sets or applications, as opposed to using some arbitrary data set or app a salesperson has on hand. 2. Getting a Big Data environment to work right requires testing, tuning, keeping up with documentation and paying attention to what the open source community has to say. 3. With the right team and strategy in place, big data technologies like Hadoop, Hive, Pig, Cassandra, Mahout and others can open up a world of predictive possibilities