This Ignite-length deck covers why we have seen so much database innovation, and the genesis of the NoSQL movement, over the last 5 years. While there are many great NoSQL products, it speaks to why MongoDB is dominating the space and is the heir apparent to the RDBMS for modern operational data.
2. 2
Dawn of Databases to Present
[Timeline figure, 1965 to 2010, in five-year steps]
• Early databases: IDS (network), IMS (hierarchical), MUMPS, IDMS (network)
• Milestones: Codd’s paper, SQL invented, Oracle founded, WWW born, Brewer’s CAP born, BigTable released, 10gen founded
• Eras: PCs gain traction, Client Server, Dynamic Web Content, 3 tier architecture, Web applications, SOA, Cloud Computing, NoSQL Movement
3. 3
Big Data
• Sensor Data (volume, velocity)
• Situational Awareness (variety, volume)
• SIGINT (V)
• Asset Management (variety, velocity)
• OSINT (3V)
• Social Media (3V)
Modern Data
4. 4
Relational Database Challenges
Data Types
• Unstructured data
• Semi-structured data
• Polymorphic data
Volume of Data
• Petabytes of data
• Trillions of records
• Millions of queries per second
Agile Development
• Iterative
• Short development cycles
• Changing data model
New Architectures
• Horizontal scaling
• Commodity servers
• Cloud computing
5. 5
The Evolution of Databases
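[Figure: operational data (online) vs. data warehouse (offline), by decade]
• 1990: RDBMS (operational data, online)
• 2000: RDBMS (operational data, online); OLAP/BI (data warehouse, offline)
• 2010: RDBMS and NoSQL (operational data, online); OLAP/BI and Hadoop (data warehouse, offline)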
6. 6
Fully Featured NoSQL
Data Model
{
  first_name: 'Paul',
  surname: 'Miller',
  city: 'London',
  location: [45.123, 47.232],
  cars: [
    { model: 'Bentley',
      year: 1973,
      value: 100000, … },
    { model: 'Rolls Royce',
      year: 1965,
      value: 330000, … }
  ]
}
Rich Queries
• Find Paul’s cars
• Find everybody in London with a car built between 1970 and 1980
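The two rich queries above could be written as MongoDB query documents roughly as follows. This is a sketch, not taken from the deck: the `people` collection name is an assumption, and `matchesLondonCars1970s` is a hypothetical in-memory stand-in for `$elemMatch` so the example runs without a server.

```javascript
// Sample document from the Data Model slide (elided fields omitted).
const paul = {
  first_name: 'Paul',
  surname: 'Miller',
  city: 'London',
  cars: [
    { model: 'Bentley', year: 1973, value: 100000 },
    { model: 'Rolls Royce', year: 1965, value: 330000 },
  ],
};

// "Find Paul's cars": filter plus projection, e.g.
// db.people.find(paulFilter, paulProjection) on an assumed `people` collection.
const paulFilter = { first_name: 'Paul', surname: 'Miller' };
const paulProjection = { cars: 1 };

// "Find everybody in London with a car built between 1970 and 1980".
// $elemMatch requires a single car to satisfy both bounds.
const londonCars1970s = {
  city: 'London',
  cars: { $elemMatch: { year: { $gte: 1970, $lte: 1980 } } },
};

// Hypothetical in-memory equivalent of londonCars1970s.
function matchesLondonCars1970s(doc) {
  const range = londonCars1970s.cars.$elemMatch.year;
  return (
    doc.city === londonCars1970s.city &&
    doc.cars.some((car) => car.year >= range.$gte && car.year <= range.$lte)
  );
}

console.log(matchesLondonCars1970s(paul)); // true: the 1973 Bentley qualifies
```

Using `$elemMatch` rather than two separate conditions on `cars.year` matters when owners have several cars: without it, one car could satisfy the lower bound and a different car the upper bound.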
Geospatial
• Find all of the car owners within 5km of Trafalgar Sq.
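A geospatial query of this kind might look like the sketch below. Assumptions not in the deck: locations are stored as [longitude, latitude] with a 2dsphere index, the Trafalgar Square coordinates are approximate, and the two owners are invented for illustration. The haversine helper only imitates the distance filter in memory.

```javascript
// Approximate position of Trafalgar Square as [longitude, latitude].
const TRAFALGAR_SQ = [-0.1281, 51.508];

// Query document for $near; $maxDistance is in meters for GeoJSON points.
const nearTrafalgar = {
  location: {
    $near: {
      $geometry: { type: 'Point', coordinates: TRAFALGAR_SQ },
      $maxDistance: 5000,
    },
  },
};

// Great-circle distance in meters between two [longitude, latitude] pairs.
function haversineMeters([lon1, lat1], [lon2, lat2]) {
  const toRad = (deg) => (deg * Math.PI) / 180;
  const dLat = toRad(lat2 - lat1);
  const dLon = toRad(lon2 - lon1);
  const a =
    Math.sin(dLat / 2) ** 2 +
    Math.cos(toRad(lat1)) * Math.cos(toRad(lat2)) * Math.sin(dLon / 2) ** 2;
  return 2 * 6371000 * Math.asin(Math.sqrt(a));
}

// Hypothetical owners: one near Westminster, one in Paris.
const owners = [
  { name: 'Owner A', location: [-0.1246, 51.5007] },
  { name: 'Owner B', location: [2.3522, 48.8566] },
];

// In-memory stand-in for running nearTrafalgar against a collection.
const maxMeters = nearTrafalgar.location.$near.$maxDistance;
const nearby = owners
  .filter((o) => haversineMeters(TRAFALGAR_SQ, o.location) <= maxMeters)
  .map((o) => o.name);

console.log(nearby); // [ 'Owner A' ]
```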
Text Search
• Find all the cars described as having leather seats
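A text search like this could be expressed with the `$text` operator, assuming a text index exists on a free-text field. The `description` field and its contents are hypothetical, since the deck elides the remaining car fields, and the substring check is only a crude stand-in for real text search (which also handles stemming and scoring).

```javascript
// Hypothetical car documents with an assumed `description` field.
const cars = [
  { model: 'Bentley', description: 'Cream leather seats, walnut dash' },
  { model: 'Rolls Royce', description: 'Cloth interior, recently restored' },
];

// Query document for $text; the inner double quotes request the exact
// phrase rather than a match on either word.
const leatherSeats = { $text: { $search: '"leather seats"' } };

// Crude in-memory stand-in: case-insensitive phrase match.
function containsPhrase(text, phrase) {
  return text.toLowerCase().includes(phrase.toLowerCase());
}

const phrase = leatherSeats.$text.$search.replace(/"/g, '');
const matches = cars
  .filter((car) => containsPhrase(car.description, phrase))
  .map((car) => car.model);

console.log(matches); // [ 'Bentley' ]
```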
Aggregation
• Calculate the average value of Paul’s car collection
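One way to express this aggregation is a three-stage pipeline: match Paul, unwind his cars, then average their values. The pipeline below is a sketch for an assumed `people` collection, and `averageCarValue` is a hypothetical helper that computes the same number in memory from the sample document.

```javascript
// Sample document from the Data Model slide (elided fields omitted).
const paul = {
  first_name: 'Paul',
  surname: 'Miller',
  cars: [
    { model: 'Bentley', year: 1973, value: 100000 },
    { model: 'Rolls Royce', year: 1965, value: 330000 },
  ],
};

// Aggregation pipeline, e.g. db.people.aggregate(pipeline).
const pipeline = [
  { $match: { first_name: 'Paul', surname: 'Miller' } },
  { $unwind: '$cars' }, // one output document per car
  { $group: { _id: '$_id', average_value: { $avg: '$cars.value' } } },
];

// In-memory equivalent of the $unwind + $group/$avg stages for one owner.
function averageCarValue(doc) {
  const total = doc.cars.reduce((sum, car) => sum + car.value, 0);
  return total / doc.cars.length;
}

console.log(averageCarValue(paul)); // 215000
```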
Native Indexes
• Secondary
• Compound
• Geospatial
• Full Text
• Hash
• Covering
Security
• Kerberos
• FIPS 140-2
• Field Level Security
• LDAP
• Auditing
• RBAC
7. 7
Indeed.com Trends
Top Job Trends
1. HTML 5
2. MongoDB
3. iOS
4. Android
5. Mobile Apps
6. Puppet
7. Hadoop
8. jQuery
9. PaaS
10. Social Media
NoSQL Space
• LinkedIn Job Skills [chart: MongoDB vs. Competitors 1 to 5 and All Others]
• Google Search [chart: MongoDB vs. Competitors 1 to 4]
• Jaspersoft Big Data Index, Direct Real-Time Downloads [chart: MongoDB vs. Competitors 1 to 3]
What has driven the need for the term "big data"?
"Big data" was first used in 1997, but referred to just size.
Volume, Variety and Velocity were grouped together in 2001.
What do they have in common? They are all difficult to handle with the traditional data stack.
2011: “Big Data” = 3Vs.
Is variety or velocity big? Not really, but big data was a convenient umbrella term.
Should have called it Awkward Data.
I’ll never get a job at Gartner.
Big Data is born online. Latency for online applications must be very low and availability must be high in order to meet SLAs and user expectations for modern application performance. Offline Big Data encompasses applications that ingest, transform, manage and/or analyze data in a batch context. They typically do not create new data. For these applications, response time can be slow (up to hours or days), which is often acceptable for this type of use case. Since they usually produce a static (vs. operational) output, such as a report or dashboard, they can even go offline temporarily without impacting the overall goal or end product.
Indeed: #2, just after HTML 5 and ahead of iOS, Android, Hadoop.
Jaspersoft: Demand for MongoDB, the document-oriented NoSQL database, saw the biggest spike, with over 200% growth in 2011.
451 Group: Bigger than the next 3 or 4 COMBINED; biggest quarter-over-quarter and year-over-year growth (again).