Reinventing the Database
Max Schireson
President, 10gen
My background

At Oracle from 1994 to 2003

At MarkLogic from 2003 to Feb 2011

Joined 10gen Feb 2011
The world has changed

                      1970                      2011
Main memory           Intel 1103, 1k bits       4GB of RAM costs
                                                $25.99
Mass storage          IBM 3330 Model 1, 100     3TB SuperSpeed USB
                      MB                        for $129
Microprocessor        Nearly – 4004 being       Westmere EX has 10
                      developed; 4 bits and     cores, 30MB L3 cache,
                      92,000 instructions per   runs at 2.4GHz
                      second
Motor Trend Car of    Ford Torino               Chevy Volt
the Year
President             Richard Nixon             Barack Obama
Ted Codd              In his 40s                Dead
Me                    In diapers                In my 40s
More recent changes

                        A decade ago            Now
Faster                  Buy a bigger server     Buy more servers
Faster storage          A SAN with more         SSD
                        spindles
More reliable storage   More expensive SAN      More copies of local
                                                storage
Deployed in             Your data center        The cloud – private or
                                                public
Large user base         Thousands -             Millions - consumers
                        employees
Tracking                Business transactions   Every click and more
Assumptions behind today’s DBMS
Relational data model
Third normal form
ACID
SQL
Multi-statement transactions
Database is hardware agnostic
RAM is small and disks are slow
If it’s too slow you can buy a faster computer
Yesterday’s assumptions in today’s world

Scale-out is hard
  Distributed joins are hard
  Making two-phase commits fast is hard

Custom solutions proliferate

Too slow? Just add a cache

ORM tools everywhere

More computers and disks are nearly free, but SANs
and faster computers are expensive
Challenging some assumptions
Do you need a database at all

How does it scale out

What type of queries does it need to be able to do

How should it model data

How do you query it

How does it handle transactions and consistency

Is it enterprise software, open source, an appliance, or a cloud service

Does the data fit in memory?

What if your disks are SSD?
My opinions

Different use cases will produce different answers

Existing RDBMS solutions will continue to solve a
broad set of problems well but many applications
will work better on top of alternative technologies

Many new technologies will find niches but only
one or two will become mainstream
Do you need a database at all
Can you better solve your problem with a batch
processing framework

Can you better solve your problem with an
in-memory object store/cache
How does it scale out

Scale-out for working set size

Scale-out for total data size

Scale-out for write volume

Scale-out for read volume

Scale-out for redundancy

How do you incrementally add nodes or change configuration

How do you trade off query performance (which wants fewer
index segments) for elasticity (which wants more index
segments)
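
One way to picture that last trade-off is a minimal sketch, assuming plain hash partitioning over in-memory dicts. The function names and sample data are hypothetical and not taken from any particular product. A lookup by key touches one segment no matter how many exist, while a query on anything else has to visit every segment, so its cost grows with the segment count.

```python
import zlib


def segment_for(key, num_segments):
    """Pick a segment with a stable hash (crc32 is the same on every run)."""
    return zlib.crc32(key.encode("utf-8")) % num_segments


def build_segments(records, num_segments):
    """Distribute a dict of records across num_segments dicts."""
    segments = [dict() for _ in range(num_segments)]
    for key, value in records.items():
        segments[segment_for(key, num_segments)][key] = value
    return segments


def get_by_key(segments, key):
    """Key lookup: exactly one segment is consulted."""
    return segments[segment_for(key, len(segments))].get(key)


def find_by_value(segments, predicate):
    """Scatter-gather: every segment is consulted.

    Returns (sorted matching keys, number of segments touched).
    """
    matches = []
    for segment in segments:
        matches.extend(k for k, v in segment.items() if predicate(v))
    return sorted(matches), len(segments)


# Hypothetical sample data: twelve users, each tagged with a small group number.
records = {f"user{i}": i % 3 for i in range(12)}
few = build_segments(records, 2)
many = build_segments(records, 8)
```

More segments make it easier to spread or move data node by node, but `find_by_value` pays for each additional segment it has to ask.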
What type of queries does it need to be able to do

Is a key/value store enough

Will you be retrieving your data by one key or by
many

Is there a primary way you’ll be viewing your data

Do you need specialized queries (e.g., time series,
geospatial)
Imagine a garage…
You hand your valet the keys to your car

Before they park your car, they completely disassemble it

The pistons are stored in piston storage, brake pads with brake pads, steering
wheels with steering wheels

Over time, they have storage areas for catalytic converters, DVD-based nav
systems, headlight washers, and traction control systems

When you ask for your car back, the valet is incredibly fast at reassembly

One minor issue: you have to provide the disassembly and reassembly instructions
and they will be followed literally, even if you say the spare tire should be used as
a steering wheel and forget to specify re-insertion of spark plugs



A technological marvel

Might be a good way to store your car if you don’t know whether you’ll be asking
for a car back or lots of brake pads or pistons – for a salvage yard?
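
The garage can be sketched in a few lines. This is an illustration only: the part names, fields, and function names are hypothetical. It stores the same car "disassembled" into per-part storage and "parked whole" as a single document, and shows which question each layout answers cheaply.

```python
# 1) Disassembled: every part type lives in its own storage area, tagged by car.
pistons = [{"car_id": 1, "position": n} for n in range(1, 5)]
brake_pads = [{"car_id": 1, "wheel": w} for w in ("FL", "FR", "RL", "RR")]
steering_wheels = [{"car_id": 1, "material": "leather"}]


def reassemble(car_id):
    """Rebuild one car by searching every storage area (the join)."""
    return {
        "car_id": car_id,
        "pistons": [p for p in pistons if p["car_id"] == car_id],
        "brake_pads": [b for b in brake_pads if b["car_id"] == car_id],
        "steering_wheels": [s for s in steering_wheels if s["car_id"] == car_id],
    }


# 2) Parked whole: the car is stored and fetched as one unit.
garage = {1: reassemble(1)}


def fetch_car(car_id):
    """Getting the car back is a single lookup."""
    return garage[car_id]


def count_brake_pads_disassembled():
    """The salvage-yard question is trivial when parts are stored by type."""
    return len(brake_pads)


def count_brake_pads_parked_whole():
    """The same question means opening up every parked car."""
    return sum(len(car["brake_pads"]) for car in garage.values())
```

Neither layout is wrong; which one fits depends on whether you mostly ask for whole cars or for parts.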
How should it model data

Relational
  Row oriented or column oriented

Key value

Document oriented

Graph oriented
How do you query it

Do you want an API, a language, or a map-reduce
style interface?

Will most of your queries be hand-typed, embedded
in code or dynamically generated
How do you handle
transactions and consistency
Do you need transactions at all
  Be careful; web services, for example, need to be able to
  assign userIDs

Do you need multi-master updates
  If so, how do you resolve conflicts

Do you need immediate consistency?
  For some queries or all?

How do you handle failures
  Are you optimizing for read availability or write
  availability
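
On resolving multi-master conflicts, one common strategy (among several, and not necessarily the right one for a given application) is last-write-wins. Below is a minimal sketch, assuming each write carries a timestamp and the name of the node that accepted it. The `Version` type and its fields are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Version:
    value: str
    timestamp: int  # assumed: a clock value attached when the write is accepted
    node: str       # assumed: name of the accepting master, used to break ties


def resolve_lww(a, b):
    """Last-write-wins: keep the later write; break timestamp ties by node name."""
    return max(a, b, key=lambda v: (v.timestamp, v.node))


# Two masters accepted different updates to the same record.
from_east = Version(value="alice@old.example", timestamp=10, node="east")
from_west = Version(value="alice@new.example", timestamp=12, node="west")
winner = resolve_lww(from_east, from_west)
```

The rule is deterministic, so every node converges on the same answer, but the losing write is silently discarded. Whether that is acceptable is exactly the question the slide asks.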
What is it

Enterprise software
Open source
  With commercial support?

Appliance
  Packaged with commodity hardware
  Specialized hardware

Cloud service
  Available for on-premise deployment?
  Integrated in another PaaS offering?
  Where on the net?
Does the data fit in memory
Transactions can be very, very fast

Do you trust enough copies in memory (perhaps
across multiple data centers) or do you require
some sort of sync to persistent storage

How big will the data be and how much do you
care about costs
What if your disks are SSD

Alleviate hotspots

Random accesses are measured in microseconds not
milliseconds

Degradation from in-memory to on-disk can be
more graceful
  But data representations on disk vs in memory may be
  very different which may create significant overhead
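
To put rough numbers on "microseconds not milliseconds", here is a back-of-the-envelope sketch. The two latency figures are assumed round numbers for illustration, not measurements from the talk or from any specific device.

```python
def random_reads_per_second(latency_seconds):
    """Upper bound on one-at-a-time random reads per second for one device."""
    return 1.0 / latency_seconds


# Assumed, illustrative latencies (not measured):
DISK_LATENCY_S = 5e-3   # about 5 milliseconds per random access on a spinning disk
SSD_LATENCY_S = 100e-6  # about 100 microseconds per random access on an SSD

disk_rate = random_reads_per_second(DISK_LATENCY_S)
ssd_rate = random_reads_per_second(SSD_LATENCY_S)
speedup = ssd_rate / disk_rate
```

Under these assumptions a single device goes from a few hundred serial random reads per second to around ten thousand, which is why a working set that spills out of memory hurts far less on SSD.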
In choosing a solution

Examine your requirements
  They will dictate certain choices

Once you have narrowed the field
  Prefer solutions that may become mainstream
  Consider TCO:
    Purchase cost
    Learning curve
    Productivity
    Viability
Which solution sets will
    become mainstream
High confidence
  Horizontally scalable: to take advantage of hardware trends
  Non-relational: to enable scalability
  Highly functional: for usage beyond mega-scale
  Developer-friendly: because decision making has shifted
  Freely available: for rapid adoption


My predictions
  Document oriented: enables scalability, functionality,
  developer friendliness, and agility
  Open source: with multiple PaaS providers

The Data Trifecta – Privacy, Security & Governance Race from Reactivity to Re...
 
Emerging Trends in Data Architecture – What’s the Next Big Thing?
Emerging Trends in Data Architecture – What’s the Next Big Thing?Emerging Trends in Data Architecture – What’s the Next Big Thing?
Emerging Trends in Data Architecture – What’s the Next Big Thing?
 
Data Governance Trends - A Look Backwards and Forwards
Data Governance Trends - A Look Backwards and ForwardsData Governance Trends - A Look Backwards and Forwards
Data Governance Trends - A Look Backwards and Forwards
 
Data Governance Trends and Best Practices To Implement Today
Data Governance Trends and Best Practices To Implement TodayData Governance Trends and Best Practices To Implement Today
Data Governance Trends and Best Practices To Implement Today
 
2023 Trends in Enterprise Analytics
2023 Trends in Enterprise Analytics2023 Trends in Enterprise Analytics
2023 Trends in Enterprise Analytics
 
Data Strategy Best Practices
Data Strategy Best PracticesData Strategy Best Practices
Data Strategy Best Practices
 
Who Should Own Data Governance – IT or Business?
Who Should Own Data Governance – IT or Business?Who Should Own Data Governance – IT or Business?
Who Should Own Data Governance – IT or Business?
 
Data Management Best Practices
Data Management Best PracticesData Management Best Practices
Data Management Best Practices
 
MLOps – Applying DevOps to Competitive Advantage
MLOps – Applying DevOps to Competitive AdvantageMLOps – Applying DevOps to Competitive Advantage
MLOps – Applying DevOps to Competitive Advantage
 

Recently uploaded

Elevating Tactical DDD Patterns Through Object Calisthenics
Elevating Tactical DDD Patterns Through Object CalisthenicsElevating Tactical DDD Patterns Through Object Calisthenics
Elevating Tactical DDD Patterns Through Object Calisthenics
Dorra BARTAGUIZ
 
zkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex Proofs
zkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex ProofszkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex Proofs
zkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex Proofs
Alex Pruden
 
State of ICS and IoT Cyber Threat Landscape Report 2024 preview
State of ICS and IoT Cyber Threat Landscape Report 2024 previewState of ICS and IoT Cyber Threat Landscape Report 2024 preview
State of ICS and IoT Cyber Threat Landscape Report 2024 preview
Prayukth K V
 
The Metaverse and AI: how can decision-makers harness the Metaverse for their...
The Metaverse and AI: how can decision-makers harness the Metaverse for their...The Metaverse and AI: how can decision-makers harness the Metaverse for their...
The Metaverse and AI: how can decision-makers harness the Metaverse for their...
Jen Stirrup
 
PHP Frameworks: I want to break free (IPC Berlin 2024)
PHP Frameworks: I want to break free (IPC Berlin 2024)PHP Frameworks: I want to break free (IPC Berlin 2024)
PHP Frameworks: I want to break free (IPC Berlin 2024)
Ralf Eggert
 
Essentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FMEEssentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FME
Safe Software
 
UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4
DianaGray10
 
Introduction to CHERI technology - Cybersecurity
Introduction to CHERI technology - CybersecurityIntroduction to CHERI technology - Cybersecurity
Introduction to CHERI technology - Cybersecurity
mikeeftimakis1
 
Enhancing Performance with Globus and the Science DMZ
Enhancing Performance with Globus and the Science DMZEnhancing Performance with Globus and the Science DMZ
Enhancing Performance with Globus and the Science DMZ
Globus
 
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdfFIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance
 
Video Streaming: Then, Now, and in the Future
Video Streaming: Then, Now, and in the FutureVideo Streaming: Then, Now, and in the Future
Video Streaming: Then, Now, and in the Future
Alpen-Adria-Universität
 
Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !
KatiaHIMEUR1
 
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdfObservability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Paige Cruz
 
Monitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR EventsMonitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR Events
Ana-Maria Mihalceanu
 
By Design, not by Accident - Agile Venture Bolzano 2024
By Design, not by Accident - Agile Venture Bolzano 2024By Design, not by Accident - Agile Venture Bolzano 2024
By Design, not by Accident - Agile Venture Bolzano 2024
Pierluigi Pugliese
 
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
James Anderson
 
Quantum Computing: Current Landscape and the Future Role of APIs
Quantum Computing: Current Landscape and the Future Role of APIsQuantum Computing: Current Landscape and the Future Role of APIs
Quantum Computing: Current Landscape and the Future Role of APIs
Vlad Stirbu
 
The Future of Platform Engineering
The Future of Platform EngineeringThe Future of Platform Engineering
The Future of Platform Engineering
Jemma Hussein Allen
 
Elizabeth Buie - Older adults: Are we really designing for our future selves?
Elizabeth Buie - Older adults: Are we really designing for our future selves?Elizabeth Buie - Older adults: Are we really designing for our future selves?
Elizabeth Buie - Older adults: Are we really designing for our future selves?
Nexer Digital
 
GraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge GraphGraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge Graph
Guy Korland
 

Recently uploaded (20)

Elevating Tactical DDD Patterns Through Object Calisthenics
Elevating Tactical DDD Patterns Through Object CalisthenicsElevating Tactical DDD Patterns Through Object Calisthenics
Elevating Tactical DDD Patterns Through Object Calisthenics
 
zkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex Proofs
zkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex ProofszkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex Proofs
zkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex Proofs
 
State of ICS and IoT Cyber Threat Landscape Report 2024 preview
State of ICS and IoT Cyber Threat Landscape Report 2024 previewState of ICS and IoT Cyber Threat Landscape Report 2024 preview
State of ICS and IoT Cyber Threat Landscape Report 2024 preview
 
The Metaverse and AI: how can decision-makers harness the Metaverse for their...
The Metaverse and AI: how can decision-makers harness the Metaverse for their...The Metaverse and AI: how can decision-makers harness the Metaverse for their...
The Metaverse and AI: how can decision-makers harness the Metaverse for their...
 
PHP Frameworks: I want to break free (IPC Berlin 2024)
PHP Frameworks: I want to break free (IPC Berlin 2024)PHP Frameworks: I want to break free (IPC Berlin 2024)
PHP Frameworks: I want to break free (IPC Berlin 2024)
 
Essentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FMEEssentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FME
 
UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4
 
Introduction to CHERI technology - Cybersecurity
Introduction to CHERI technology - CybersecurityIntroduction to CHERI technology - Cybersecurity
Introduction to CHERI technology - Cybersecurity
 
Enhancing Performance with Globus and the Science DMZ
Enhancing Performance with Globus and the Science DMZEnhancing Performance with Globus and the Science DMZ
Enhancing Performance with Globus and the Science DMZ
 
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdfFIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
 
Video Streaming: Then, Now, and in the Future
Video Streaming: Then, Now, and in the FutureVideo Streaming: Then, Now, and in the Future
Video Streaming: Then, Now, and in the Future
 
Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !
 
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdfObservability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
 
Monitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR EventsMonitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR Events
 
By Design, not by Accident - Agile Venture Bolzano 2024
By Design, not by Accident - Agile Venture Bolzano 2024By Design, not by Accident - Agile Venture Bolzano 2024
By Design, not by Accident - Agile Venture Bolzano 2024
 
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
 
Quantum Computing: Current Landscape and the Future Role of APIs
Quantum Computing: Current Landscape and the Future Role of APIsQuantum Computing: Current Landscape and the Future Role of APIs
Quantum Computing: Current Landscape and the Future Role of APIs
 
The Future of Platform Engineering
The Future of Platform EngineeringThe Future of Platform Engineering
The Future of Platform Engineering
 
Elizabeth Buie - Older adults: Are we really designing for our future selves?
Elizabeth Buie - Older adults: Are we really designing for our future selves?Elizabeth Buie - Older adults: Are we really designing for our future selves?
Elizabeth Buie - Older adults: Are we really designing for our future selves?
 
GraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge GraphGraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge Graph
 

Re-inventing the Database: What to Keep and What to Throw Away

  • 1. Reinventing the Database. Max Schireson, President, 10gen
  • 2. My background
      • At Oracle from 1994 to 2003
      • At MarkLogic from 2003 to Feb 2011
      • Joined 10gen Feb 2011
  • 3. The world has changed
                                   | 1970                                         | 2011
      Main memory                  | Intel 1103, 1k bits                          | 4GB of RAM costs $25.99
      Mass storage                 | IBM 3330 Model 1, 100 MB                     | 3TB Superspeed USB for $129
      Microprocessor               | Nearly – 4004 being developed; 4 bits and    | Westmere EX has 10 cores, 30MB L3 cache,
                                   | 92,000 instructions per second               | runs at 2.4GHz
      Motor Trend Car of the Year  | Ford Torino                                  | Chevy Volt
      President                    | Richard Nixon                                | Barack Obama
      Ted Codd                     | In his 40s                                   | Dead
      Me                           | In diapers                                   | In my 40s
  • 4. More recent changes
                             | A decade ago             | Now
      Faster                 | Buy a bigger server      | Buy more servers
      Faster storage         | A SAN with more spindles | SSD
      More reliable storage  | More expensive SAN       | More copies of local storage
      Deployed in            | Your data center         | The cloud – private or public
      Large user base        | Thousands - employees    | Millions - consumers
      Tracking               | Business transactions    | Every click and more
  • 5. Assumptions behind today's DBMS
      • Relational data model
      • Third normal form
      • ACID
      • SQL
      • Multi-statement transactions
      • Database is hardware agnostic
      • RAM is small and disks are slow
      • If it's too slow you can buy a faster computer
  • 6. Yesterday’s assumptions in today’s world
      • Scale-out is hard
      • Distributed joins are hard
      • Making two-phase commits fast is hard
      • Custom solutions proliferate
      • Too slow? Just add a cache
      • ORM tools everywhere
      • More computers and disk are nearly free, but SAN and faster computers are expensive
  • 7. Challenging some assumptions
      • Do you need a database at all?
      • How does it scale out?
      • What type of queries does it need to be able to do?
      • How should it model data?
      • How do you query it?
      • How does it handle transactions and consistency?
      • Is it enterprise software, open source, an appliance, or a cloud service?
      • Does the data fit in memory?
      • What if your disks are SSD?
  • 8. My opinions
      • Different use cases will produce different answers
      • Existing RDBMS solutions will continue to solve a broad set of problems well, but many applications will work better on top of alternative technologies
      • Many new technologies will find niches, but only one or two will become mainstream
  • 9. Do you need a database at all?
      • Can you better solve your problem with a batch processing framework?
      • Can you better solve your problem with an in-memory object store/cache?
  • 10. How does it scale out?
      • Scale-out for working set size
      • Scale-out for total data size
      • Scale-out for write volume
      • Scale-out for read volume
      • Scale-out for redundancy
      • How do you incrementally add nodes or change configuration?
      • How do you trade off query performance (which wants fewer index segments) for elasticity (which wants more index segments)?
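Slide 10 asks how nodes can be added incrementally. As a rough illustration (not something the slides prescribe), the sketch below compares naive modulo placement with a consistent-hash ring when a fifth node joins a four-node cluster. The node names, key format, and the `HashRing` class are hypothetical, chosen only for this example.

```python
import bisect
import hashlib


def stable_hash(value: str) -> int:
    """Deterministic hash (Python's built-in hash() is salted per process)."""
    return int(hashlib.md5(value.encode()).hexdigest(), 16)


def modulo_owner(key: str, nodes: list) -> str:
    """Naive placement: hash the key and take it modulo the node count."""
    return nodes[stable_hash(key) % len(nodes)]


class HashRing:
    """Hypothetical consistent-hash ring with virtual nodes."""

    def __init__(self, nodes, vnodes=100):
        # Each node is placed on the ring at `vnodes` pseudo-random points.
        self._points = sorted(
            (stable_hash(f"{node}#{i}"), node)
            for node in nodes
            for i in range(vnodes)
        )
        self._hashes = [point for point, _ in self._points]

    def owner(self, key: str) -> str:
        # The first ring point clockwise from the key's hash owns the key.
        idx = bisect.bisect(self._hashes, stable_hash(key)) % len(self._points)
        return self._points[idx][1]


keys = [f"user:{i}" for i in range(10_000)]
old_nodes = ["n0", "n1", "n2", "n3"]
new_nodes = old_nodes + ["n4"]

# Keys whose owner changes under modulo placement.
moved_modulo = sum(
    modulo_owner(k, old_nodes) != modulo_owner(k, new_nodes) for k in keys
)

# Keys whose owner changes on the ring.
ring_old, ring_new = HashRing(old_nodes), HashRing(new_nodes)
moved_ring = [k for k in keys if ring_old.owner(k) != ring_new.owner(k)]

print(f"modulo placement moved {moved_modulo} of {len(keys)} keys")
print(f"ring placement moved {len(moved_ring)} of {len(keys)} keys")
```

On the ring, a key only changes owner if the new node takes it over, so existing nodes never shuffle data among themselves. Modulo placement typically relocates most keys when the node count changes.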
  • 11. What type of queries does it need to be able to do?
      • Is a key/value store enough?
      • Will you be retrieving your data by one key or by many?
      • Is there a primary way you’ll be viewing your data?
      • Do you need specialized queries (e.g., time series, geospatial)?
  • 12. Imagine a garage…
      • You hand your valet the keys to your car
      • Before they park your car, they completely disassemble it
      • The pistons are stored in piston storage, brake pads with brake pads, steering wheels with steering wheels
      • Over time, they have storage areas for catalytic converters, DVD-based nav systems, headlight washers, and traction control systems
      • When you ask for your car back, the valet is incredibly fast at reassembly
      • One minor issue: you have to provide the disassembly and reassembly instructions, and they will be followed literally, even if you say the spare tire should be used as a steering wheel and forget to specify re-insertion of spark plugs
      • A technological marvel
      • Might be a good way to store your car if you don’t know whether you’ll be asking for a car back or lots of brake pads or pistons – for a salvage yard?
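The garage analogy on slide 12 can be sketched in a few lines. This is a toy illustration under stated assumptions: the field names (`car_id`, `pistons`, `brake_pads`), the `disassemble` and `reassemble` helpers, and the sample cars are all made up here, and plain Python lists stand in for relational tables.

```python
from collections import defaultdict

PART_TYPES = ("pistons", "brake_pads")


def disassemble(car, tables):
    """Relational style: split one car into per-part tables keyed by car_id."""
    tables["cars"].append({"car_id": car["car_id"], "model": car["model"]})
    for part_type in PART_TYPES:
        for position, part in enumerate(car[part_type]):
            tables[part_type].append(
                {"car_id": car["car_id"], "position": position, "part": part}
            )


def reassemble(car_id, tables):
    """Rebuild a car by 'joining' every part table on car_id."""
    car = dict(next(r for r in tables["cars"] if r["car_id"] == car_id))
    for part_type in PART_TYPES:
        rows = sorted(
            (r for r in tables[part_type] if r["car_id"] == car_id),
            key=lambda r: r["position"],
        )
        car[part_type] = [r["part"] for r in rows]
    return car


torino = {"car_id": 1, "model": "Torino",
          "pistons": ["p1", "p2"], "brake_pads": ["front", "rear"]}
volt = {"car_id": 2, "model": "Volt",
        "pistons": [], "brake_pads": ["front-left", "front-right"]}

tables = defaultdict(list)  # the valet's per-part storage areas
garage = {}                 # document style: the whole car parked under one key

for car in (torino, volt):
    disassemble(car, tables)
    garage[car["car_id"]] = car

# Getting a car back: one lookup versus a reassembly across tables.
car_from_documents = garage[1]
car_from_tables = reassemble(1, tables)

# The salvage-yard query is where the disassembled form is convenient.
all_brake_pads = [row["part"] for row in tables["brake_pads"]]
```

Both forms return the same car; they differ in which question is cheap to answer, which is the trade-off the slide is pointing at.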
  • 13. How should it model data?
      • Relational
      • Row oriented or column oriented
      • Key value
      • Document oriented
      • Graph oriented
  • 14. How do you query it?
      • Do you want an API, a language, or a map-reduce style interface?
      • Will most of your queries be hand-typed, embedded in code, or dynamically generated?
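To make the interface choice on slide 14 concrete, here is the same question (clicks per page) answered twice: once through a query language, using Python's built-in `sqlite3` module, and once in a map-reduce style written in plain Python. The `clicks` data, the table layout, and the function names are assumptions for illustration only; a real map-reduce framework would distribute the map and reduce steps across machines.

```python
import sqlite3
from collections import defaultdict

# Hypothetical click log: (page, visitor)
clicks = [("/home", "alice"), ("/home", "bob"), ("/pricing", "alice")]


def count_with_sql(rows):
    """Query-language style: declare the result and let the engine plan it."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE clicks (page TEXT, visitor TEXT)")
    conn.executemany("INSERT INTO clicks VALUES (?, ?)", rows)
    result = dict(conn.execute("SELECT page, COUNT(*) FROM clicks GROUP BY page"))
    conn.close()
    return result


def map_click(row):
    """Map step: emit one (page, 1) pair per click."""
    page, _visitor = row
    yield page, 1


def reduce_counts(page, values):
    """Reduce step: combine all values emitted for one page."""
    return sum(values)


def count_with_map_reduce(rows):
    """Map-reduce style: the caller supplies the map and reduce functions."""
    grouped = defaultdict(list)
    for row in rows:
        for key, value in map_click(row):
            grouped[key].append(value)
    return {key: reduce_counts(key, values) for key, values in grouped.items()}


sql_counts = count_with_sql(clicks)
mr_counts = count_with_map_reduce(clicks)
```

The SQL version is easy to hand-type or generate dynamically; the map-reduce version is ordinary code, which suits queries embedded in an application.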
  • 15. How do you handle transactions and consistency?
      • Do you need transactions at all?
      • Be careful; web services, for example, need to be able to assign userIDs
      • Do you need multi-master updates?
      • If so, how do you resolve conflicts?
      • Do you need immediate consistency? For some queries or all?
      • How do you handle failures?
      • Are you optimizing for read availability or write availability?
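Slide 15 asks how conflicts are resolved when more than one master accepts updates. One common policy is last-write-wins, sketched below. It is shown only as an example of what such a rule looks like; the slides do not endorse a particular policy, and alternatives (such as keeping both versions for the application to merge) exist. The `Version` type and `resolve_lww` function are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Version:
    """One replica's view of a value (hypothetical structure)."""
    value: str
    timestamp: int  # e.g. milliseconds since epoch, as reported by the writer
    node_id: str    # tie-breaker so every replica picks the same winner


def resolve_lww(a: Version, b: Version) -> Version:
    """Last-write-wins: keep the version with the later timestamp.

    Ties are broken by node_id, so the outcome does not depend on the
    order in which replicas compare the two versions.
    """
    return max(a, b, key=lambda v: (v.timestamp, v.node_id))


from_east = Version(value="max@example.com", timestamp=1000, node_id="east")
from_west = Version(value="max@example.org", timestamp=1005, node_id="west")

winner = resolve_lww(from_east, from_west)
```

The cost of this policy is that the losing write is silently discarded, and it depends on reasonably synchronized clocks, so it suits data where an occasional lost update is acceptable.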
  • 16. What is it?
      • Enterprise software
      • Open source
          • With commercial support?
      • Appliance
          • Packaged with commodity hardware
          • Specialized hardware
      • Cloud service
          • Available for on-premise deployment?
          • Integrated in another PaaS offering?
          • Where on the net?
  • 17. Does the data fit in memory?
      • Transactions can be very, very fast
      • Do you trust enough copies in memory (perhaps across multiple data centers), or do you require some sort of sync to persistent storage?
      • How big will the data be, and how much do you care about costs?
  • 18. What if your disks are SSD?
      • Alleviate hotspots
      • Random accesses are measured in microseconds, not milliseconds
      • Degradation from in-memory to on-disk can be more graceful
      • But data representations on disk vs. in memory may be very different, which may create significant overhead
  • 19. In choosing a solution
      • Examine your requirements
          • They will dictate certain choices
      • Once you have narrowed the field
          • Prefer solutions that may become mainstream
          • Consider TCO: purchase cost, learning curve, productivity, viability
  • 20. Which solution sets will become mainstream
      • High confidence
          • Horizontally scalable: to take advantage of hardware trends
          • Non-relational: to enable scalability
          • Highly functional: for usage beyond mega-scale
          • Developer-friendly: because decision making has shifted
          • Freely available: for rapid adoption
      • My predictions
          • Document oriented: enables scalability, functionality, developer friendliness, and agility
          • Open source: with multiple PaaS providers