Big Data & the Enterprise

Ben Stopford, Engineer at Confluent
Welcome
It used to be easy…
they all looked pretty much alike
NoSQL       BigData          MapReduce        Graph       Document
BigTable    Shared Nothing   Column Oriented  CAP         Eventual Consistency
ACID        BASE             Mongo            Cloudera    Hadoop
Voldemort   Cassandra        Dynamo           MarkLogic   Redis
Velocity    HBase            Hypertable       Riak        BDB
Now it’s downright
c0nfuZ1nG!
What Happened?
we changed scale
we changed tack
the big data conundrum?
The Internet
Which is mostly not text
Sizes in petabytes:
  Everything: 500,000
  Web pages: 40 (~0.01% of everything is web pages)
  Words: 0.6 (~1% of that is text)
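The proportions on this slide can be checked with a few lines of arithmetic. The sketch below uses only the three sizes quoted above; the dictionary keys and the `share` helper are illustrative names, not part of the original deck.

```python
# Sizes in petabytes, as quoted on the slide.
SIZES_PB = {"everything": 500_000, "web_pages": 40, "words": 0.6}


def share(part: float, whole: float) -> float:
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


# Web pages as a share of everything on the Internet.
web_share = share(SIZES_PB["web_pages"], SIZES_PB["everything"])
# Text (words) as a share of web pages.
text_share = share(SIZES_PB["words"], SIZES_PB["web_pages"])

print(f"web pages / everything: {web_share:.3f}%")
print(f"words / web pages:      {text_share:.1f}%")
```

The exact ratios come out a little different from the rounded figures on the slide, but they are the same order of magnitude, which is the point being made: text is a tiny fraction of the data that exists.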
And there are lots of other kinds of data out there:
mobile, weather, sensors, social data, logs, audio, video
Gartner

80% of business is conducted on
   unstructured information
Big Data is now a new class of economic asset*

*World Economic Forum 2012
Yet 80% of enterprise databases are < 1TB
(reference from 2009)
so what does Big Data mean for the enterprise?
Insight ∝ data
Data beats Algorithms
Backing up a bit…
We live in a world of, largely, private data
Where data is often changed and forwarded on
Sometimes we’re a bit more organized!
But most of our data is not generally accessible
[Diagram labels: Core Operational Data; Exposed]
Sharing is often an afterthought
[Diagram labels: Core Operational Data; Exposed]
How do we process, acquire, reason about and act upon information?
The Brain
   Reptilian: primitive operations (balance, temperature regulation, breathing)
   Mammalian (limbic): emotion, short-term memory, fight or flight, etc.
   Neocortex: plan, innovate, problem solve, etc.
Our intelligence is segregated in disparate worlds
could our corporations be more intelligent?
Siloed, closed, bespoke data makes our organisations opaque and unresponsive
What if we exposed it all?
So what might that look like?
•  Single data store
•  Federated, homogeneous stores
•  Federated, heterogeneous stores
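To make the third option concrete, here is a minimal sketch of what federating heterogeneous stores could look like: each store keeps its own native interface, and a thin adapter presents a common one. Every class and field name here (`Store`, `RowStore`, `KeyValueStore`, `KeyValueAdapter`, `Federation`, `customer_id`) is a hypothetical illustration, not something taken from the deck or from a real product.

```python
from typing import Dict, List, Protocol


class Store(Protocol):
    """Common query interface every federated member must offer (hypothetical)."""

    def find(self, customer_id: str) -> List[Dict]:
        ...


class RowStore:
    """Stand-in for a relational-style store: a list of row dictionaries."""

    def __init__(self, rows: List[Dict]) -> None:
        self._rows = rows

    def find(self, customer_id: str) -> List[Dict]:
        return [r for r in self._rows if r["customer_id"] == customer_id]


class KeyValueStore:
    """Stand-in for a key-value store with a different native interface."""

    def __init__(self, data: Dict[str, Dict]) -> None:
        self._data = data

    def get(self, key: str):
        return self._data.get(key)


class KeyValueAdapter:
    """Adapts the key-value store to the common Store interface."""

    def __init__(self, store: KeyValueStore) -> None:
        self._store = store

    def find(self, customer_id: str) -> List[Dict]:
        value = self._store.get(customer_id)
        return [] if value is None else [{"customer_id": customer_id, **value}]


class Federation:
    """Fans a query out to every member store and concatenates the results."""

    def __init__(self, stores: List[Store]) -> None:
        self._stores = stores

    def find(self, customer_id: str) -> List[Dict]:
        results: List[Dict] = []
        for store in self._stores:
            results.extend(store.find(customer_id))
        return results


trades = RowStore([
    {"customer_id": "c1", "trade": "T-100"},
    {"customer_id": "c2", "trade": "T-200"},
])
profiles = KeyValueAdapter(KeyValueStore({"c1": {"name": "Acme"}}))

federation = Federation([trades, profiles])
result = federation.find("c1")
print(result)
```

With homogeneous stores the adapter layer disappears, because every member already speaks the same interface; with a single store the federation layer disappears as well. The sketch says nothing about the harder parts, such as reconciling different data models across the members.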
The Google Approach
            MapReduce
            Google File System
            BigTable
            Tenzing
            Megastore
            F1
            Dremel
            Spanner
The eBay Approach
so is one approach better?
Data Volume?
[Chart: data volume axis in TB, marked 0, 1, 10, 100, 1000, 10,000]
We live well within the overlap region
Academic acumen?
Performance Trade-Off Curve
•  Volume (pure physical size)
•  Velocity (rate of change)
•  Variety (number of different types of data, formats and sources)
•  Static & Dynamic Complexity (do you need to interpret the effect one message has on another?)
Problem

Our ability to model data is much more of a gating factor than raw size, particularly when considering new forms of data

Dave Campbell (Microsoft – VLDB Keynote)
Gravitate around a single data model
[Diagram labels: Globally Accessible; Core Data Model; Application Specific Models; Views / linkages]
The data itself follows a similar pattern
[Diagram labels: Globally Accessible; Core Data; Application Specific data; Views / linkages]
Compose Solutions (for now)
Big Data is more than the opportunity for better insight over new data sources
It is the opportunity to make the organisation smarter, simply by making data more accessible
But the harder job, for us, is unifying the various domains to make all that data intelligible
Thanks



http://www.benstopford.com
       @benstopford