DATA AND DISILLUSIONMENT


SOLVEfor
INTERESTING
OTHERWISE LIFE IS DULL.
Volume (the “big” part)
Velocity (the “fast” part)
Variety (the “anything” part)
Pick any two.
Big Data is the Third Age of computing


Computing: automate things
Networking: interconnect things
Big Data: predict & change things




(Jim Stogdill of O’Reilly Radar said this.)
Enterprises expect Big Data to deliver better
decisions and improved customer experiences
What tangible benefits do you hope to achieve through your big data initiatives?
(Source: NewVantage Partners LLC, www.newvantage.com)
(And apparently Hadoop is winning)

What data management approaches are you considering?
(Source: NewVantage Partners LLC, www.newvantage.com)
The relational database is a general-purpose tool.
A library is a database optimized for retrieval




Photo by cybrgrrl (http://www.flickr.com/photos/cybrgrl/1295482521/) on Flickr
A change counter is a database optimized for insertion
An example: eventual consistency
“End of Day Balance will only appear for dates previous to
the last 2 business days.”
“Transactions from today are reflected in your balance, but
may not be displayed on this page if you recently updated
your bankbook, if a paper statement was recently issued, or
if a transaction is backdated. These transactions will appear
in your history the following business day.”
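The bank notice above can be read as eventual consistency in miniature: the balance reflects a transaction at once, while the history page catches up later. A minimal sketch of that behaviour, assuming a toy in-memory account; the Account class and its post and end_of_day methods are hypothetical names for illustration, not how any real bank works:

```python
from dataclasses import dataclass, field


@dataclass
class Account:
    """Toy account: the balance updates at once, the history view lags."""
    balance: int = 0                             # updated on every write
    pending: list = field(default_factory=list)  # accepted, not yet displayed
    history: list = field(default_factory=list)  # what the history page shows

    def post(self, amount: int) -> None:
        # The write is accepted immediately and the balance reflects it.
        self.balance += amount
        self.pending.append(amount)

    def end_of_day(self) -> None:
        # Hypothetical batch job: the history view catches up with the writes.
        self.history.extend(self.pending)
        self.pending.clear()


acct = Account()
acct.post(100)
acct.post(-30)
before = (acct.balance, sum(acct.history))  # balance is current, history is stale
acct.end_of_day()
after = (acct.balance, sum(acct.history))   # the two views now agree
```

Between the write and the batch run the two views disagree, and that is accepted by design: the system trades an always-current history for cheap, fast inserts.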
Relational




             BIG


                   Statistical
http://www.flickr.com/photos/jenny-pics/3239638494/sizes/l/




                            Breadcrumb trail
The average enterprise has 178 social media accounts




(According to @setlinger and the Altimeter Group.)
A force for good.
Ward off disease.
Pinpoint disasters.
Reveal corruption.
Make cities smarter.
Improve how we teach.
Big healthcare
Big philanthropy
Big commuting
A force for bad.
Erode our privacy.
Justify prejudices.
Polarize groups.
Leak private truths.
Big prejudice
“…nobody notices offers they do not
get. And if these absent opportunities
start following certain social patterns
(for example not offering them to
certain races, genders or sexual
preferences) they can have a deep civil
rights effect.”
                 Anders Sandberg, Oxford University
Personalization looks a lot like prejudice.
Big radio
Times a song in “heavy rotation” is played each day
2007: every 4h
2012: every 55m
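The chart's axis counts plays per day, while its labels give the interval between plays. A small conversion sketch, assuming round-the-clock airplay; the function name plays_per_day is made up for illustration:

```python
MINUTES_PER_DAY = 24 * 60


def plays_per_day(interval_minutes: float) -> float:
    """Convert 'played once every N minutes' into plays per 24-hour day."""
    return MINUTES_PER_DAY / interval_minutes


plays_4h = plays_per_day(4 * 60)  # one play every 4 hours
plays_55m = plays_per_day(55)     # one play every 55 minutes
```

Under that assumption, every 4h works out to 6 plays a day and every 55m to roughly 26.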
Humans are bad at data.
We prefer false positives.
Woolly mammoth



http://www.flickr.com/photos/pong/172438102/sizes/o/
Sun temple



http://www.flickr.com/photos/30787002@N02/3298693694/sizes/l/
Some proof.
It’s really hard to find people who can think about data well

How challenging is it to source data scientists?
(Source: NewVantage Partners LLC, www.newvantage.com)
Mistake correlation for causality
Seek truthiness rather than fact
Find patterns where they don’t exist
Easily swayed by tone
Side with our tribes
Dig in and ignore new evidence
Athenian swimming pools
Volume + Variety + Velocity = Big Data
Big Data + Veracity = Good data
525,000 state & local officers
Under 25 officers per precinct
130 million incident reports
200,000 uses of force
31% keep computer files
Evidence.com
Hard drive
Big Data is not about data.
Big Data is about truth, auditability, and the ability to analyze data on a level playing field.

It’s about analysis for everyone.
THANKS!
Alistair Croll
@acroll
www.solveforinteresting.com
alistair@solveforinteresting.com





Big data tokyo (extended version)