www.Objectivity.com
Welcome!
Webinar: Big Data – NoSQL Technology and Real-time, Accurate Predictive Analytics
© Objectivity Inc 2013
Agenda
Market Overview
• Presented by Matt Aslett, Research Director at 451 Group
Big Data Use Case
• Presented by J.C. Smart, Director Global Insight Laboratory at Georgetown
University
Q&A
• Presented by
• Matt Aslett, Research Director at 451 Group
• J.C. Smart, Director Global Insight Laboratory at Georgetown University
• Leon Guzenda, Founder at Objectivity, Inc.
© 2013 by The 451 Group. All rights reserved
▪ Matthew Aslett
• Research Director, Data Management and Analytics
▪ matthew.aslett@451research.com
▪ www.twitter.com/maslett
▪ Responsible for data management and analytics research agenda
▪ Focus on operational and analytic databases, including NoSQL, NewSQL, and Hadoop
▪ With 451 Research since 2007
Company Overview
▪ One company with 3 operating divisions
▪ Syndicated research, advisory, professional services, datacenter certification, and events
▪ Global focus
▪ 200+ staff
▪ 1,300+ client organizations: enterprises, vendors, service providers, and investment firms
▪ Organic growth and growth through acquisition
Unique combination of research, analysis & data
Emerging tech market segment focus
Daily qualitative & quantitative insight
Analyst advisory & Go-to-market support
Global events
What has driven the development and adoption of NoSQL?
▪ NoSQL, NewSQL and Beyond
• Assessing the drivers behind the development and adoption of NoSQL and NewSQL databases, as well as data grid/caching technologies
• Released April 2011
• Role of open source in driving innovation
• sales@the451group.com
▪ MySQL vs NoSQL and NewSQL
• Released May 2012
▪ Next-generation Operational Databases
• Released July 2013
SPRAINED RELATIONAL DATABASES
Photo credit: Foxtongue on Flickr
http://www.flickr.com/photos/foxtongue/4844016087/
Database SPRAIN
▪ The traditional relational database has been stretched beyond its normal capacity by the needs of high-volume, highly distributed or highly complex applications.
▪ There are workarounds – such as DIY sharding – but manual, homegrown efforts can result in database administrators being stretched beyond their normal capacity in terms of managing complexity.
▪ Scalability
▪ Performance
▪ Relaxed consistency
▪ Agility
▪ Intricacy
▪ Necessity
Increased willingness to look towards emerging alternatives
Necessity is the mother of NoSQL
▪ Hadoop and NoSQL innovation did not come from existing relational database and storage suppliers
▪ It came from Google, Amazon, Facebook, Yahoo, LinkedIn and open source communities…
▪ This has significantly altered the relationship between customer and vendor, and changed the database landscape enormously
▪ And also generated a new breed of database vendors and database products
“We couldn’t bet the company on other companies building the answer for us.”
– Werner Vogels, Amazon CTO
The NoSQL database landscape
▪ Wide-column stores: Data is mapped by a row key, column key and time stamp.
▪ Key-value stores: Store keys and associated values.
▪ Graph databases: Store data and the relationships between data.
▪ Document stores: Store all data related to a specific key as a single document.
[Diagram axis: DATA MODEL COMPLEXITY]
Multi-model databases
Support a combination of the various individual NoSQL data
models.
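The four data models on this slide can be sketched with plain Python structures. This is only an illustration of the shapes involved, not any vendor's API; every key, value and helper name here (`key_value`, `wide_column`, `latest`, `documents`, `graph_nodes`, `graph_edges`) is a made-up assumption.

```python
from collections import defaultdict

# All keys and values below are invented for illustration.

# Key-value store: one opaque value per key.
key_value = {"user:42": b"opaque-profile-bytes"}

# Wide-column store: a cell is addressed by (row key, column key, timestamp).
wide_column = defaultdict(dict)
wide_column["user:42"][("name", 1)] = "Ada"
wide_column["user:42"][("name", 2)] = "Ada L."
wide_column["user:42"][("city", 1)] = "London"


def latest(row_key, column_key):
    """Return the value with the highest timestamp for one row/column pair."""
    versions = {
        ts: value
        for (col, ts), value in wide_column[row_key].items()
        if col == column_key
    }
    return versions[max(versions)]


# Document store: everything about one key lives in a single document.
documents = {"user:42": {"name": "Ada L.", "city": "London", "accounts": ["checking"]}}

# Graph database: records plus explicit relationships between them.
graph_nodes = {"user:42": {"name": "Ada L."}, "acct:7": {"type": "checking"}}
graph_edges = [("user:42", "HOLDS", "acct:7")]

print(latest("user:42", "name"))
```

The wide-column lookup is the only one that needs a helper, because several timestamped versions of the same cell can coexist.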
The NoSQL database landscape
▪ Graph databases not only store data in a collection of key-value pairs, known as nodes and properties, but also store the relationships – or edges – that connect nodes to other nodes, or nodes to properties.
▪ Users can navigate – or traverse – the resulting graph by nodes, properties or edges to identify and analyze relationships between nodes and properties.
▪ This is inherently more flexible than traditional approaches that would require cross-table joins in relational databases.
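A minimal sketch of the idea, assuming a tiny in-memory property graph rather than any real graph database: nodes carry properties, edges are stored as first-class records, and a traversal follows edges directly instead of joining tables. The node names, edge labels and the `neighbors` and `find_path` helpers are hypothetical.

```python
from collections import deque

# Nodes with properties (all names are illustrative).
nodes = {
    "alice": {"kind": "person"},
    "bob": {"kind": "person"},
    "carol": {"kind": "person"},
    "graphs": {"kind": "interest"},
}

# Directed, labeled edges: (source, label, destination).
edges = [
    ("alice", "KNOWS", "bob"),
    ("bob", "KNOWS", "carol"),
    ("carol", "LIKES", "graphs"),
]


def neighbors(node):
    """Return (label, destination) pairs for edges leaving a node."""
    return [(label, dst) for src, label, dst in edges if src == node]


def find_path(start, goal):
    """Breadth-first traversal; returns the shortest node path or None."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for _label, nxt in neighbors(path[-1]):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None


print(find_path("alice", "graphs"))
```

In a relational schema the same question would typically need one self-join per hop; here the number of hops does not have to be known in advance.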
The NoSQL database landscape
▪ Graph databases are more than just a new way of storing data
▪ Graph databases enable analysis of not just individual or aggregate data, but also the relationships between data
▪ Graph databases potentially provide new opportunities for generating business intelligence by highlighting new patterns in data
Graph analytics
▪ The rise of graph databases is closely linked to the rise of social networking
▪ It could be argued that the most valuable assets that Facebook, Twitter and LinkedIn own are the graphs that represent the relationships between their users and their users’ interests
▪ However, the roots of graph analytics can be traced back much further, all the way to Leonhard Euler’s Seven Bridges of Königsberg, published in 1736
Seven Bridges of Königsberg (now Kaliningrad)
▪ Find a route crossing each bridge once, and only once
• Euler proved there was no solution
Source: Wikipedia http://en.wikipedia.org/wiki/File:Konigsberg_bridges.png
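Euler's result can be checked with a degree count: a connected graph has a walk that uses every edge exactly once only if zero or two of its vertices have an odd number of edges. The sketch below encodes the four land masses as `A` to `D` (labels chosen here for illustration, with `A` as the central island) and the seven bridges as edges; it assumes the graph is connected and does not verify that.

```python
from collections import Counter

# The seven bridges as edges between four land masses.
# Labels A-D are illustrative; A is the central island.
BRIDGES = [
    ("A", "B"), ("A", "B"),
    ("A", "C"), ("A", "C"),
    ("A", "D"),
    ("B", "D"),
    ("C", "D"),
]


def degrees(edges):
    """Count how many bridge ends touch each land mass."""
    deg = Counter()
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg


def has_euler_walk(edges):
    """True if a connected multigraph has a walk using each edge exactly once.

    Connectivity is assumed, not checked.
    """
    odd = sum(1 for d in degrees(edges).values() if d % 2 == 1)
    return odd in (0, 2)


print(dict(degrees(BRIDGES)))
print(has_euler_walk(BRIDGES))
```

All four land masses have an odd degree (5, 3, 3 and 3), so no such route exists.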
Seven Bridges of Königsberg (now Kaliningrad)
▪ Relevance today:
• Google uses graph theory to find the most efficient routes for Street View cars to capture images for Google Maps
Other applications
▪ Less obvious applications include customer management
• E.g. Financial services firm with multiple business units
• What happens when an individual has multiple customer relationships?
• Graph analysis to identify multiple services related to an individual
[Diagram: PARENT CO hierarchy with boxes LOAN, BANKING, CHECKING, CREDIT CARD, INSURANCE, PENSION, HOUSE INSURANCE, CAR INSURANCE]
• And provide a customer-centric relationship perspective
[Diagram: CUSTOMER linked to PENSION, LOAN, CHECKING, HOUSE INSURANCE]
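The shift from a product-centric to a customer-centric view can be sketched as regrouping "holds" edges by customer. The customer identifiers, the `HOLDS` edge list and both helper functions are hypothetical; the product names are taken from the slide.

```python
from collections import defaultdict

# Edges from customers to the products they hold (customer IDs are invented).
HOLDS = [
    ("cust-1", "PENSION"),
    ("cust-1", "LOAN"),
    ("cust-1", "CHECKING"),
    ("cust-1", "HOUSE INSURANCE"),
    ("cust-2", "CREDIT CARD"),
]


def customer_view(holds):
    """Group products by customer: the customer-centric perspective."""
    view = defaultdict(set)
    for customer, product in holds:
        view[customer].add(product)
    return view


def multi_relationship_customers(holds):
    """Customers who hold more than one product across business units."""
    return sorted(c for c, products in customer_view(holds).items() if len(products) > 1)


print(multi_relationship_customers(HOLDS))
print(sorted(customer_view(HOLDS)["cust-1"]))
```

In practice the hard part is usually recognizing that records held by different business units belong to the same individual; this sketch assumes that matching has already been done.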
Exploratory analysis/discovery
▪ While BI involves analyzing data for answers to existing questions, exploratory analytics/discovery involves exploring patterns in data to prompt new questions
▪ This search for patterns requires a platform that offers more flexibility than the schema-on-write approach of the EDW and traditional analytics
• Statistical analytics
• Predictive analytics
• Machine learning
▪ The search for patterns also lends itself to analyzing not just data, but relationships between data
• Graph analysis
Conclusion
▪ NoSQL development was driven by the need for new approaches to scalability, performance, consistency, agility and intricacy
▪ Initiated by Web startups, it has generated a new breed of database vendors and database products
▪ Graph databases enable analysis of not just individual or aggregate data, but also the relationships between data
▪ While the rise of graph databases is closely linked to the rise of social networking, use-cases include anything that involves relationships between entities
▪ Graph databases are expanding the market for analytics
Questions? Comments?
matthew.aslett@451research.com
@maslett
Big Data Use Case:
Georgetown University
J. C. Smart, Ph.D.
Georgetown University
August 2013
Global Insight
The world is an important place…
...and it has a few problems
7 billion people, 40,000 cities, 5 billion cell phones, 800 million vehicles, 12 million miles of paved roads, 50,000 airports, ...
The world is a complex system of
interdependent complex systems
Climate Population Political Energy
Social Poverty Transportation Trade
Communications Terrorism Crime Health
There is an enormous diversity of topics,
scales, fidelity, time, duration, …
Geospatial, cyberspatial, real-time, historical,
predictive, hypothetical, virtual, on and on….
Data exists in many different forms….
Real-time Feeds Applications Databases Spreadsheets
Files Photos Audio Sensors
Websites Models Systems Plans/Maps
The “High-Yield” Knowledge Phenomena
[Chart labels: High-Yield Potential; Low-Yield Potential; “?”; Information Inferiority (“Some things, Some of the time, Somewhere”); Information Superiority (“Anything, Anytime, Anywhere”); Intelligence Starvation; Knowledge Gap; “Critical Mass”; Intelligence Saturation]
9/3/2013
Why is “connecting-the-dots” so hard?
• Plumbing: Massive logistics problem to integrate thousands of government/non-government data systems at scale
  Different standards, models, security, infrastructure, procedures, policies, networks, access, compartments, applications, tools, protocols, etc. … all at immense scale!
• Protection: Large-scale integration of data resources increases cyber security risks
  Prevention of adversary exploitation of strategic national assets.
• Patterns: Lack of analytic algorithm techniques to automatically detect data patterns and alert
  Transition from “analytic dumpster diving” to early-warning indication and real-time notification
• Privacy: Significant tension between security and liberty
  Who trusts the “watchers”?
  Who watches the watchers?
The FOUR-Color Framework
Overview
[Diagram labels: Black Layer; Knowledge Space; Analytic (repeated); Analytic Engine (×4); API (×4)]
Global insight is now possible!
• Techniques derived from innovations at LLNL, DoD,
Raytheon, Georgetown, [many others] – enabled by
HPC
• Extremely powerful, very effective, not for the timid
• Represents global systems
as trillions of interacting
objects
• Scaling, privacy, and
protection achieved through
a unique data to information
transformation (overlay)
technique
Q&A
A copy of the webinar, including the Q&A, will be available online at
www.Objectivity.com.
A follow-up email with answers to questions that were not
answered live will be sent out after the webinar.
Thank you for joining us!
