Silicon Valley NOSQL Meetup, April 2012
 

Join Objectivity, Inc.’s VP of Product Management, Brian Clark, in a discussion of the latest trends in Big Data Analytics, defining what Big Data is and understanding how to maximize your existing architectures by using NOSQL technologies to improve functionality and provide real-time results. There will be a focus on relationship analytics as well as an introduction to NOSQL data stores and to object and graph databases, including the architecture behind Objectivity/DB and InfiniteGraph.

  • Please see the additional presentation giving an overview of the Big Data Analytics landscape for references (separate attachment).
  • CUNA Mutual – social CRM application to help sell financial products.
  • Broad Area Maritime Surveillance Unmanned Aircraft System.
  • Key points on what relationship analytics is:
    • Discovering a relationship between two data nodes
    • Generally via several degrees of separation
    • Endpoints typically represent “targets”
    • Links or associations between targets form paths
    • Links may be phone calls, transactions, meetings, etc.
  • Objectivity/DB supports major object languages such as Java, C++, C# .NET, Python and SQL++. Objects created in any supported language can be accessed by any other supported language. For instance, a high-performance data ingest application can be written in C++, and these objects can then be accessed by a GUI application written in Java. Python can be used to quickly develop new tools and utilities and to prototype new algorithms. SQL++ supports access via ODBC-compliant tools such as Microsoft Access.
    Objectivity/DB runs on many different platforms including Windows, Linux, other major Unix platforms and real-time operating systems. Data written on any platform can be accessed from any other supported platform; Objectivity/DB transparently handles any necessary data conversions. You can preserve your investment in older languages and platforms while upgrading to new languages and platforms.
    Dynamic Schema Evolution supports changing the language class definitions and recompiling the application with transparent migration of the objects. Alternatively, the developer can use an Objectivity product, Active Schema, to dynamically create and modify class definitions and object instances, or can even implement a meta-schema (sometimes called a schema of schema). All this allows a system to change to keep up with the dynamically changing, distributed real world.
  • Big Data Market forecast from JMP Securities Industry Overview (11-15-11), page 1 of 6. Objectivity, Inc.: growth over the last 12 months (Jan-Dec 2011) was 45% over 2010. Profitable in 7 of the last 10 years.

Presentation Transcript

  • Maximize your Data with Real-time Big Data Analytics using NOSQL Technologies. Silicon Valley NOSQL Meetup Group, Thursday, April 26, 2012 – Brian Clark. 5/4/2012 © Objectivity Inc 2012
  • Agenda
    • About me!
    • Objectivity, Inc.
    • NOSQL
    • Big Data
    • Use Cases
    • InfiniteGraph and Objectivity/DB Overview
    • Demo
    • Q&A
  • School - The 3 R’s
    • Reading
    • wRiting
    • aRithmetic
    • I knew I was in trouble!
  • University - The 3 B’s
    • Bands (Friday night Hop)
    • Booze
    • Birds
    • I knew I was in trouble!
    • = a job as a mainframe computer operator
  • A Brief History of Computing (Copyright © 2008 Altair Engineering, Inc. Proprietary and Confidential. All rights reserved.)
    • Make One Big Computer
    • 1970s: Network Operating Systems
    • 1980s: Distributed Operating Systems
    • 1990s: NOWs & Clusters
    • 2000s: Grid Computing
    • 2010s: Cloud Computing
  • A Brief History of Databases
    • 1960’s, Hierarchical Model: physical pointers
    • 1960’s, Network Model: many-to-many relationships, but still too rigid
    • 1970’s, Relational Model: physical independence with SQL
    • 1990’s, Object-Oriented: performance, complexity and scalability
  • Objectivity, Inc.
    • The world today is about big data, distributed objects and connections between them.
    • Objectivity/DB™: Distributed big data and object management.
    • InfiniteGraph™: Connects the dots on a global scale.
  • NOSQL
  • InfiniteGraph in the “NOSQL” Market
  • The Right Tool for the Right Job (1 of 2)
    First, a truism:
    • The closer the data model matches the data store structure, the faster queries can be executed, the higher the scalability, and the easier it is to write applications.
    • One size doesn’t fit all, and multiple tools might join forces to fully solve a problem.
    Relational Databases
    • Data represented by rows (records) and columns (attributes); a schema defines the columns and their distribution amongst tables.
    • Versatile, can solve most data storage and access problems; can solve all if scale is limited.
    • Good for producing lists of data based on a value in that data, such as a list of customers with unfilled orders.
    Hadoop/MapReduce
    • General purpose parallel processing and storing facility for massive amounts of data.
    • Data store is a file system, not a database.
    • Good for problems that can be broken into many small parts and processed independently, and done so offline, such as the ETL (extract, transform, load) process for preparing and moving captured data into a data warehouse.
    Object Databases
    • Data represented by objects, which are groups of attributes; schema defines the attributes, which may include pointers (relationships) to other objects.
    • Ability to store and retrieve whole objects makes access to a set of data very fast; tighter connection to the object-oriented programming application reduces complexity.
    • Good for accessing massive amounts of data about related items, such as a user’s account history.
  • The Right Tool for the Right Job (2 of 2)
    Key-Value Databases
    • Rows and columns like a relational database, but only 2 columns, making it an indexing system (find a value based on the key).
    • No schema required, so the value could be anything, such as an object or a pointer to data in another data store.
    • Very fast for indexing, such as looking up a user’s shopping cart on an ecommerce site.
    Column Family Databases
    • Rows and columns like a relational database, but storage on disk is organized so as to make attributes (columns) highly accessible without accessing the whole of the associated record (row).
    • Results in very fast actions regarding attributes, such as calculating average age.
    Document Databases
    • Similar to object database, but without the need to predefine an object’s attributes (i.e., no schema required).
    • Provides flexibility to store new types or unanticipated sizes of data/objects during operation, on the fly, such as event logging where the data format is unpredictable and not just simple text (e.g., video).
    Graph Databases
    • Similar to object database, but the objects and relationships between them are all objects with their own respective sets of attributes.
    • Enables very fast queries when the value of the data is in the relationships, i.e. relationships between people/items:
      – Are two people/items related (even if separated by several levels of relationship)?
      – Where the relationships represent costs, what is the optimal combination of groups of people/items?
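The key-value and column-family points on this slide can be illustrated with a small in-memory sketch. It is not how any particular product stores data; the sample records and the names `rows`, `columns`, `carts` and `lookup_cart` are assumptions made up for illustration. It only shows why a column layout lets an aggregate such as average age read one attribute, and why a key-value store behaves like an index.

```python
from statistics import mean

# Row-oriented layout: each record is kept together (hypothetical sample data).
rows = [
    {"name": "Alice", "age": 34, "city": "Denver"},
    {"name": "Bob", "age": 41, "city": "Boston"},
    {"name": "Carlos", "age": 29, "city": "Austin"},
]

# Column-oriented layout: each attribute is kept together.
columns = {
    "name": [r["name"] for r in rows],
    "age": [r["age"] for r in rows],
    "city": [r["city"] for r in rows],
}


def average_age_row_store(records):
    """Visits every whole record just to read one attribute."""
    return mean(record["age"] for record in records)


def average_age_column_store(cols):
    """Reads only the 'age' column; the other attributes are never touched."""
    return mean(cols["age"])


# Key-value layout: a two-column index from a key to an opaque value.
carts = {"user:42": ["book", "lamp"]}


def lookup_cart(user_key):
    """Find a value based on the key; unknown keys give an empty cart."""
    return carts.get(user_key, [])
```

Both average functions return the same number; the difference is only in how much data each one has to touch.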
  • Big Data
  • Big Data
    • Volume
    • Velocity
    • Variety
    = VALUE!
    Requires new ways of thinking – distributed data and processing
  • Parallel Processing and Storage
    Apache HADOOP
    • Map/Reduce – Distributed processing.
    • HDFS – Distributed file system.
    • HBase – Distributed storage for large tables.
    • Cassandra – Multi-master database with no single point of failure.
    InfiniteGraph
    • Distributed processing - Peer-to-peer servers and clients anywhere in the network.
    • Distributed data - Federation of databases anywhere in the network.
    • Standard filesystem - Random I/O for fast navigational queries.
    • Single logical view of all data in the federation - Any client anywhere can access server anywhere.
  • Common Big Data Architecture (diagram)
    • Layers: Application; Data Aggregation & Analytics; Commodity Linux Clusters or High Performance Compute platforms
    • Data stores, from structured through semi-structured to unstructured: RDBMS, Data Warehouse, Column Stores, Graph DB, Object DB, Hadoop BigTable, Key-Value Stores, Document DB
  • Common Big Data Architecture (diagram)
    • Components: Raw Data; Front End Processing; RDBMS, Hadoop and other stores; Visualization and Analytics tools
    • Stages: Observe, Orient, Decide, Act
    • The strategic competitors are all moving in this direction for Big Data
  • Big Data Analytics Solutions (vendor diagrams; labels as extracted)
    • EMC: Greenplum Data Analytics Greenplum Data Applications Greenplum Integration Raw Data Hadoop Accelerator
    • IBM: Infosphere Infosphere IBM Front End BigInsights DB2 Processing Raw Data Warehouse Hadoop
    • Oracle: Oracle In- Oracle Database Oracle Oracle Cloudera Data Raw Data Analytics 11g NoSQL Hadoop Integrator
    • HP: Vertica Front End Autonomy Raw Data Database Processing
  • Big Data Landscape
    • All current solutions have the same basic architecture model.
    • None of the current solutions have a way to store connections between entities in the different silos.
      – Analytics today focuses on the nodes of data (quantifiable occurrences) rather than the relevant connections or edges between the nodes (qualitative occurrences).
    • Objectivity has a proven way to efficiently store, manage and query the relationships and connections between data.
  • Disruptive Big Data New Architecture (diagram)
    • The Proven Connection Store: Objectivity/DB and/or InfiniteGraph
    • Components: Raw Data; Front End Processing; RDBMS, Hadoop and other stores; Visualization and Analytics tools
    • Legend: one symbol represents data nodes; the other represents bidirectional relationships/connections between data.
  • Why We’re Different
    • Relational databases are not optimized to understand objects or connections.
    • Objectivity/DB™ is all about objects and relationships.
    • InfiniteGraph™ is all about the connections as first class citizens.
  • Use Cases & Challenges
  • Relationships are everywhere
    • Intelligence (Government & Business)
    • CRM, Sales & Marketing
    • Network Mgmt, Telecom
    • PLM (Product Lifecycle Mgmt)
    • Finance
    • Healthcare
    • Social Networks
    • Logistics
    • Master Data Management
    • Research: Genomics
  • Financial Services: Fraud Detection
    – Problem: Detect patterns of fraudulent activities before damage is done
    – Solution: Real-time identification of inconsistencies enables instantaneous notification to security systems
    – Results:
      • Improved banking security and client confidence
      • Reduction of lost revenues
      • Improved efficiency allows fraud-detection teams to develop and deploy additional services
  • Application Development: The “Facebook” For Education
    – Problem: Develop system capable of handling exponential user-base growth
    – Solution: Leverage InfiniteGraph’s scalability and performance to support real-time relationship information between all members and to act as primary DB for all topics and users
    – Results: Complete social networking site allowing global users to access courses from leading institutions and to collaborate effectively with other students and teachers
  • Use Case – Confidential Ad Placement Network
    • Ad placement on smart phone based on user profile and location data generated by opt-in application (e.g., a free game).
    • Location data captured and distilled by Cassandra (key-value/column family hybrid database).
    • Locations matched with geospatial data to refine user interests.
    • As ad placement orders arrive, InfiniteGraph matches groups of users with ads, maximizing relevance for the user, value for the advertiser and revenue for the ad placement company.
  • Government: Broad Area Maritime Surveillance UAS
    – Problem: Monitor potential threats across open oceans and remote areas on a 24/7 basis
    – Solution: Use Objectivity/DB to develop a system for unmanned aircraft to capture and transmit real-time data of any type for analysis and sharing
    – Results: A federated view of maritime surveillance and continuous reconnaissance capability for mission, reconnaissance, and communications assessments
  • Healthcare: Bring together doctors, patients, and their records
    – Problem: As patients move between doctors, manage their records globally to better capture and understand symptoms, causes, and interdependencies and to improve diagnoses
    – Solution: Create a database using Objectivity/DB and InfiniteGraph capable of managing real-time entries of patient visits, symptoms, diagnoses, reactions to medications, and progress
    – Results:
      • Improved times to more accurate diagnoses
      • Creation of a knowledge base of similar medical cases
      • Increased success rates of initial prescriptions based on historical recommendations
  • Network Centric Collaborative Targeting
    • Team: Objectivity, L-3, and Lockheed
    • U.S. Air Force’s Network Centric Collaborative Targeting (NCCT)
    • U.S. Navy’s Cooperative Engagement Capability (CEC) system
  • NCCT - Customer Challenge
    Siloed systems with individual reports did not provide solutions:
    • Time-sensitive targets were hard to find
    • Sensors operated as independent systems
    • The performance of each individual sensor is very good (great ears and eyes), but collectively they lack a central nervous system
    • Mountains of data are coming from sensors
    • Existing sensors alone cannot reliably find highly mobile, moving and/or spoofing targets
  • NCCT - Technical Solution Architecture
    1. Build a distributed system that could support multi-agency platform requirements
    2. Collect data from any number of high-volume sources
    3. Provide a data architecture that supported the need to correlate and fuse collected data into a single view of the targets
    4. Support a near real-time data reporting C4ISR system
  • Intelligence - Customer Need
    • Collect 400,000,000 phone calls, plus addresses, emails, meetings…
    • Find the links between callers
    • Deliver all the possible connections between them in seconds
  • Intelligence Problem - Performance
    With a relational product:
    • Initial attempts to traverse links across the database literally shut down the server.
    • After much server and database optimization, a process could be run on a single query and would produce a result over a 48-hour period.
    • Results were unacceptable…
    With Objectivity:
    • The many-to-many data application was an excellent fit for Objectivity.
    • We then developed a proof-of-concept that delivered results showing 5-6 degrees of separation within about 1 minute, running on a laptop computer.
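The link-finding task described on these two slides is, at its core, a shortest-path search over call records. A minimal sketch of that idea follows, using a plain breadth-first search in memory. It is not Objectivity's implementation; the function name `degrees_of_separation`, the `(caller, callee)` input format and the `max_degrees` parameter are assumptions for illustration.

```python
from collections import deque


def degrees_of_separation(calls, start, target, max_degrees=6):
    """Return the shortest chain of people linking start to target, or None.

    calls is an iterable of (caller, callee) pairs; links are treated as
    undirected, and the search gives up beyond max_degrees hops.
    """
    # Build an adjacency map so each hop is a direct lookup, not a table join.
    neighbors = {}
    for caller, callee in calls:
        neighbors.setdefault(caller, set()).add(callee)
        neighbors.setdefault(callee, set()).add(caller)

    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        person = path[-1]
        if person == target:
            return path
        if len(path) - 1 >= max_degrees:
            continue  # hop limit reached: do not expand this path further
        for other in sorted(neighbors.get(person, ())):
            if other not in visited:
                visited.add(other)
                queue.append(path + [other])
    return None
```

Each step depends on the result of the previous one, which is why a relational engine needs a self-join per hop while a graph store can follow stored links directly.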
  • InfiniteGraph & Objectivity/DB Technical Overview
  • What is a graph database?
    • Optimized around data relationships
      – Relationships as first class citizens
      – Super fast traversal between entities
      – Rich/flexible annotation of connections
    • Small focused API (typically not SQL)
      – Natively work with concepts of Vertex/Edge
      – SQL has no concept of “navigation”
    • Graphs grow quickly, e.g.
      – Billions of phone calls / day in US
      – Emails, social media events, IP traffic
      – Financial transactions
    • Some analytics require navigation of large sections of the graph
    • Each step (often) depends on the last
    • Must distribute data and go parallel
  • Database Data Representation
    • Traditional databases are good at recording things, not events or relationships
    • Rows/Columns/Tables:
      Meetings: P1 | P2 | Place | Time
        Alice | Bob | Denver | 5-27-10
      Calls: From | To | Time | Duration
        Bob | Carlos | 13:20 | 25
        Bob | Charlie | 17:10 | 15
      Payments: From | To | Date | Amount
        Carlos | Charlie | 5-12-10 | 100000
    • Relationship/Graph Optimized:
      Alice, Met (5-27-10), Bob
      Bob, Called (13:20), Carlos
      Bob, Called (17:10), Charlie
      Carlos, Paid (100000), Charlie
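The contrast on this slide can be made concrete with a short sketch that turns the three tables into a single list of attributed edges, so that one lookup answers "who is connected to this person, and how?". The data values come from the slide; the `Edge` class and the `build_edges` and `connections` helpers are hypothetical names for illustration and are not the InfiniteGraph API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Edge:
    """A relationship stored as its own object, with its own attributes."""
    kind: str
    source: str
    target: str
    attributes: tuple = ()


# The three tables from the slide, as rows.
meetings = [("Alice", "Bob", "Denver", "5-27-10")]
calls = [("Bob", "Carlos", "13:20", 25), ("Bob", "Charlie", "17:10", 15)]
payments = [("Carlos", "Charlie", "5-12-10", 100000)]


def build_edges():
    """Convert every table row into an Edge, keeping its columns as attributes."""
    edges = []
    for p1, p2, place, time in meetings:
        edges.append(Edge("met", p1, p2, (("place", place), ("time", time))))
    for src, dst, time, duration in calls:
        edges.append(Edge("called", src, dst, (("time", time), ("duration", duration))))
    for src, dst, date, amount in payments:
        edges.append(Edge("paid", src, dst, (("date", date), ("amount", amount))))
    return edges


def connections(edges, person):
    """Return (kind, other person) for every edge that touches person."""
    result = []
    for edge in edges:
        if edge.source == person:
            result.append((edge.kind, edge.target))
        elif edge.target == person:
            result.append((edge.kind, edge.source))
    return result
```

With the tables, finding everything about Bob means querying Meetings, Calls and Payments separately; with edges, `connections(build_edges(), "Bob")` covers all relationship types in one pass.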
  • Viewing the Data: The InfiniteGraph Visualizer will need this name to display the contents of the graph database.
  • InfiniteGraph™
    • Connects the dots on a global scale.
    • InfiniteGraph™ finds connections in big data.
  • Find Answers Faster with InfiniteGraph™ Distributed Graph Database
  • InfiniteGraph’s Unique Advantages
    • Supports large scale and distributed systems.
    • Proven technology and deployments.
    • Flexible and easy: distributed and cloud ready, Java on interoperable platforms, integrates with most other data stores, supports ACID to flexible modes.
  • InfiniteGraph Basic Architecture (diagram, top to bottom)
    • User Apps
    • Blueprints
    • InfiniteGraph - Core/API
    • Management Extensions; Navigation Execution; Session / TX Management; Placement; Configuration
    • Distributed Object and Relationship Persistence Layer
  • InfiniteGraph Features
    • Distributed parallel ingest.
    • Flexible distributed storage management.
    • Node naming and indexing for fast lookup.
    • User controlled navigational queries – using node and edge filters.
    • Navigator plug-in architecture for sharing plug-ins with the visualizer.
    • InfiniteGraph Visualizer.
    • Blueprints support via Gremlin.
  • Objectivity/DB Basic Architecture (diagram, top to bottom)
    • User Application
    • C#/.NET, Java API, Python API, ULB
    • C++ Public API
    • Objy Kernel
    • I/O Manager
    • Page Server (AMS); Lock Server; Query Server
  • Distributed Data / Processing (diagram)
    • Distributed Persistent Store: scale out over the network; Distributed Data Management
    • Federated Persistent Store: scale out over SAN; Federated Data Management
    • Single Logical View: all clients and servers see all data.
  • Distributed Data Architecture
    • Federation (schema & catalog) contains Databases; each Database contains Containers (64K); each Container contains Pages (64K); each Page contains Slots
    • 64 Bit OID (Object ID), e.g. #21538 - 1874 - 9638 - 164 (Database - Container - Page - Slot)
    • 1,000’s trillions of unique objects
    • 1,000’s petabytes storage
    • Logical/physical indirection at every segment
    • Resolving ID fast regardless of number of objects
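One way to picture a four-part 64-bit object ID is as four fields packed into a single integer. The sketch below assumes each of the Database, Container, Page and Slot fields is 16 bits wide, which is consistent with the "64 Bit" and "64K" labels on the slide but is an assumption here, not a statement of Objectivity/DB's actual on-disk format. The function names `pack_oid` and `unpack_oid` are hypothetical.

```python
FIELD_BITS = 16               # assumed width of each OID field
FIELD_MAX = (1 << FIELD_BITS) - 1


def pack_oid(database, container, page, slot):
    """Pack four 16-bit fields into one 64-bit integer (assumed layout)."""
    fields = (database, container, page, slot)
    for value in fields:
        if not 0 <= value <= FIELD_MAX:
            raise ValueError(f"OID field out of range: {value}")
    oid = 0
    for value in fields:
        oid = (oid << FIELD_BITS) | value
    return oid


def unpack_oid(oid):
    """Split a 64-bit OID back into (database, container, page, slot)."""
    fields = []
    for _ in range(4):
        fields.append(oid & FIELD_MAX)
        oid >>= FIELD_BITS
    return tuple(reversed(fields))


# The example OID shown on the slide.
example = pack_oid(21538, 1874, 9638, 164)
```

Because each field can be resolved independently (database, then container, then page, then slot), locating an object takes a fixed number of steps no matter how many objects exist, which matches the slide's point about resolving IDs quickly.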
  • Distributed Processing Architecture
    • Simple, distributed servers
    • Client: Application, Cache, Objectivity/DB
    • Servers: Lock Servers, Data Servers, Query Agents
    • Put the data and processing where it’s needed
  • Flexibility – language interoperability
    • Java App, C++ App, C# App and Python App, each on Objectivity/DB, sharing objects A B C D E F
  • Flexibility – heterogeneous platforms
    • Unix (Sun, HP), Linux, Wintel and Mac OSX, connected over the network to shared storage
  • InfiniteGraph™ - Link Hunter demonstration
  • Comprehensive Online Resources
    • InfiniteGraph Developer Wiki
    • Google Group for Developers
    • Product Documentation
    • Download
    • Our Blog
    • InfiniteGraph.com (main site, InfiniteGraph content and messaging)
  • Company Snapshot
    Corporate
    • Established in 1988
    • Headquartered in Sunnyvale, California
    • NOSQL platform for managing and discovering relationships between complex data
    Products
    • Objectivity/DB™: Object-oriented data management system that manages localized, centralized or distributed databases
    • InfiniteGraph™: New massively scalable graph database that enables organizations to find, store, and exploit the relationships hidden in their data
    Market Opportunity
    • Big Data Market forecasted to be $11.6B in 2012, with CAGR of 28.0% over the next 5 years
    • 40% per year data growth, cloud adoption, mobile usage and improved real-time, predictive analytics underpin Objectivity’s growth opportunities
    • Strategically positioned as key Big Data enabler that pulls through servers, DBs and file stores
    Customers
    • Deeply embedded in nearly 90 enterprises and government organizations
    • Competitive advantages in Big Data with strong IP and patent position
    • Growing pipeline of near-term opportunities across expanding use cases
    Financials & Ownership
    • Generating increased revenues in last twelve months
    • Profitable and cash flow positive; no debt
    • Ownership: Privately held by employees and venture investors
  • Brian Clark, VP Product Marketing, Objectivity Inc.
    http://www.infinitegraph.com
    http://www.objectivity.com