SlideShare a Scribd company logo
Big Data and the Multi-model
Database
Heli Helskyaho
DB Tech Showcase Tokyo, 2018
Copyright © Miracle Finland Oy
 Graduated from University of Helsinki (Master of Science, computer science), currently
a doctoral student, researcher and lecturer (databases, Big Data, Multi-model
Databases, methods and tools for utilizing semi-structured data for decision making) at
University of Helsinki
 Worked with Oracle products since 1993, worked for IT since 1990
 Data and Database!
 CEO for Miracle Finland Oy
 Oracle ACE Director
 Ambassador for EOUC (EMEA Oracle Users Group Community)
 Listed as one of the TOP 100 influences on IT sector in Finland (2015, 2016, 2017)
 Public speaker and an author
 Author of the book Oracle SQL Developer Data Modeler for Database Design Mastery
(Oracle Press, 2015), co-author for Real World SQL and PL/SQL: Advice from the Experts
(Oracle Press, 2016)
Introduction, Heli
Copyright © Miracle Finland Oy
Copyright © Miracle Finland Oy
References
 [1] Marcello Buoncristiano, Giansalvatore Mecca, Elisa Quintarelli, Manuel Roveri,
Donatello Santoro, Letizia Tanca: Database Challenges for Exploratory Computing.
SIGMOD Record 44(2): 17-22 (2015).
 [2] Stratos Idreos, Olga Papaemmanouil, Surajit Chaudhuri: Overview of Data
Exploration Techniques. SIGMOD Conference 2015: 277-281.
 [3] Zhen Hua Liu, Dieter Gawlick: Management of Flexible Schema Data in RDBMSs -
Opportunities and Limitations for NoSQL. 7th Biennial Conference on Innovative Data
Systems Research (CIDR ’15) January 4-7, 2015, Asilomar, California, USA.
 [4] Z. H. Liu, B. Hammerschmidt, D. McMahon, Y. Liu, H.J.Chang: Closing the functional
and Performance Gap between SQL and NoSQL. SIGMOD Conference 2016: 227-238
 [5] Z. H. Liu, B. Christoph Hammerschmidt, D. McMahon: JSON data management:
supporting schema-less development in RDBMS. SIGMOD Conference 2014: 1247-1258
 [6] https://docs.oracle.com/database/122/ADJSN/json-dataguide.htm#ADJSN-GUID-
219FC30E-89A7-4189-BC36-7B961A24067C
Copyright © Miracle Finland Oy
 Solutions for storing and retrieving data
 The network data model, the hierarchical data model
…
 The relational data model
 The relational database management system (RDBMS)
 has been de facto as a data model solution for decades
 based on solid theory
 standardized environment for storing and retrieving
data
The history of data models, briefly
Copyright © Miracle Finland Oy
 of software development and programming
languages leads to new demands for data models
The evolution…
Copyright © Miracle Finland Oy
 The need to save and retrieve objects easily using
object-oriented programming languages
 -> Object databases
 -> Later these features were added to a RDBMS
making it an ORDMS (Object-Relational DataBase
Management System)
 Supported in Oracle Database
Objects
Copyright © Miracle Finland Oy
 To transfer data in a standardized way
 The need to save and retrieve XML documents easily
 XML Databases
 Later these features were added to the RDBMS
 Supported in Oracle Database (XMLType and its
methods, SQL functions, indexing, registering XML
Schema,…)
XML (Extensible Markup Language)
Copyright © Miracle Finland Oy
 The need to save and retrieve spatial data easily
 Spatial Databases
 Later these features were added to the RDBMS
 Supported in Oracle Database
Spatial
Copyright © Miracle Finland Oy
 Relational Data: ”schema first, data later”
 Flexible Schema Data (FSD): ”data first, schema
later/never”
New demand for a Data Model: FSD
Copyright © Miracle Finland Oy
 An ”umbrella” for different solutions, each for a certain
purpose/problem
 Document Stores
 Key-value pair stores
 Column stores
 Graph stores
 No ACID (Atomicity, Consistency, Isolation, and
Durability) but maybe BASE (Basically Available, Soft
state, Eventual consistency).
 Scale-up vs scale-out
NoSQL
Copyright © Miracle Finland Oy
 JSON for Javascript programming, lighter version of
XML
 The need to save and retrieve JSON (Javascript
Notation) documents easily
 Supported in Oracle Database (CLOB, NCLOB)
Document Stores
Copyright © Miracle Finland Oy
 something between semi-structured and text, key is
structured and value is text
 Supported in Oracle Database (+Oracle NoSQL)
Key-value pair stores
Copyright © Miracle Finland Oy
 Data stored in columns, not rows
 Supported in Oracle Database (in-memory column
store)
Column stores
Copyright © Miracle Finland Oy
 About relationships
 Nodes/vertices, links/edges and properties
 Each vertix and each edge has a collection of key-value
properties
 A graph can be created from a relational model quite easily
 Supported in Oracle Database
 To me the most interesting of these NoSQL solutions….
 But worth a presentation of its own
Graph stores
Copyright © Miracle Finland Oy
 Slowly these features have been added to the
RDBMS…
 But can this continue?
 New feature-> new db-> added to RDBMS
NoSQL
Copyright © Miracle Finland Oy
Key Findings
 Operational DBMS use cases have expanded beyond traditional transactions
— still the leading revenue generator — to encompass three additional use
cases: distributed variable data; lightweight events and observations; and
hybrid transactional/analytical processing (HTAP).
 Relational operational DBMS products and emerging NoSQL and multimodel
offerings are entering the market to address some of these new use cases,
creating opportunities for best-fit engineering, but threatening established
company standards for information management.
 Incumbent vendors are responding by expanding the capabilities of their
relational database management systems (RDBMSs) into multimodel
territory, but hedging their bets by also adding specialized products to their
portfolios, creating both multimodel and multiproduct choices — and
confusion in the minds of buyers.
Gartner, Critical Capabilities for Operational Database
Management Systems, Dec 9th 2015
Copyright © Miracle Finland Oy
Recommendations
 Identify the use cases you have under consideration, and assess your current
operational DBMS products' fit, costs, deployment options and skills
requirements.
 Evaluate potential alternative product selections based on best-fit
engineering that maps to your use cases — don't pay for features you don't
and will not need. Use this Critical Capabilities document alongside its
companion Magic Quadrant to identify candidates.
 Issue RFPs (using Gartner's RFP Toolkit templates) and conduct proof of
concept (POC) exercises to confirm and validate candidates.
 Update your standards, training and support organizations to manage, qualify
and accommodate additional requirements if your technology portfolio is
expanding.
Gartner, Critical Capabilities for Operational Database
Management Systems, Dec 9th 2015
Copyright © Miracle Finland Oy
Strategic Planning Assumptions
 By 2017, all leading operational DBMSs will offer multiple data
models, relational and NoSQL, in a single DBMS platform.
 Through 2018, 50% of small operational DBMS vendors will
disappear due to acquisitions, mergers or business failures.
 By 2017, the "NoSQL" label will cease to distinguish DBMSs,
reducing its value and resulting in the label falling out of use.
Gartner, Critical Capabilities for Operational Database
Management Systems, Dec 9th 2015
Copyright © Miracle Finland Oy
 Brings even more than just the need for new data
models…
The Big Data…
Copyright © Miracle Finland Oy
 There is no size that makes a data to be ”Big Data”, it
always depends on the capabilities
 The data is ”Big” when traditional processing with
traditional tools is not possible due to the amount or
the complexity of the data
 You cannot open an attachement in email
 You cannot edit a photo
 etc.
What is Big Data?
Copyright © Miracle Finland Oy
 Volume, the size/scale of the data
 Velocity, the speed of change, analysis of streaming
data
 Variety, different formats of data sources, different
forms of data; structured, semi-structured,
unstructured
The three V’s
Copyright © Miracle Finland Oy
 Veracity, the uncertainty of the data, the data is worthless or
harmful if it’s not accurate
 Viability, validate that hypothesis before taking further action
(and, in the process of determining the viability of a variable,
we can expand our view to determine other variables)
 Value, the potential value
 Variability, refers to data whose meaning is constantly
changing, in consistency of data; for example words and
context
 Visualization, a way of presenting the data in a manner that’s
readable and accessible
The other V’s
Copyright © Miracle Finland Oy
 Are easy to understand and there are plenty of
technical solutions to solve the problems but…
Volume and Velocity
Copyright © Miracle Finland Oy
Variety leads to…
 -> diversity, complexity, …
 Users can not query all of their data using a single high level
declarative query language. They have to
 use different query languages for querying different data
 implement their own join algorithms to join between
relational data and FSD
 Also many specialized systems lack essential advanced
functionalities, such as ACID and fine grain security, that
are the “norm” in RDBMSs
Copyright © Miracle Finland Oy
Data Exploration
 Data exploration is to efficiently extract knowledge from
data, even though you do not know what you are looking
for.
 Exploratory computing: a conversation between a user and
a computer.
 The system provides active support
 The exploration process is investigation, exploration-
seeking, comparison-making, and learning
 Wide range of different techniques is needed (the use of
statistics, data analysis, query suggestion, advanced
visualization tools, etc.)
Copyright © Miracle Finland Oy
Not a new thing…
 J. W. Tukey. Exploratory data analysis. Addison-
Wesley, Reading,MA, 1977:
 “with exploratory data analysis the researcher explores
the data in many possible ways, including the use of
graphical tools like boxplots or histograms, gaining
knowledge from the way data are displayed”
Copyright © Miracle Finland Oy
Old problems and new problems
 More and more data (volume)
 Different data models and formats (variety)
 Loading in progress while data exploration going on
(velocity)
 Not all data is reliable (veracity)
 We do not know what we are looking for (value, viability,
variability)
 Must support also non-technical users (journalists,
investors, politicians,…) (visualization)
 All must be done efficiently and fast
Copyright © Miracle Finland Oy
Different Layers
 the user interaction/user interface
 the middleware
 the database layer/engine
Copyright © Miracle Finland Oy
Challenges for the User Interaction
 The user interface should help the user to find information
she/he was not aware of and to find even more information
based on the information already found.
 The data is in different formats, it is a large amount of
different kinds of data and analyzing it takes time.
 The conversation must be fluently responsive.
 Not all users are IT professionals
 The query result visualization to understand the data and to
communicate with the exploration software would be
valuable
Copyright © Miracle Finland Oy
User Interaction,
Query Visualization
 The role of visualization for data analytics is huge (a
picture tells more than 1000 words)
 visualization tools that assist users in navigating the
underlying data structures
 tools that incorporate new types of interactions such
as collaborative annotations and searches
 Query result reduction (aggregation, sampling and/or
filtering operations)
 Query result visualization
Copyright © Miracle Finland Oy
User Interaction,
User Involvement
 the relevance of an answer for a specific user
 predict users’ interests and anticipations in order to issue
the most relevant answer
 Some possible techniques
 Grouping the users (subgroups)
 classify different notions of relevance
 devise strategies to customize the system response to
the user behavior
Copyright © Miracle Finland Oy
Middleware
 Building various optimizations on top of the database
layer/engine
 Can be used to improve the efficiency of the data
exploration without changing the underlying
architecture
 The exploration tasks are computationally heavy and
can be speedup in the middleware layer with
 Data Prefetching (and background execution)
 Query Approximation
Copyright © Miracle Finland Oy
Middleware, Data Prefetching
 caching data sets which are likely to be used
 the trade-off between introducing new results and
re-using cached ones
 the main challenge is identifying the data set with
the highest utility
Copyright © Miracle Finland Oy
Middleware, Query Approximation
 The system offers approximate answers
 Allows users to get a quick sense of whether a
particular query reveals anything interesting about
the data
 Need solutions that process queries on sampled
data sets to provide fast query response times
 the trade-off between results accuracy and query
performance
Copyright © Miracle Finland Oy
Database Engine/Layer, Solutions
1. Adaptive Indexing
2. Adaptive Loading
3. Adaptive Storage
4. Flexible Architecture
5. Architectures tailored for approximation processing
Copyright © Miracle Finland Oy
Adaptive Indexing
 creating indexes incrementally and adaptively during
query processing based on the columns, tables and
value ranges that queries request
 Indexes are built gradually; as more queries arrive
indexes are continuously fine-tuned
Copyright © Miracle Finland Oy
Adaptive Loading
 Not all data is needed; users might want to leave
some parts of the data unloaded
 Not all data is available; users might start querying
a database system before all data is loaded
Copyright © Miracle Finland Oy
Adaptive Storage
 There is no one perfect solution for storage layout for all
the data.
 Adaptive Storage can be used for different needs of
storage layouts
 Examples
 OctopusDB (a central log and Storage Views)
 H2O (supports multiple storage layouts and data access
patterns, decides during query processing, which design is
best for classes of queries)
 In-memory columnar stores (for example Oracle)
 real-time materialized views (Oracle 12c rel 2)
Copyright © Miracle Finland Oy
Flexible Architecture
 can tune the architecture to the task at hand
 by having a declarative interface for the physical data
layouts
 by having a declarative interface for the whole engine
 using organic databases to continuously match incoming
data and queries
 Start with a schema that is ”good-enough”, schema
evolution while loading and querying
Copyright © Miracle Finland Oy
Architectures tailored for
approximation processing
 An architecture where storage and access patterns are
tailored to efficiently support sampling-based query
processing
 efficient access and updates of sampled data sets
 In the DB core
 For example Oracle Database 12c: Adaptive Dynamic
Sampling (for the optimizer)
 DICE (Distributed and Interactive Cube Exploration) faceted
exploration of data cubes; combines sampling with
speculative execution in the same engine
 Design new engines is one of the open challenges.
Copyright © Miracle Finland Oy
Techniques for Large Data Size
 Summarizing: fully pre-computing is not possible and not smart
 The size of data summarized must be large enough to estimate
statistical parameters and distributions, but manageable from the
computation viewpoint
 Data mining techniques
 Histograms, presenting a histogram algebra for efficiently
executing queries on the histograms instead of on the data,
and estimating the quality of the approximate answers
 Effective pruning techniques, not all node-paths are interesting
from the user viewpoint, and the ones that fail to satisfy this
requirement should be discarded as soon as possible.
Copyright © Miracle Finland Oy
 Two possibilities:
 Polyglot persistence
 Multimodel database
Different Data Models
Copyright © Miracle Finland Oy
 Different kinds of data models are best dealt with
different data stores
 Data is stored using multiple data storage
technologies to support better the usage of the data
(applications, components)
Polyglot Persistence
Copyright © Miracle Finland Oy
 Even using Oracle technology you can build a polyglot
solution:
 Berkeley DB as a Key-Value store
 Oracle NoSQL Database as a Key-Value and sharded
database
 OracleTimesTen as an In-Memory Database
 Essbase for analytic processing
 …
Oracle and Polyglot Persistence
Copyright © Miracle Finland Oy
 Different query languages
 Different frontends/user interfaces
 How to join data?
 Is data consistents?
 Is data secured?
 Many different skills needed
 …
What’s wrong with polyglot
Copyright © Miracle Finland Oy
Polyglot vs. Multimodel
Copyright © Miracle Finland Oy
 A multi-model database is designed to support
multiple data models against a single, integrated
backend
 One query language
 All data available
 ACID (or any level needed), security, backup/recovery,
…
 All the ”good” from relational added with all the
”good” from everywhere else
Multi-model database
Copyright © Miracle Finland Oy
How to save and how to retrive the
data?
Copyright © Miracle Finland Oy
Database Engine/Layer
 A database is all about saving and retrieving data
 The way the data is stored defines the best possible
ways of accessing it, the way data is accessed defines
the best way to store it…
 We might not know what data we will be loading
 We do not know what we are looking for
-> therefore are unable to define the ”best” way of storing
 how can the architecture of database systems be
redesigned to aid data exploration tasks?
Copyright © Miracle Finland Oy
 Efficiently, Reliably
 Understanding the data while saving
 finding the PK to be able to join to other data
 Taxonomies to understand that first name = fname?
 Metadata Management
 …
 Categorizing
 Some data is more valuable than other
 some data is more reliable than other
 not all the data want to be stored in the same, operational database
 data classification (reliable/no reliable)
Saving
Copyright © Miracle Finland Oy
 The data need to be classified before storing and the
qualification might affect where and on which format the
data is stored.
 Maybe transforming?
 Usually the disk space in RDBMS is expensive
 -> new ways to store the data (Hadoop,…)
 What data to save close to what and why
 todays spam is tomorrows ham (possibility to re-
categorize)
 How to handle date datatype?
 …
Saving
Copyright © Miracle Finland Oy
 Quite often the solution for saving a new data model
is CLOB, BLOB or external table
 But we need more than that…
Oracle Database
Copyright © Miracle Finland Oy
Interesting solutions in Oracle
Database
Copyright © Miracle Finland Oy
 save as it is, use it with SQL
 Now only JSON, something more in the future?
 “write without schema, read with schema”
JSON DataGuide in Oracle Database
12cR2
Copyright © Miracle Finland Oy
 Enabling FSDM (Flexible Schema Data Management) in a
RDBMS
 Storing JSON objects as aggregated, self-described entities
without shredding them into relational rows and columns
 Indexing JSON using a schema-agnostic strategy to support
ad-hoc queries that search both schema and values
together
 Querying JSON using SQL as the inter-document query
language and SQL/JSON path language as the intra-
document query language
JSON DataGuide in Oracle Database
12cR2 and on
Copyright © Miracle Finland Oy
 DataGuide information is part of the JSON search index
infrastructure
 CREATE SEARCH INDEX FOR JSON
 Can be created for index only, DataGuide only, both (the
default is both)
 database privilege CTXAPP needed
 The content of DataGuide is automatically updated
whenever the index is synchronized.
 To obtain the DataGuide information stored in a JSON
search index use PL/SQL function
DBMS_JSON.get_index_dataguide
Creating a DataGuide, Persistent
Copyright © Miracle Finland Oy
 When the index is synced, additive updates to the document
set are automatically reflected in the persisted DataGuide
information
 Deletions are not updated
 you must create the index again if the data of deletion is needed
 can also include statistics (how frequently each JSON field is
used in the document set)
 Explicitly gathered (EXEC
DBMS_STATS.gather_index_stats(username,
'json_doc_search_idx', estimate_percent => 90);)
 Not updated automatically
JSON search index and Data Guide
Copyright © Miracle Finland Oy
 There are two formats for a DataGuide (both defined as
CLOB):
 Flat
 to query DataGuide information such as field frequencies and
types
 Hierarchical
 to add virtual columns, or to create a view using particular fields
that you choose on the basis of DataGuide information
 (DBMS.JSON.FORMAT_FLAT or
DBMS.JSON.FORMAT_HIERARCHICAL)
Formats of DataGuide
Copyright © Miracle Finland Oy
 Based on DataGuide information for a JSON column, you
can project scalar fields as virtual columns to the same
table that contains this JSON column
 From
 hierarchical DataGuide or
 DataGuide enabled JSON search index
 DBMS_JSON.add_virtual_columns ,
DBMS_JSON.drop_virtual_columns,
DBMS_JSON.rename_column
 can decide which columns should be projected based on the
frequency of the elements in the documents stored in the
JSON column (frequency => )
DataGuide and a Virtual Column
Copyright © Miracle Finland Oy
 To improve the performance you can
 Build an index on it
 Gather statistics on it for the optimizer
 Load it into the In-Memory Column Store
DataGuide and a Virtual Column
Copyright © Miracle Finland Oy
 Relational views on top of a set of JSON files for
querying the data as it were relational
 still only saved once and as a JSON document
 You can create multiple views based on the same
JSON document set, projecting different fields
JSON DataGuide, views
Copyright © Miracle Finland Oy
 Based on a Hierarchical DataGuide
 The fields projected are those in the DataGuide
 You can edit the DataGuide to include only the fields that you want
to project.
 DBMS_JSON.create_view, DBMS_JSON. create_view_on_path
(create the view based on the frequency of the elements)
 Based on a Path Expression
 You can use the information in a DataGuide enabled JSON search
index to create a database view
 Columns project JSON fields from your documents
 The fields projected are the scalar fields and the scalar fields in the
data targeted by a specified SQL/JSON path expression.
Creating views over JSON data
Copyright © Miracle Finland Oy
CREATE TABLE json_documents (
id RAW(16) NOT NULL,
data CLOB,
CONSTRAINT json_documents_pk PRIMARY KEY (id),
CONSTRAINT json_documents_json_chk CHECK (data IS
JSON)
);
Creating views over JSON data, the
table
Copyright © Miracle Finland Oy
BEGIN
DBMS_JSON.create_view_on_path(
viewname => 'json_documents_view',
tablename => 'json_documents',
jcolname => 'data',
path => '$',
frequency => 50);
END;
/
Creating views over JSON data
Copyright © Miracle Finland Oy
DESC json_documents_view
Name Null? Type
----------------------------------------- -------- ----------------------------
ID NOT NULL RAW(16)
FIRST_NAME VARCHAR2(30)
LAST_NAME VARCHAR2(30)
STREET VARCHAR2(50)
POSTCODE VARCHAR2(10)
CITY VARCHAR2(30)
COUNTRY VARCHAR2(100)
EMAIL VARCHAR2(100)
PHONE VARCHAR2(20)
Creating views over JSON data, the
view
Copyright © Miracle Finland Oy
 It enables FSDM (Flexible Schema Data Management)
in a RDBMS
 And as a solution might be used in other cases for
different semi-structured data models in the future…
Why DataGuide is interesting?
Copyright © Miracle Finland Oy
 Save as it is, copy to a faster data model kept in-
memory
 Now only columnar store
In-Memory
Copyright © Miracle Finland Oy
 Very interesting for retrieving data…
 Saving data as its original data model and showing a copy of it
on an in-memory version of another data model when needed
 Original data as it is and a copy of it created in another data
model when needed
 Fast (relative term ☺ )
 Synchronized automatically
 No additional storage needed
 The version of a data model is used that works best for the need
 Maybe other data models could be stored in-memory too?
Graph?
In-Memory
Copyright © Miracle Finland Oy
 So many data models
 The network data model
 The hierarchical data model
 The relational data model
 Objects
 XML
 Spatial
 NoSQL models
 …
 Flexible Schema Data (FSD): ”data first, schema
later/never”
Conclusion
Copyright © Miracle Finland Oy
 Several V’s related to Big Data…
 Volume
 Velocity
 Variety
 Veracity
 Viability
 Value
 Variability
 Visualization
 …
Conclusion
Copyright © Miracle Finland Oy
Conclusion
 Data exploration is about finding new information and
knowledge from the data without knowing what you
are looking for. And it brings a lot of pressure for the
performance…in all levels:
 the user interaction/user interface
 the middleware
 the database layer/engine
Copyright © Miracle Finland Oy
Conclusion
 Various data models, solutions
 Polyglot
 Multimodel database
 Promising solutions in Oracle Database
 JSON DataGuide in Oracle Database 12cR2
 In-Memory
 Plenty of research going on in this area and results
needed soon…
THANK YOU!
QUESTIONS?
Email: heli@miracleoy.fi
Twitter: @HeliFromFinland
Blog: Helifromfinland.com

More Related Content

What's hot

Data Virtualization: From Zero to Hero
Data Virtualization: From Zero to HeroData Virtualization: From Zero to Hero
Data Virtualization: From Zero to Hero
Denodo
 
Data Integration Alternatives: When to use Data Virtualization, ETL, and ESB
Data Integration Alternatives: When to use Data Virtualization, ETL, and ESBData Integration Alternatives: When to use Data Virtualization, ETL, and ESB
Data Integration Alternatives: When to use Data Virtualization, ETL, and ESB
Denodo
 
Data Warehouse Logical Design Guide
Data Warehouse Logical Design GuideData Warehouse Logical Design Guide
Data Warehouse Logical Design Guide
Andy Yuan
 
Technical Demonstration - Denodo Platform 7.0
Technical Demonstration - Denodo Platform 7.0Technical Demonstration - Denodo Platform 7.0
Technical Demonstration - Denodo Platform 7.0
Denodo
 
Logical Data Fabric: Architectural Components
Logical Data Fabric: Architectural ComponentsLogical Data Fabric: Architectural Components
Logical Data Fabric: Architectural Components
Denodo
 
Extended Data Warehouse - A New Data Architecture for Modern BI with Claudia ...
Extended Data Warehouse - A New Data Architecture for Modern BI with Claudia ...Extended Data Warehouse - A New Data Architecture for Modern BI with Claudia ...
Extended Data Warehouse - A New Data Architecture for Modern BI with Claudia ...
Denodo
 
Data management
Data managementData management
Data management
RahulJoshi975765
 
A Big Data Journey
A Big Data JourneyA Big Data Journey
A Big Data Journey
Paul Boal
 
Performance Acceleration: Summaries, Recommendation, MPP and more
Performance Acceleration: Summaries, Recommendation, MPP and morePerformance Acceleration: Summaries, Recommendation, MPP and more
Performance Acceleration: Summaries, Recommendation, MPP and more
Denodo
 
Maximizing Data Lake ROI with Data Virtualization: A Technical Demonstration
Maximizing Data Lake ROI with Data Virtualization: A Technical DemonstrationMaximizing Data Lake ROI with Data Virtualization: A Technical Demonstration
Maximizing Data Lake ROI with Data Virtualization: A Technical Demonstration
Denodo
 
Flash session -streaming--ses1243-lon
Flash session -streaming--ses1243-lonFlash session -streaming--ses1243-lon
Flash session -streaming--ses1243-lon
Jeffrey T. Pollock
 
Future of Data Strategy
Future of Data StrategyFuture of Data Strategy
Future of Data Strategy
Denodo
 
Analyst View of Data Virtualization: Conversations with Boulder Business Inte...
Analyst View of Data Virtualization: Conversations with Boulder Business Inte...Analyst View of Data Virtualization: Conversations with Boulder Business Inte...
Analyst View of Data Virtualization: Conversations with Boulder Business Inte...
Denodo
 
Why Data Virtualization? An Introduction by Denodo
Why Data Virtualization? An Introduction by DenodoWhy Data Virtualization? An Introduction by Denodo
Why Data Virtualization? An Introduction by Denodo
Justo Hidalgo
 
Ten Pillars of World Class Data Virtualization
Ten Pillars of World Class Data VirtualizationTen Pillars of World Class Data Virtualization
Ten Pillars of World Class Data Virtualization
Denodo
 
The Need to Know for Information Architects: Big Data to Big Information
The Need to Know for Information Architects: Big Data to Big InformationThe Need to Know for Information Architects: Big Data to Big Information
The Need to Know for Information Architects: Big Data to Big InformationDATAVERSITY
 
Big MDM Part 2: Using a Graph Database for MDM and Relationship Management
Big MDM Part 2: Using a Graph Database for MDM and Relationship ManagementBig MDM Part 2: Using a Graph Database for MDM and Relationship Management
Big MDM Part 2: Using a Graph Database for MDM and Relationship Management
Caserta
 
Supporting Data Services Marketplace using Data Virtualization
Supporting Data Services Marketplace using Data VirtualizationSupporting Data Services Marketplace using Data Virtualization
Supporting Data Services Marketplace using Data Virtualization
Denodo
 
Denodo DataFest 2016: What’s New in Denodo Platform – Demo and Roadmap
Denodo DataFest 2016: What’s New in Denodo Platform – Demo and RoadmapDenodo DataFest 2016: What’s New in Denodo Platform – Demo and Roadmap
Denodo DataFest 2016: What’s New in Denodo Platform – Demo and Roadmap
Denodo
 
Data Modeling Presentations I
Data Modeling Presentations IData Modeling Presentations I
Data Modeling Presentations Icd_crisci
 

What's hot (20)

Data Virtualization: From Zero to Hero
Data Virtualization: From Zero to HeroData Virtualization: From Zero to Hero
Data Virtualization: From Zero to Hero
 
Data Integration Alternatives: When to use Data Virtualization, ETL, and ESB
Data Integration Alternatives: When to use Data Virtualization, ETL, and ESBData Integration Alternatives: When to use Data Virtualization, ETL, and ESB
Data Integration Alternatives: When to use Data Virtualization, ETL, and ESB
 
Data Warehouse Logical Design Guide
Data Warehouse Logical Design GuideData Warehouse Logical Design Guide
Data Warehouse Logical Design Guide
 
Technical Demonstration - Denodo Platform 7.0
Technical Demonstration - Denodo Platform 7.0Technical Demonstration - Denodo Platform 7.0
Technical Demonstration - Denodo Platform 7.0
 
Logical Data Fabric: Architectural Components
Logical Data Fabric: Architectural ComponentsLogical Data Fabric: Architectural Components
Logical Data Fabric: Architectural Components
 
Extended Data Warehouse - A New Data Architecture for Modern BI with Claudia ...
Extended Data Warehouse - A New Data Architecture for Modern BI with Claudia ...Extended Data Warehouse - A New Data Architecture for Modern BI with Claudia ...
Extended Data Warehouse - A New Data Architecture for Modern BI with Claudia ...
 
Data management
Data managementData management
Data management
 
A Big Data Journey
A Big Data JourneyA Big Data Journey
A Big Data Journey
 
Performance Acceleration: Summaries, Recommendation, MPP and more
Performance Acceleration: Summaries, Recommendation, MPP and morePerformance Acceleration: Summaries, Recommendation, MPP and more
Performance Acceleration: Summaries, Recommendation, MPP and more
 
Maximizing Data Lake ROI with Data Virtualization: A Technical Demonstration
Maximizing Data Lake ROI with Data Virtualization: A Technical DemonstrationMaximizing Data Lake ROI with Data Virtualization: A Technical Demonstration
Maximizing Data Lake ROI with Data Virtualization: A Technical Demonstration
 
Flash session -streaming--ses1243-lon
Flash session -streaming--ses1243-lonFlash session -streaming--ses1243-lon
Flash session -streaming--ses1243-lon
 
Future of Data Strategy
Future of Data StrategyFuture of Data Strategy
Future of Data Strategy
 
Analyst View of Data Virtualization: Conversations with Boulder Business Inte...
Analyst View of Data Virtualization: Conversations with Boulder Business Inte...Analyst View of Data Virtualization: Conversations with Boulder Business Inte...
Analyst View of Data Virtualization: Conversations with Boulder Business Inte...
 
Why Data Virtualization? An Introduction by Denodo
Why Data Virtualization? An Introduction by DenodoWhy Data Virtualization? An Introduction by Denodo
Why Data Virtualization? An Introduction by Denodo
 
Ten Pillars of World Class Data Virtualization
Ten Pillars of World Class Data VirtualizationTen Pillars of World Class Data Virtualization
Ten Pillars of World Class Data Virtualization
 
The Need to Know for Information Architects: Big Data to Big Information
The Need to Know for Information Architects: Big Data to Big InformationThe Need to Know for Information Architects: Big Data to Big Information
The Need to Know for Information Architects: Big Data to Big Information
 
Big MDM Part 2: Using a Graph Database for MDM and Relationship Management
Big MDM Part 2: Using a Graph Database for MDM and Relationship ManagementBig MDM Part 2: Using a Graph Database for MDM and Relationship Management
Big MDM Part 2: Using a Graph Database for MDM and Relationship Management
 
Supporting Data Services Marketplace using Data Virtualization
Supporting Data Services Marketplace using Data VirtualizationSupporting Data Services Marketplace using Data Virtualization
Supporting Data Services Marketplace using Data Virtualization
 
Denodo DataFest 2016: What’s New in Denodo Platform – Demo and Roadmap
Denodo DataFest 2016: What’s New in Denodo Platform – Demo and RoadmapDenodo DataFest 2016: What’s New in Denodo Platform – Demo and Roadmap
Denodo DataFest 2016: What’s New in Denodo Platform – Demo and Roadmap
 
Data Modeling Presentations I
Data Modeling Presentations IData Modeling Presentations I
Data Modeling Presentations I
 

Similar to [db tech showcase Tokyo 2018] #dbts2018 #B38 『Big Data and the Multi-model Database』

[db tech showcase Tokyo 2018] #dbts2018 #B36 『Design Your Databases straight ...
[db tech showcase Tokyo 2018] #dbts2018 #B36 『Design Your Databases straight ...[db tech showcase Tokyo 2018] #dbts2018 #B36 『Design Your Databases straight ...
[db tech showcase Tokyo 2018] #dbts2018 #B36 『Design Your Databases straight ...
Insight Technology, Inc.
 
Master Meta Data
Master Meta DataMaster Meta Data
Master Meta Data
Digikrit
 
How to Swiftly Operationalize the Data Lake for Advanced Analytics Using a Lo...
How to Swiftly Operationalize the Data Lake for Advanced Analytics Using a Lo...How to Swiftly Operationalize the Data Lake for Advanced Analytics Using a Lo...
How to Swiftly Operationalize the Data Lake for Advanced Analytics Using a Lo...
Denodo
 
Big Data: Its Characteristics And Architecture Capabilities
Big Data: Its Characteristics And Architecture CapabilitiesBig Data: Its Characteristics And Architecture Capabilities
Big Data: Its Characteristics And Architecture Capabilities
Ashraf Uddin
 
Modern Data Management for Federal Modernization
Modern Data Management for Federal ModernizationModern Data Management for Federal Modernization
Modern Data Management for Federal Modernization
Denodo
 
Dr. Christian Kurze from Denodo, "Data Virtualization: Fulfilling the Promise...
Dr. Christian Kurze from Denodo, "Data Virtualization: Fulfilling the Promise...Dr. Christian Kurze from Denodo, "Data Virtualization: Fulfilling the Promise...
Dr. Christian Kurze from Denodo, "Data Virtualization: Fulfilling the Promise...
Dataconomy Media
 
Insights into Real-world Data Management Challenges
Insights into Real-world Data Management ChallengesInsights into Real-world Data Management Challenges
Insights into Real-world Data Management Challenges
DataWorks Summit
 
Data Ninja Webinar Series: Realizing the Promise of Data Lakes
Data Ninja Webinar Series: Realizing the Promise of Data LakesData Ninja Webinar Series: Realizing the Promise of Data Lakes
Data Ninja Webinar Series: Realizing the Promise of Data Lakes
Denodo
 
Insights into Real World Data Management Challenges
Insights into Real World Data Management ChallengesInsights into Real World Data Management Challenges
Insights into Real World Data Management Challenges
DataWorks Summit
 
Unlock Your Data for ML & AI using Data Virtualization
Unlock Your Data for ML & AI using Data VirtualizationUnlock Your Data for ML & AI using Data Virtualization
Unlock Your Data for ML & AI using Data Virtualization
Denodo
 
Extreme SSAS- SQL 2011
Extreme SSAS- SQL 2011Extreme SSAS- SQL 2011
Extreme SSAS- SQL 2011Itay Braun
 
Oracle databáze – Konsolidovaná Data Management Platforma
Oracle databáze – Konsolidovaná Data Management PlatformaOracle databáze – Konsolidovaná Data Management Platforma
Oracle databáze – Konsolidovaná Data Management Platforma
MarketingArrowECS_CZ
 
Fast Data Strategy Houston Roadshow Presentation
Fast Data Strategy Houston Roadshow PresentationFast Data Strategy Houston Roadshow Presentation
Fast Data Strategy Houston Roadshow Presentation
Denodo
 
Benefits of a data lake
Benefits of a data lake Benefits of a data lake
Benefits of a data lake
Sun Technologies
 
Big Data using NoSQL Technologies
Big Data using NoSQL TechnologiesBig Data using NoSQL Technologies
Big Data using NoSQL Technologies
Amit Singh
 
Exploiting Data Lakes: Architecture, Capabilities & Future
Exploiting Data Lakes: Architecture, Capabilities & FutureExploiting Data Lakes: Architecture, Capabilities & Future
Exploiting Data Lakes: Architecture, Capabilities & Future
Agilisium Consulting
 
Data Virtualization. An Introduction (ASEAN)
Data Virtualization. An Introduction (ASEAN)Data Virtualization. An Introduction (ASEAN)
Data Virtualization. An Introduction (ASEAN)
Denodo
 
Augmentation, Collaboration, Governance: Defining the Future of Self-Service BI
Augmentation, Collaboration, Governance: Defining the Future of Self-Service BIAugmentation, Collaboration, Governance: Defining the Future of Self-Service BI
Augmentation, Collaboration, Governance: Defining the Future of Self-Service BI
Denodo
 
Choosing technologies for a big data solution in the cloud
Choosing technologies for a big data solution in the cloudChoosing technologies for a big data solution in the cloud
Choosing technologies for a big data solution in the cloud
James Serra
 
Jak konsolidovat Vaše databáze s využitím Cloud služeb?
Jak konsolidovat Vaše databáze s využitím Cloud služeb?Jak konsolidovat Vaše databáze s využitím Cloud služeb?
Jak konsolidovat Vaše databáze s využitím Cloud služeb?
MarketingArrowECS_CZ
 

Similar to [db tech showcase Tokyo 2018] #dbts2018 #B38 『Big Data and the Multi-model Database』 (20)

[db tech showcase Tokyo 2018] #dbts2018 #B36 『Design Your Databases straight ...
[db tech showcase Tokyo 2018] #dbts2018 #B36 『Design Your Databases straight ...[db tech showcase Tokyo 2018] #dbts2018 #B36 『Design Your Databases straight ...
[db tech showcase Tokyo 2018] #dbts2018 #B36 『Design Your Databases straight ...
 
Master Meta Data
Master Meta DataMaster Meta Data
Master Meta Data
 
How to Swiftly Operationalize the Data Lake for Advanced Analytics Using a Lo...
How to Swiftly Operationalize the Data Lake for Advanced Analytics Using a Lo...How to Swiftly Operationalize the Data Lake for Advanced Analytics Using a Lo...
How to Swiftly Operationalize the Data Lake for Advanced Analytics Using a Lo...
 
Big Data: Its Characteristics And Architecture Capabilities
Big Data: Its Characteristics And Architecture CapabilitiesBig Data: Its Characteristics And Architecture Capabilities
Big Data: Its Characteristics And Architecture Capabilities
 
Modern Data Management for Federal Modernization
Modern Data Management for Federal ModernizationModern Data Management for Federal Modernization
Modern Data Management for Federal Modernization
 
Dr. Christian Kurze from Denodo, "Data Virtualization: Fulfilling the Promise...
Dr. Christian Kurze from Denodo, "Data Virtualization: Fulfilling the Promise...Dr. Christian Kurze from Denodo, "Data Virtualization: Fulfilling the Promise...
Dr. Christian Kurze from Denodo, "Data Virtualization: Fulfilling the Promise...
 
Insights into Real-world Data Management Challenges
Insights into Real-world Data Management ChallengesInsights into Real-world Data Management Challenges
Insights into Real-world Data Management Challenges
 
Data Ninja Webinar Series: Realizing the Promise of Data Lakes
Data Ninja Webinar Series: Realizing the Promise of Data LakesData Ninja Webinar Series: Realizing the Promise of Data Lakes
Data Ninja Webinar Series: Realizing the Promise of Data Lakes
 
Insights into Real World Data Management Challenges
Insights into Real World Data Management ChallengesInsights into Real World Data Management Challenges
Insights into Real World Data Management Challenges
 
Unlock Your Data for ML & AI using Data Virtualization
Unlock Your Data for ML & AI using Data VirtualizationUnlock Your Data for ML & AI using Data Virtualization
Unlock Your Data for ML & AI using Data Virtualization
 
Extreme SSAS- SQL 2011
Extreme SSAS- SQL 2011Extreme SSAS- SQL 2011
Extreme SSAS- SQL 2011
 
Oracle databáze – Konsolidovaná Data Management Platforma
Oracle databáze – Konsolidovaná Data Management PlatformaOracle databáze – Konsolidovaná Data Management Platforma
Oracle databáze – Konsolidovaná Data Management Platforma
 
Fast Data Strategy Houston Roadshow Presentation
Fast Data Strategy Houston Roadshow PresentationFast Data Strategy Houston Roadshow Presentation
Fast Data Strategy Houston Roadshow Presentation
 
Benefits of a data lake
Benefits of a data lake Benefits of a data lake
Benefits of a data lake
 
Big Data using NoSQL Technologies
Big Data using NoSQL TechnologiesBig Data using NoSQL Technologies
Big Data using NoSQL Technologies
 
Exploiting Data Lakes: Architecture, Capabilities & Future
Exploiting Data Lakes: Architecture, Capabilities & FutureExploiting Data Lakes: Architecture, Capabilities & Future
Exploiting Data Lakes: Architecture, Capabilities & Future
 
Data Virtualization. An Introduction (ASEAN)
Data Virtualization. An Introduction (ASEAN)Data Virtualization. An Introduction (ASEAN)
Data Virtualization. An Introduction (ASEAN)
 
Augmentation, Collaboration, Governance: Defining the Future of Self-Service BI
Augmentation, Collaboration, Governance: Defining the Future of Self-Service BIAugmentation, Collaboration, Governance: Defining the Future of Self-Service BI
Augmentation, Collaboration, Governance: Defining the Future of Self-Service BI
 
Choosing technologies for a big data solution in the cloud
Choosing technologies for a big data solution in the cloudChoosing technologies for a big data solution in the cloud
Choosing technologies for a big data solution in the cloud
 
Jak konsolidovat Vaše databáze s využitím Cloud služeb?
Jak konsolidovat Vaše databáze s využitím Cloud služeb?Jak konsolidovat Vaše databáze s využitím Cloud služeb?
Jak konsolidovat Vaše databáze s využitím Cloud služeb?
 

More from Insight Technology, Inc.

グラフデータベースは如何に自然言語を理解するか?
グラフデータベースは如何に自然言語を理解するか?グラフデータベースは如何に自然言語を理解するか?
グラフデータベースは如何に自然言語を理解するか?
Insight Technology, Inc.
 
Docker and the Oracle Database
Docker and the Oracle DatabaseDocker and the Oracle Database
Docker and the Oracle Database
Insight Technology, Inc.
 
Great performance at scale~次期PostgreSQL12のパーティショニング性能の実力に迫る~
Great performance at scale~次期PostgreSQL12のパーティショニング性能の実力に迫る~Great performance at scale~次期PostgreSQL12のパーティショニング性能の実力に迫る~
Great performance at scale~次期PostgreSQL12のパーティショニング性能の実力に迫る~
Insight Technology, Inc.
 
事例を通じて機械学習とは何かを説明する
事例を通じて機械学習とは何かを説明する事例を通じて機械学習とは何かを説明する
事例を通じて機械学習とは何かを説明する
Insight Technology, Inc.
 
仮想通貨ウォレットアプリで理解するデータストアとしてのブロックチェーン
仮想通貨ウォレットアプリで理解するデータストアとしてのブロックチェーン仮想通貨ウォレットアプリで理解するデータストアとしてのブロックチェーン
仮想通貨ウォレットアプリで理解するデータストアとしてのブロックチェーン
Insight Technology, Inc.
 
MBAAで覚えるDBREの大事なおしごと
MBAAで覚えるDBREの大事なおしごとMBAAで覚えるDBREの大事なおしごと
MBAAで覚えるDBREの大事なおしごと
Insight Technology, Inc.
 
グラフデータベースは如何に自然言語を理解するか?
グラフデータベースは如何に自然言語を理解するか?グラフデータベースは如何に自然言語を理解するか?
グラフデータベースは如何に自然言語を理解するか?
Insight Technology, Inc.
 
DBREから始めるデータベースプラットフォーム
DBREから始めるデータベースプラットフォームDBREから始めるデータベースプラットフォーム
DBREから始めるデータベースプラットフォーム
Insight Technology, Inc.
 
SQL Server エンジニアのためのコンテナ入門
SQL Server エンジニアのためのコンテナ入門SQL Server エンジニアのためのコンテナ入門
SQL Server エンジニアのためのコンテナ入門
Insight Technology, Inc.
 
Lunch & Learn, AWS NoSQL Services
Lunch & Learn, AWS NoSQL ServicesLunch & Learn, AWS NoSQL Services
Lunch & Learn, AWS NoSQL Services
Insight Technology, Inc.
 
db tech showcase2019オープニングセッション @ 森田 俊哉
db tech showcase2019オープニングセッション @ 森田 俊哉 db tech showcase2019オープニングセッション @ 森田 俊哉
db tech showcase2019オープニングセッション @ 森田 俊哉
Insight Technology, Inc.
 
db tech showcase2019 オープニングセッション @ 石川 雅也
db tech showcase2019 オープニングセッション @ 石川 雅也db tech showcase2019 オープニングセッション @ 石川 雅也
db tech showcase2019 オープニングセッション @ 石川 雅也
Insight Technology, Inc.
 
db tech showcase2019 オープニングセッション @ マイナー・アレン・パーカー
db tech showcase2019 オープニングセッション @ マイナー・アレン・パーカー db tech showcase2019 オープニングセッション @ マイナー・アレン・パーカー
db tech showcase2019 オープニングセッション @ マイナー・アレン・パーカー
Insight Technology, Inc.
 
難しいアプリケーション移行、手軽に試してみませんか?
難しいアプリケーション移行、手軽に試してみませんか?難しいアプリケーション移行、手軽に試してみませんか?
難しいアプリケーション移行、手軽に試してみませんか?
Insight Technology, Inc.
 
Attunityのソリューションと異種データベース・クラウド移行事例のご紹介
Attunityのソリューションと異種データベース・クラウド移行事例のご紹介Attunityのソリューションと異種データベース・クラウド移行事例のご紹介
Attunityのソリューションと異種データベース・クラウド移行事例のご紹介
Insight Technology, Inc.
 
そのデータベース、クラウドで使ってみませんか?
そのデータベース、クラウドで使ってみませんか?そのデータベース、クラウドで使ってみませんか?
そのデータベース、クラウドで使ってみませんか?
Insight Technology, Inc.
 
コモディティサーバー3台で作る高速処理 “ハイパー・コンバージド・データベース・インフラストラクチャー(HCDI)” システム『Insight Qube』...
コモディティサーバー3台で作る高速処理 “ハイパー・コンバージド・データベース・インフラストラクチャー(HCDI)” システム『Insight Qube』...コモディティサーバー3台で作る高速処理 “ハイパー・コンバージド・データベース・インフラストラクチャー(HCDI)” システム『Insight Qube』...
コモディティサーバー3台で作る高速処理 “ハイパー・コンバージド・データベース・インフラストラクチャー(HCDI)” システム『Insight Qube』...
Insight Technology, Inc.
 
複数DBのバックアップ・切り戻し運用手順が異なって大変?!運用性の大幅改善、その先に。。
複数DBのバックアップ・切り戻し運用手順が異なって大変?!運用性の大幅改善、その先に。。 複数DBのバックアップ・切り戻し運用手順が異なって大変?!運用性の大幅改善、その先に。。
複数DBのバックアップ・切り戻し運用手順が異なって大変?!運用性の大幅改善、その先に。。
Insight Technology, Inc.
 
Attunity社のソリューションの日本国内外適用事例及びロードマップ紹介[ATTUNITY & インサイトテクノロジー IoT / Big Data フ...
Attunity社のソリューションの日本国内外適用事例及びロードマップ紹介[ATTUNITY & インサイトテクノロジー IoT / Big Data フ...Attunity社のソリューションの日本国内外適用事例及びロードマップ紹介[ATTUNITY & インサイトテクノロジー IoT / Big Data フ...
Attunity社のソリューションの日本国内外適用事例及びロードマップ紹介[ATTUNITY & インサイトテクノロジー IoT / Big Data フ...
Insight Technology, Inc.
 
レガシーに埋もれたデータをリアルタイムでクラウドへ [ATTUNITY & インサイトテクノロジー IoT / Big Data フォーラム 2018]
レガシーに埋もれたデータをリアルタイムでクラウドへ [ATTUNITY & インサイトテクノロジー IoT / Big Data フォーラム 2018]レガシーに埋もれたデータをリアルタイムでクラウドへ [ATTUNITY & インサイトテクノロジー IoT / Big Data フォーラム 2018]
レガシーに埋もれたデータをリアルタイムでクラウドへ [ATTUNITY & インサイトテクノロジー IoT / Big Data フォーラム 2018]
Insight Technology, Inc.
 

More from Insight Technology, Inc. (20)

グラフデータベースは如何に自然言語を理解するか?
グラフデータベースは如何に自然言語を理解するか?グラフデータベースは如何に自然言語を理解するか?
グラフデータベースは如何に自然言語を理解するか?
 
Docker and the Oracle Database
Docker and the Oracle DatabaseDocker and the Oracle Database
Docker and the Oracle Database
 
Great performance at scale~次期PostgreSQL12のパーティショニング性能の実力に迫る~
Great performance at scale~次期PostgreSQL12のパーティショニング性能の実力に迫る~Great performance at scale~次期PostgreSQL12のパーティショニング性能の実力に迫る~
Great performance at scale~次期PostgreSQL12のパーティショニング性能の実力に迫る~
 
事例を通じて機械学習とは何かを説明する
事例を通じて機械学習とは何かを説明する事例を通じて機械学習とは何かを説明する
事例を通じて機械学習とは何かを説明する
 
仮想通貨ウォレットアプリで理解するデータストアとしてのブロックチェーン
仮想通貨ウォレットアプリで理解するデータストアとしてのブロックチェーン仮想通貨ウォレットアプリで理解するデータストアとしてのブロックチェーン
仮想通貨ウォレットアプリで理解するデータストアとしてのブロックチェーン
 
MBAAで覚えるDBREの大事なおしごと
MBAAで覚えるDBREの大事なおしごとMBAAで覚えるDBREの大事なおしごと
MBAAで覚えるDBREの大事なおしごと
 
グラフデータベースは如何に自然言語を理解するか?
グラフデータベースは如何に自然言語を理解するか?グラフデータベースは如何に自然言語を理解するか?
グラフデータベースは如何に自然言語を理解するか?
 
DBREから始めるデータベースプラットフォーム
DBREから始めるデータベースプラットフォームDBREから始めるデータベースプラットフォーム
DBREから始めるデータベースプラットフォーム
 
SQL Server エンジニアのためのコンテナ入門
SQL Server エンジニアのためのコンテナ入門SQL Server エンジニアのためのコンテナ入門
SQL Server エンジニアのためのコンテナ入門
 
Lunch & Learn, AWS NoSQL Services
Lunch & Learn, AWS NoSQL ServicesLunch & Learn, AWS NoSQL Services
Lunch & Learn, AWS NoSQL Services
 
db tech showcase2019オープニングセッション @ 森田 俊哉
db tech showcase2019オープニングセッション @ 森田 俊哉 db tech showcase2019オープニングセッション @ 森田 俊哉
db tech showcase2019オープニングセッション @ 森田 俊哉
 
db tech showcase2019 オープニングセッション @ 石川 雅也
db tech showcase2019 オープニングセッション @ 石川 雅也db tech showcase2019 オープニングセッション @ 石川 雅也
db tech showcase2019 オープニングセッション @ 石川 雅也
 
db tech showcase2019 オープニングセッション @ マイナー・アレン・パーカー
db tech showcase2019 オープニングセッション @ マイナー・アレン・パーカー db tech showcase2019 オープニングセッション @ マイナー・アレン・パーカー
db tech showcase2019 オープニングセッション @ マイナー・アレン・パーカー
 
難しいアプリケーション移行、手軽に試してみませんか?
難しいアプリケーション移行、手軽に試してみませんか?難しいアプリケーション移行、手軽に試してみませんか?
難しいアプリケーション移行、手軽に試してみませんか?
 
Attunityのソリューションと異種データベース・クラウド移行事例のご紹介
Attunityのソリューションと異種データベース・クラウド移行事例のご紹介Attunityのソリューションと異種データベース・クラウド移行事例のご紹介
Attunityのソリューションと異種データベース・クラウド移行事例のご紹介
 
そのデータベース、クラウドで使ってみませんか?
そのデータベース、クラウドで使ってみませんか?そのデータベース、クラウドで使ってみませんか?
そのデータベース、クラウドで使ってみませんか?
 
コモディティサーバー3台で作る高速処理 “ハイパー・コンバージド・データベース・インフラストラクチャー(HCDI)” システム『Insight Qube』...
コモディティサーバー3台で作る高速処理 “ハイパー・コンバージド・データベース・インフラストラクチャー(HCDI)” システム『Insight Qube』...コモディティサーバー3台で作る高速処理 “ハイパー・コンバージド・データベース・インフラストラクチャー(HCDI)” システム『Insight Qube』...
コモディティサーバー3台で作る高速処理 “ハイパー・コンバージド・データベース・インフラストラクチャー(HCDI)” システム『Insight Qube』...
 
複数DBのバックアップ・切り戻し運用手順が異なって大変?!運用性の大幅改善、その先に。。
複数DBのバックアップ・切り戻し運用手順が異なって大変?!運用性の大幅改善、その先に。。 複数DBのバックアップ・切り戻し運用手順が異なって大変?!運用性の大幅改善、その先に。。
複数DBのバックアップ・切り戻し運用手順が異なって大変?!運用性の大幅改善、その先に。。
 
Attunity社のソリューションの日本国内外適用事例及びロードマップ紹介[ATTUNITY & インサイトテクノロジー IoT / Big Data フ...
Attunity社のソリューションの日本国内外適用事例及びロードマップ紹介[ATTUNITY & インサイトテクノロジー IoT / Big Data フ...Attunity社のソリューションの日本国内外適用事例及びロードマップ紹介[ATTUNITY & インサイトテクノロジー IoT / Big Data フ...
Attunity社のソリューションの日本国内外適用事例及びロードマップ紹介[ATTUNITY & インサイトテクノロジー IoT / Big Data フ...
 
レガシーに埋もれたデータをリアルタイムでクラウドへ [ATTUNITY & インサイトテクノロジー IoT / Big Data フォーラム 2018]
レガシーに埋もれたデータをリアルタイムでクラウドへ [ATTUNITY & インサイトテクノロジー IoT / Big Data フォーラム 2018]レガシーに埋もれたデータをリアルタイムでクラウドへ [ATTUNITY & インサイトテクノロジー IoT / Big Data フォーラム 2018]
レガシーに埋もれたデータをリアルタイムでクラウドへ [ATTUNITY & インサイトテクノロジー IoT / Big Data フォーラム 2018]
 

Recently uploaded

PCI PIN Basics Webinar from the Controlcase Team
PCI PIN Basics Webinar from the Controlcase TeamPCI PIN Basics Webinar from the Controlcase Team
PCI PIN Basics Webinar from the Controlcase Team
ControlCase
 
UiPath Community Day Dubai: AI at Work..
UiPath Community Day Dubai: AI at Work..UiPath Community Day Dubai: AI at Work..
UiPath Community Day Dubai: AI at Work..
UiPathCommunity
 
Elizabeth Buie - Older adults: Are we really designing for our future selves?
Elizabeth Buie - Older adults: Are we really designing for our future selves?Elizabeth Buie - Older adults: Are we really designing for our future selves?
Elizabeth Buie - Older adults: Are we really designing for our future selves?
Nexer Digital
 
Leading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdfLeading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdf
OnBoard
 
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
SOFTTECHHUB
 
State of ICS and IoT Cyber Threat Landscape Report 2024 preview
State of ICS and IoT Cyber Threat Landscape Report 2024 previewState of ICS and IoT Cyber Threat Landscape Report 2024 preview
State of ICS and IoT Cyber Threat Landscape Report 2024 preview
Prayukth K V
 
The Metaverse and AI: how can decision-makers harness the Metaverse for their...
The Metaverse and AI: how can decision-makers harness the Metaverse for their...The Metaverse and AI: how can decision-makers harness the Metaverse for their...
The Metaverse and AI: how can decision-makers harness the Metaverse for their...
Jen Stirrup
 
Le nuove frontiere dell'AI nell'RPA con UiPath Autopilot™
Le nuove frontiere dell'AI nell'RPA con UiPath Autopilot™Le nuove frontiere dell'AI nell'RPA con UiPath Autopilot™
Le nuove frontiere dell'AI nell'RPA con UiPath Autopilot™
UiPathCommunity
 
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdfFIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance
 
Removing Uninteresting Bytes in Software Fuzzing
Removing Uninteresting Bytes in Software FuzzingRemoving Uninteresting Bytes in Software Fuzzing
Removing Uninteresting Bytes in Software Fuzzing
Aftab Hussain
 
The Future of Platform Engineering
The Future of Platform EngineeringThe Future of Platform Engineering
The Future of Platform Engineering
Jemma Hussein Allen
 
RESUME BUILDER APPLICATION Project for students
RESUME BUILDER APPLICATION Project for studentsRESUME BUILDER APPLICATION Project for students
RESUME BUILDER APPLICATION Project for students
KAMESHS29
 
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
UiPathCommunity
 
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...
James Anderson
 
Pushing the limits of ePRTC: 100ns holdover for 100 days
Pushing the limits of ePRTC: 100ns holdover for 100 daysPushing the limits of ePRTC: 100ns holdover for 100 days
Pushing the limits of ePRTC: 100ns holdover for 100 days
Adtran
 
DevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA ConnectDevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA Connect
Kari Kakkonen
 
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
James Anderson
 
UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4
DianaGray10
 
Epistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI supportEpistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI support
Alan Dix
 
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdfObservability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Paige Cruz
 

Recently uploaded (20)

PCI PIN Basics Webinar from the Controlcase Team
PCI PIN Basics Webinar from the Controlcase TeamPCI PIN Basics Webinar from the Controlcase Team
PCI PIN Basics Webinar from the Controlcase Team
 
UiPath Community Day Dubai: AI at Work..
UiPath Community Day Dubai: AI at Work..UiPath Community Day Dubai: AI at Work..
UiPath Community Day Dubai: AI at Work..
 
Elizabeth Buie - Older adults: Are we really designing for our future selves?
Elizabeth Buie - Older adults: Are we really designing for our future selves?Elizabeth Buie - Older adults: Are we really designing for our future selves?
Elizabeth Buie - Older adults: Are we really designing for our future selves?
 
Leading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdfLeading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdf
 
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
 
State of ICS and IoT Cyber Threat Landscape Report 2024 preview
State of ICS and IoT Cyber Threat Landscape Report 2024 previewState of ICS and IoT Cyber Threat Landscape Report 2024 preview
State of ICS and IoT Cyber Threat Landscape Report 2024 preview
 
The Metaverse and AI: how can decision-makers harness the Metaverse for their...
The Metaverse and AI: how can decision-makers harness the Metaverse for their...The Metaverse and AI: how can decision-makers harness the Metaverse for their...
The Metaverse and AI: how can decision-makers harness the Metaverse for their...
 
Le nuove frontiere dell'AI nell'RPA con UiPath Autopilot™
Le nuove frontiere dell'AI nell'RPA con UiPath Autopilot™Le nuove frontiere dell'AI nell'RPA con UiPath Autopilot™
Le nuove frontiere dell'AI nell'RPA con UiPath Autopilot™
 
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdfFIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
 
Removing Uninteresting Bytes in Software Fuzzing
Removing Uninteresting Bytes in Software FuzzingRemoving Uninteresting Bytes in Software Fuzzing
Removing Uninteresting Bytes in Software Fuzzing
 
The Future of Platform Engineering
The Future of Platform EngineeringThe Future of Platform Engineering
The Future of Platform Engineering
 
RESUME BUILDER APPLICATION Project for students
RESUME BUILDER APPLICATION Project for studentsRESUME BUILDER APPLICATION Project for students
RESUME BUILDER APPLICATION Project for students
 
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
 
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...
 
Pushing the limits of ePRTC: 100ns holdover for 100 days
Pushing the limits of ePRTC: 100ns holdover for 100 daysPushing the limits of ePRTC: 100ns holdover for 100 days
Pushing the limits of ePRTC: 100ns holdover for 100 days
 
DevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA ConnectDevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA Connect
 
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
 
UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4
 
Epistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI supportEpistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI support
 
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdfObservability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
 

[db tech showcase Tokyo 2018] #dbts2018 #B38 『Big Data and the Multi-model Database』

  • 1. Big Data and the Multi-model Database Heli Helskyaho DB Tech Showcase Tokyo, 2018
  • 2. Copyright © Miracle Finland Oy  Graduated from University of Helsinki (Master of Science, computer science), currently a doctoral student, researcher and lecturer (databases, Big Data, Multi-model Databases, methods and tools for utilizing semi-structured data for decision making) at University of Helsinki  Worked with Oracle products since 1993, worked for IT since 1990  Data and Database!  CEO for Miracle Finland Oy  Oracle ACE Director  Ambassador for EOUC (EMEA Oracle Users Group Community)  Listed as one of the TOP 100 influences on IT sector in Finland (2015, 2016, 2017)  Public speaker and an author  Author of the book Oracle SQL Developer Data Modeler for Database Design Mastery (Oracle Press, 2015), co-author for Real World SQL and PL/SQL: Advice from the Experts (Oracle Press, 2016) Introduction, Heli
  • 3. Copyright © Miracle Finland Oy
  • 4. Copyright © Miracle Finland Oy References  [1] Marcello Buoncristiano, Giansalvatore Mecca, Elisa Quintarelli, Manuel Roveri, Donatello Santoro, Letizia Tanca: Database Challenges for Exploratory Computing. SIGMOD Record 44(2): 17-22 (2015).  [2] Stratos Idreos, Olga Papaemmanouil, Surajit Chaudhuri: Overview of Data Exploration Techniques. SIGMOD Conference 2015: 277-281.  [3] Zhen Hua Liu, Dieter Gawlick: Management of Flexible Schema Data in RDBMSs - Opportunities and Limitations for NoSQL. 7th Biennial Conference on Innovative Data Systems Research (CIDR ’15) January 4-7, 2015, Asilomar, California, USA.  [4] Z. H. Liu, B. Hammerschmidt, D. McMahon, Y. Liu, H.J.Chang: Closing the functional and Performance Gap between SQL and NoSQL. SIGMOD Conference 2016: 227-238  [5] Z. H. Liu, B. Christoph Hammerschmidt, D. McMahon: JSON data management: supporting schema-less development in RDBMS. SIGMOD Conference 2014: 1247-1258  [6] https://docs.oracle.com/database/122/ADJSN/json-dataguide.htm#ADJSN-GUID- 219FC30E-89A7-4189-BC36-7B961A24067C
  • 5. Copyright © Miracle Finland Oy  Solutions for storing and retrieving data  The network data model, the hierarchical data model …  The relational data model  The relational database management system (RDBMS)  has been de facto as a data model solution for decades  based on solid theory  standardized environment for storing and retrieving data The history of data models, briefly
  • 6. Copyright © Miracle Finland Oy  of software development and programming languages leads to new demands for data models The evolution…
  • 7. Copyright © Miracle Finland Oy  The need to save and retrieve objects easily using object-oriented programming languages  -> Object databases  -> Later these features were added to a RDBMS making it an ORDMS (Object-Relational DataBase Management System)  Supported in Oracle Database Objects
  • 8. Copyright © Miracle Finland Oy  To transfer data in a standardized way  The need to save and retrieve XML documents easily  XML Databases  Later these features were added to the RDBMS  Supported in Oracle Database (XMLType and its methods, SQL functions, indexing, registering XML Schema,…) XML (Extensible Markup Language)
  • 9. Copyright © Miracle Finland Oy  The need to save and retrieve spatial data easily  Spatial Databases  Later these features were added to the RDBMS  Supported in Oracle Database Spatial
  • 10. Copyright © Miracle Finland Oy  Relational Data: ”schema first, data later”  Flexible Schema Data (FSD): ”data first, schema later/never” New demand for a Data Model: FSD
  • 11. Copyright © Miracle Finland Oy  An ”umbrella” for different solutions, each for a certain purpose/problem  Document Stores  Key-value pair stores  Column stores  Graph stores  No ACID (Atomicity, Consistency, Isolation, and Durability) but maybe BASE (Basically Available, Soft state, Eventual consistency).  Scale-up vs scale-out NoSQL
  • 12. Copyright © Miracle Finland Oy  JSON for Javascript programming, lighter version of XML  The need to save and retrieve JSON (Javascript Notation) documents easily  Supported in Oracle Database (CLOB, NCLOB) Document Stores
  • 13. Copyright © Miracle Finland Oy  something between semi-structured and text, key is structured and value is text  Supported in Oracle Database (+Oracle NoSQL) Key-value pair stores
  • 14. Copyright © Miracle Finland Oy  Data stored in columns, not rows  Supported in Oracle Database (in-memory column store) Column stores
  • 15. Copyright © Miracle Finland Oy  About relationships  Nodes/vertices, links/edges and properties  Each vertix and each edge has a collection of key-value properties  A graph can be created from a relational model quite easily  Supported in Oracle Database  To me the most interesting of these NoSQL solutions….  But worth a presentation of its own Graph stores
  • 16. Copyright © Miracle Finland Oy  Slowly these features have been added to the RDBMS…  But can this continue?  New feature-> new db-> added to RDBMS NoSQL
  • 17. Copyright © Miracle Finland Oy Key Findings  Operational DBMS use cases have expanded beyond traditional transactions — still the leading revenue generator — to encompass three additional use cases: distributed variable data; lightweight events and observations; and hybrid transactional/analytical processing (HTAP).  Relational operational DBMS products and emerging NoSQL and multimodel offerings are entering the market to address some of these new use cases, creating opportunities for best-fit engineering, but threatening established company standards for information management.  Incumbent vendors are responding by expanding the capabilities of their relational database management systems (RDBMSs) into multimodel territory, but hedging their bets by also adding specialized products to their portfolios, creating both multimodel and multiproduct choices — and confusion in the minds of buyers. Gartner, Critical Capabilities for Operational Database Management Systems, Dec 9th 2015
  • 18. Copyright © Miracle Finland Oy Recommendations  Identify the use cases you have under consideration, and assess your current operational DBMS products' fit, costs, deployment options and skills requirements.  Evaluate potential alternative product selections based on best-fit engineering that maps to your use cases — don't pay for features you don't and will not need. Use this Critical Capabilities document alongside its companion Magic Quadrant to identify candidates.  Issue RFPs (using Gartner's RFP Toolkit templates) and conduct proof of concept (POC) exercises to confirm and validate candidates.  Update your standards, training and support organizations to manage, qualify and accommodate additional requirements if your technology portfolio is expanding. Gartner, Critical Capabilities for Operational Database Management Systems, Dec 9th 2015
  • 19. Copyright © Miracle Finland Oy Strategic Planning Assumptions  By 2017, all leading operational DBMSs will offer multiple data models, relational and NoSQL, in a single DBMS platform.  Through 2018, 50% of small operational DBMS vendors will disappear due to acquisitions, mergers or business failures.  By 2017, the "NoSQL" label will cease to distinguish DBMSs, reducing its value and resulting in the label falling out of use. Gartner, Critical Capabilities for Operational Database Management Systems, Dec 9th 2015
  • 20. Copyright © Miracle Finland Oy  Brings even more than just the need for new data models… The Big Data…
  • 21. Copyright © Miracle Finland Oy  There is no size that makes a data to be ”Big Data”, it always depends on the capabilities  The data is ”Big” when traditional processing with traditional tools is not possible due to the amount or the complexity of the data  You cannot open an attachement in email  You cannot edit a photo  etc. What is Big Data?
  • 22. Copyright © Miracle Finland Oy  Volume, the size/scale of the data  Velocity, the speed of change, analysis of streaming data  Variety, different formats of data sources, different forms of data; structured, semi-structured, unstructured The three V’s
  • 23. Copyright © Miracle Finland Oy  Veracity, the uncertainty of the data, the data is worthless or harmful if it’s not accurate  Viability, validate that hypothesis before taking further action (and, in the process of determining the viability of a variable, we can expand our view to determine other variables)  Value, the potential value  Variability, refers to data whose meaning is constantly changing, in consistency of data; for example words and context  Visualization, a way of presenting the data in a manner that’s readable and accessible The other V’s
  • 24. Copyright © Miracle Finland Oy  Are easy to understand and there are plenty of technical solutions to solve the problems but… Volume and Velocity
  • 25. Copyright © Miracle Finland Oy Variety leads to…  -> diversity, complexity, …  Users can not query all of their data using a single high level declarative query language. They have to  use different query languages for querying different data  implement their own join algorithms to join between relational data and FSD  Also many specialized systems lack essential advanced functionalities, such as ACID and fine grain security, that are the “norm” in RDBMSs
  • 26. Copyright © Miracle Finland Oy Data Exploration  Data exploration is to efficiently extract knowledge from data, even though you do not know what you are looking for.  Exploratory computing: a conversation between a user and a computer.  The system provides active support  The exploration process is investigation, exploration- seeking, comparison-making, and learning  Wide range of different techniques is needed (the use of statistics, data analysis, query suggestion, advanced visualization tools, etc.)
  • 27. Copyright © Miracle Finland Oy Not a new thing…  J. W. Tukey. Exploratory data analysis. Addison- Wesley, Reading,MA, 1977:  “with exploratory data analysis the researcher explores the data in many possible ways, including the use of graphical tools like boxplots or histograms, gaining knowledge from the way data are displayed”
  • 28. Copyright © Miracle Finland Oy Old problems and new problems  More and more data (volume)  Different data models and formats (variety)  Loading in progress while data exploration going on (velocity)  Not all data is reliable (veracity)  We do not know what we are looking for (value, viability, variability)  Must support also non-technical users (journalists, investors, politicians,…) (visualization)  All must be done efficiently and fast
  • 29. Copyright © Miracle Finland Oy Different Layers  the user interaction/user interface  the middleware  the database layer/engine
  • 30. Copyright © Miracle Finland Oy Challenges for the User Interaction  The user interface should help the user to find information she/he was not aware of and to find even more information based on the information already found.  The data is in different formats, it is a large amount of different kinds of data and analyzing it takes time.  The conversation must be fluently responsive.  Not all users are IT professionals  The query result visualization to understand the data and to communicate with the exploration software would be valuable
  • 31. Copyright © Miracle Finland Oy User Interaction, Query Visualization  The role of visualization for data analytics is huge (a picture tells more than 1000 words)  visualization tools that assist users in navigating the underlying data structures  tools that incorporate new types of interactions such as collaborative annotations and searches  Query result reduction (aggregation, sampling and/or filtering operations)  Query result visualization
  • 32. Copyright © Miracle Finland Oy User Interaction, User Involvement  the relevance of an answer for a specific user  predict users’ interests and anticipations in order to issue the most relevant answer  Some possible techniques  Grouping the users (subgroups)  classify different notions of relevance  devise strategies to customize the system response to the user behavior
  • 33. Copyright © Miracle Finland Oy Middleware  Building various optimizations on top of the database layer/engine  Can be used to improve the efficiency of the data exploration without changing the underlying architecture  The exploration tasks are computationally heavy and can be speedup in the middleware layer with  Data Prefetching (and background execution)  Query Approximation
  • 34. Copyright © Miracle Finland Oy Middleware, Data Prefetching  caching data sets which are likely to be used  the trade-off between introducing new results and re-using cached ones  the main challenge is identifying the data set with the highest utility
  • 35. Copyright © Miracle Finland Oy Middleware, Query Approximation  The system offers approximate answers  Allows users to get a quick sense of whether a particular query reveals anything interesting about the data  Need solutions that process queries on sampled data sets to provide fast query response times  the trade-off between results accuracy and query performance
  • 36. Copyright © Miracle Finland Oy Database Engine/Layer, Solutions 1. Adaptive Indexing 2. Adaptive Loading 3. Adaptive Storage 4. Flexible Architecture 5. Architectures tailored for approximation processing
  • 37. Copyright © Miracle Finland Oy Adaptive Indexing  creating indexes incrementally and adaptively during query processing based on the columns, tables and value ranges that queries request  Indexes are built gradually; as more queries arrive indexes are continuously fine-tuned
  • 38. Copyright © Miracle Finland Oy Adaptive Loading  Not all data is needed; users might want to leave some parts of the data unloaded  Not all data is available; users might start querying a database system before all data is loaded
  • 39. Copyright © Miracle Finland Oy Adaptive Storage  There is no one perfect solution for storage layout for all the data.  Adaptive Storage can be used for different needs of storage layouts  Examples  OctopusDB (a central log and Storage Views)  H2O (supports multiple storage layouts and data access patterns, decides during query processing, which design is best for classes of queries)  In-memory columnar stores (for example Oracle)  real-time materialized views (Oracle 12c rel 2)
  • 40. Copyright © Miracle Finland Oy Flexible Architecture  can tune the architecture to the task at hand  by having a declarative interface for the physical data layouts  by having a declarative interface for the whole engine  using organic databases to continuously match incoming data and queries  Start with a schema that is ”good-enough”, schema evolution while loading and querying
  • 41. Copyright © Miracle Finland Oy Architectures tailored for approximation processing  An architecture where storage and access patterns are tailored to efficiently support sampling-based query processing  efficient access and updates of sampled data sets  In the DB core  For example Oracle Database 12c: Adaptive Dynamic Sampling (for the optimizer)  DICE (Distributed and Interactive Cube Exploration) faceted exploration of data cubes; combines sampling with speculative execution in the same engine  Design new engines is one of the open challenges.
  • 42. Copyright © Miracle Finland Oy Techniques for Large Data Size  Summarizing: fully pre-computing is not possible and not smart  The size of data summarized must be large enough to estimate statistical parameters and distributions, but manageable from the computation viewpoint  Data mining techniques  Histograms, presenting a histogram algebra for efficiently executing queries on the histograms instead of on the data, and estimating the quality of the approximate answers  Effective pruning techniques, not all node-paths are interesting from the user viewpoint, and the ones that fail to satisfy this requirement should be discarded as soon as possible.
  • 43. Copyright © Miracle Finland Oy  Two possibilities:  Polyglot persistence  Multimodel database Different Data Models
  • 44. Copyright © Miracle Finland Oy  Different kinds of data models are best dealt with different data stores  Data is stored using multiple data storage technologies to support better the usage of the data (applications, components) Polyglot Persistence
  • 45. Copyright © Miracle Finland Oy  Even using Oracle technology you can build a polyglot solution:  Berkeley DB as a Key-Value store  Oracle NoSQL Database as a Key-Value and sharded database  OracleTimesTen as an In-Memory Database  Essbase for analytic processing  … Oracle and Polyglot Persistence
  • 46. Copyright © Miracle Finland Oy  Different query languages  Different frontends/user interfaces  How to join data?  Is data consistents?  Is data secured?  Many different skills needed  … What’s wrong with polyglot
  • 47. Copyright © Miracle Finland Oy Polyglot vs. Multimodel
  • 48. Copyright © Miracle Finland Oy  A multi-model database is designed to support multiple data models against a single, integrated backend  One query language  All data available  ACID (or any level needed), security, backup/recovery, …  All the ”good” from relational added with all the ”good” from everywhere else Multi-model database
  • 49. Copyright © Miracle Finland Oy How to save and how to retrive the data?
  • 50. Copyright © Miracle Finland Oy Database Engine/Layer  A database is all about saving and retrieving data  The way the data is stored defines the best possible ways of accessing it, the way data is accessed defines the best way to store it…  We might not know what data we will be loading  We do not know what we are looking for -> therefore are unable to define the ”best” way of storing  how can the architecture of database systems be redesigned to aid data exploration tasks?
  • 51. Copyright © Miracle Finland Oy  Efficiently, Reliably  Understanding the data while saving  finding the PK to be able to join to other data  Taxonomies to understand that first name = fname?  Metadata Management  …  Categorizing  Some data is more valuable than other  some data is more reliable than other  not all the data want to be stored in the same, operational database  data classification (reliable/no reliable) Saving
  • 52. Copyright © Miracle Finland Oy  The data need to be classified before storing and the qualification might affect where and on which format the data is stored.  Maybe transforming?  Usually the disk space in RDBMS is expensive  -> new ways to store the data (Hadoop,…)  What data to save close to what and why  todays spam is tomorrows ham (possibility to re- categorize)  How to handle date datatype?  … Saving
  • 53. Copyright © Miracle Finland Oy  Quite often the solution for saving a new data model is CLOB, BLOB or external table  But we need more than that… Oracle Database
  • 54. Copyright © Miracle Finland Oy Interesting solutions in Oracle Database
  • 55. Copyright © Miracle Finland Oy  save as it is, use it with SQL  Now only JSON, something more in the future?  “write without schema, read with schema” JSON DataGuide in Oracle Database 12cR2
  • 56. Copyright © Miracle Finland Oy  Enabling FSDM (Flexible Schema Data Management) in a RDBMS  Storing JSON objects as aggregated, self-described entities without shredding them into relational rows and columns  Indexing JSON using a schema-agnostic strategy to support ad-hoc queries that search both schema and values together  Querying JSON using SQL as the inter-document query language and SQL/JSON path language as the intra- document query language JSON DataGuide in Oracle Database 12cR2 and on
  • 57. Copyright © Miracle Finland Oy  DataGuide information is part of the JSON search index infrastructure  CREATE SEARCH INDEX FOR JSON  Can be created for index only, DataGuide only, both (the default is both)  database privilege CTXAPP needed  The content of DataGuide is automatically updated whenever the index is synchronized.  To obtain the DataGuide information stored in a JSON search index use PL/SQL function DBMS_JSON.get_index_dataguide Creating a DataGuide, Persistent
  • 58. Copyright © Miracle Finland Oy  When the index is synced, additive updates to the document set are automatically reflected in the persisted DataGuide information  Deletions are not updated  you must create the index again if the data of deletion is needed  can also include statistics (how frequently each JSON field is used in the document set)  Explicitly gathered (EXEC DBMS_STATS.gather_index_stats(username, 'json_doc_search_idx', estimate_percent => 90);)  Not updated automatically JSON search index and Data Guide
  • 59. Copyright © Miracle Finland Oy  There are two formats for a DataGuide (both defined as CLOB):  Flat  to query DataGuide information such as field frequencies and types  Hierarchical  to add virtual columns, or to create a view using particular fields that you choose on the basis of DataGuide information  (DBMS.JSON.FORMAT_FLAT or DBMS.JSON.FORMAT_HIERARCHICAL) Formats of DataGuide
  • 60. Copyright © Miracle Finland Oy  Based on DataGuide information for a JSON column, you can project scalar fields as virtual columns to the same table that contains this JSON column  From  hierarchical DataGuide or  DataGuide enabled JSON search index  DBMS_JSON.add_virtual_columns , DBMS_JSON.drop_virtual_columns, DBMS_JSON.rename_column  can decide which columns should be projected based on the frequency of the elements in the documents stored in the JSON column (frequency => ) DataGuide and a Virtual Column
  • 61. Copyright © Miracle Finland Oy  To improve the performance you can  Build an index on it  Gather statistics on it for the optimizer  Load it into the In-Memory Column Store DataGuide and a Virtual Column
  • 62. Copyright © Miracle Finland Oy  Relational views on top of a set of JSON files for querying the data as it were relational  still only saved once and as a JSON document  You can create multiple views based on the same JSON document set, projecting different fields JSON DataGuide, views
  • 63. Copyright © Miracle Finland Oy  Based on a Hierarchical DataGuide  The fields projected are those in the DataGuide  You can edit the DataGuide to include only the fields that you want to project.  DBMS_JSON.create_view, DBMS_JSON. create_view_on_path (create the view based on the frequency of the elements)  Based on a Path Expression  You can use the information in a DataGuide enabled JSON search index to create a database view  Columns project JSON fields from your documents  The fields projected are the scalar fields and the scalar fields in the data targeted by a specified SQL/JSON path expression. Creating views over JSON data
  • 64. Copyright © Miracle Finland Oy CREATE TABLE json_documents ( id RAW(16) NOT NULL, data CLOB, CONSTRAINT json_documents_pk PRIMARY KEY (id), CONSTRAINT json_documents_json_chk CHECK (data IS JSON) ); Creating views over JSON data, the table
  • 65. Copyright © Miracle Finland Oy BEGIN DBMS_JSON.create_view_on_path( viewname => 'json_documents_view', tablename => 'json_documents', jcolname => 'data', path => '$', frequency => 50); END; / Creating views over JSON data
  • 66. Copyright © Miracle Finland Oy DESC json_documents_view Name Null? Type ----------------------------------------- -------- ---------------------------- ID NOT NULL RAW(16) FIRST_NAME VARCHAR2(30) LAST_NAME VARCHAR2(30) STREET VARCHAR2(50) POSTCODE VARCHAR2(10) CITY VARCHAR2(30) COUNTRY VARCHAR2(100) EMAIL VARCHAR2(100) PHONE VARCHAR2(20) Creating views over JSON data, the view
  • 67. Copyright © Miracle Finland Oy  It enables FSDM (Flexible Schema Data Management) in a RDBMS  And as a solution might be used in other cases for different semi-structured data models in the future… Why DataGuide is interesting?
  • 68. Copyright © Miracle Finland Oy  Save as it is, copy to a faster data model kept in- memory  Now only columnar store In-Memory
  • 69. Copyright © Miracle Finland Oy  Very interesting for retrieving data…  Saving data as its original data model and showing a copy of it on an in-memory version of another data model when needed  Original data as it is and a copy of it created in another data model when needed  Fast (relative term ☺ )  Synchronized automatically  No additional storage needed  The version of a data model is used that works best for the need  Maybe other data models could be stored in-memory too? Graph? In-Memory
  • 70. Copyright © Miracle Finland Oy  So many data models  The network data model  The hierarchical data model  The relational data model  Objects  XML  Spatial  NoSQL models  …  Flexible Schema Data (FSD): ”data first, schema later/never” Conclusion
  • 71. Copyright © Miracle Finland Oy  Several V’s related to Big Data…  Volume  Velocity  Variety  Veracity  Viability  Value  Variability  Visualization  … Conclusion
  • 72. Copyright © Miracle Finland Oy Conclusion  Data exploration is about finding new information and knowledge from the data without knowing what you are looking for. And it brings a lot of pressure for the performance…in all levels:  the user interaction/user interface  the middleware  the database layer/engine
  • 73. Copyright © Miracle Finland Oy Conclusion  Various data models, solutions  Polyglot  Multimodel database  Promising solutions in Oracle Database  JSON DataGuide in Oracle Database 12cR2  In-Memory  Plenty of research going on in this area and results needed soon…
  • 74. THANK YOU! QUESTIONS? Email: heli@miracleoy.fi Twitter: @HeliFromFinland Blog: Helifromfinland.com