Welcome
NoSQL for Data Services, Data Virtualization & Big Data
Guido Schmutz
25.9.2012




BASEL   BERN   LAUSANNE     ZÜRICH         DÜSSELDORF          FRANKFURT A.M.   FREIBURG I.BR.   HAMBURG   MÜNCHEN   STUTTGART   WIEN




                      2012 © Trivadis
 1
                      NoSQL for Data Services, Data Virtualization & Big Data
                      25.9.2012
Guido Schmutz

•   Working for Trivadis for more than 15 years
•   Oracle ACE Director for Fusion Middleware and SOA
•   Co-author of several books
•   Consultant, Trainer, Software Architect for Java, Oracle, SOA
    and EDA
•   Member of Trivadis Architecture Board
•   Technology Manager @ Trivadis


•   More than 20 years of software development experience


•   Contact: guido.schmutz@trivadis.com
•   Blog: http://guidoschmutz.wordpress.com
•   Twitter: gschmutz

Agenda


1. Why NoSQL and what is it?
2. NoSQL Database Types
3. Polyglot Persistence
4. Data Virtualization Layer
5. Summary




History of Databases


1960s            File-based, Network (CODASYL) and Hierarchical Databases
1970s            Relational Databases
1980s            SQL became the standard query language
Early 1990s      Object Databases
Late 1990s       XML Databases
2004             NoSQL Databases




What's wrong with Relational Databases? They are great ...


•   SQL provides a rich, declarative query language
•   Databases enforce referential integrity
•   ACID semantics
•   Well understood by developers and database administrators
•   Well supported by different languages, frameworks and tools
    • Hibernate, JPA, JDBC, iBATIS, Entity Framework
•   Well understood and accepted by operations people (DBAs)
    •   Configuration
    •   Monitoring
    •   Backup and Recovery
    •   Tuning
    •   Design
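
Two of the strengths listed above, referential integrity and ACID transactions, can be sketched with Python's built-in sqlite3 module. This is a minimal illustration, not part of the original slides; the table names, column names and the helper place_orders are assumptions made for the example.

```python
import sqlite3

# In-memory database; all table and column names are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    "CREATE TABLE orders ("
    " id INTEGER PRIMARY KEY,"
    " customer_id INTEGER NOT NULL REFERENCES customer(id))"
)
conn.execute("INSERT INTO customer (id, name) VALUES (1, 'Peter Sample')")
conn.commit()


def place_orders(customer_ids):
    """Insert one order per customer id, all or nothing (hypothetical helper)."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            for customer_id in customer_ids:
                conn.execute(
                    "INSERT INTO orders (customer_id) VALUES (?)", (customer_id,)
                )
        return True
    except sqlite3.IntegrityError:
        return False


ok = place_orders([1, 1])          # both rows reference an existing customer
rejected = place_orders([1, 999])  # customer 999 does not exist
order_count = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
```

The second call fails on the unknown customer and the whole transaction is rolled back, so the valid row inserted just before it is not kept either.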

Relational Databases are great ... But!

Problem: Complex object graphs
     Object/relational impedance mismatch
     Complicated to map a rich domain model to a relational schema
     Performance issues
      - Many rows in many tables
      - Many joins
      - Eager vs. lazy loading

[Figure: an Order object (ID 1001, order date 15.9.2012; customer Peter Sample; billing address Somestreet 10, Somewhere, postal code 55901; line items iPod Touch 1 x 220.95, Monster Beat 2 x 190.00, Apple Mouse 1 x 69.90) mapped to the tables ORDER, CUSTOMER, ADDRESS and ORDER_LINES]




Problem: Schema evolution
     Adding attributes to an object => have to add columns to table
     Expensive, if lots of data in that table
      - Holding locks on the tables for long time
      - Application downtime …
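
The object-graph problem on this slide can be sketched in a few lines: one order object has to be taken apart into rows for four tables before it can be stored. This is an illustrative sketch only; the tuple layouts, the function to_rows and the surrogate ids are assumptions, not taken from the slides.

```python
# One order object, using the sample values from the slide's figure.
order = {
    "id": 1001,
    "order_date": "2012-09-15",
    "customer": {
        "first_name": "Peter",
        "last_name": "Sample",
        "billing_address": {
            "street": "Somestreet 10",
            "city": "Somewhere",
            "postal_code": "55901",
        },
    },
    "line_items": [
        {"name": "iPod Touch", "quantity": 1, "price": 220.95},
        {"name": "Monster Beat", "quantity": 2, "price": 190.00},
        {"name": "Apple Mouse", "quantity": 1, "price": 69.90},
    ],
}


def to_rows(order, customer_id, address_id):
    """Split one order object into rows for four tables (hypothetical layout)."""
    customer = order["customer"]
    address = customer["billing_address"]
    return {
        "ADDRESS": [
            (address_id, address["street"], address["city"], address["postal_code"])
        ],
        "CUSTOMER": [
            (customer_id, customer["first_name"], customer["last_name"], address_id)
        ],
        "ORDER": [(order["id"], order["order_date"], customer_id)],
        "ORDER_LINES": [
            (order["id"], position, item["name"], item["quantity"], item["price"])
            for position, item in enumerate(order["line_items"], start=1)
        ],
    }


rows = to_rows(order, customer_id=1, address_id=55)
row_count = sum(len(table_rows) for table_rows in rows.values())
```

Reading the order back requires the reverse trip: joins across the same four tables, which is where the "many joins" and eager-versus-lazy loading questions come from.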

Relational Databases are great ... But!

[Figure: the Order object from the previous slide (order 1001 with customer, billing address and three line items) spread over the tables ORDER, CUSTOMER, ADDRESS and ORDER_LINES, next to the access stack Consumer -> REST/SOAP -> Service -> Repository/DAO -> O/R Mapping -> SQL -> RDBMS]




Relational Databases are great ... But!


Problem: Semi-structured data
     Relational schema doesn't easily handle semi-structured data
     Common solutions
      - Name/Value table
        - Poor performance
        - Lack of constraints
      - Serialize as Blob
        - Fewer joins, but no query capabilities
Problem: Scaling
     Scaling writes is difficult, expensive or impossible => Big Data
     Vertical scaling is limited and expensive
     Horizontal scaling is limited and expensive
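
The two common workarounds for semi-structured data named on this slide can be compared side by side. The sketch below uses Python's sqlite3 and json modules; the tables product_attr and product_blob and the sample attributes are assumptions made for the illustration.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
# Workaround 1: name/value table, one row per attribute, every value stored as text.
conn.execute("CREATE TABLE product_attr (product_id INTEGER, name TEXT, value TEXT)")
# Workaround 2: the whole attribute set serialized into a single blob column.
conn.execute("CREATE TABLE product_blob (product_id INTEGER PRIMARY KEY, data BLOB)")

attrs = {"color": "black", "capacity_gb": 32}  # illustrative attributes
conn.executemany(
    "INSERT INTO product_attr VALUES (?, ?, ?)",
    [(1000, name, str(value)) for name, value in attrs.items()],
)
conn.execute(
    "INSERT INTO product_blob VALUES (?, ?)",
    (1000, json.dumps(attrs).encode("utf-8")),
)

# Name/value: attributes stay queryable, but every extra filter needs another
# self-join, and nothing constrains attribute names or value types.
by_attr = conn.execute(
    "SELECT a.product_id FROM product_attr a"
    " JOIN product_attr b ON a.product_id = b.product_id"
    " WHERE a.name = 'color' AND a.value = 'black'"
    " AND b.name = 'capacity_gb' AND b.value = '32'"
).fetchall()

# Blob: a single row and no joins, but the database cannot look inside the value,
# so any filtering on attributes happens in application code.
blob = conn.execute(
    "SELECT data FROM product_blob WHERE product_id = 1000"
).fetchone()[0]
restored = json.loads(blob.decode("utf-8"))
```

Note that the numeric capacity comes back as the string '32' from the name/value table, which is the "lack of constraints" drawback in practice.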



Solution: NoSQL ?


No standard definition of what NoSQL means
•   Not Only SQL
Term began in a workshop organized in 2009
but some common characteristics of NoSQL databases
•   They don't use the relational data model and thus don't use SQL
•   Tend to be designed to run on a cluster
•   Tend to be Open Source
•   Schema-less: don't have a fixed schema, so any data can be stored in any record
•   Different APIs

[Figure: RDBMS vs. NoSQL by tier. RDBMS: Presentation Tier (User Interface), Middle Tier (Object-Relational, Relational-Object), Database Tier (Search, Blobs, Transactions, Batch, Caching, Triggers). NoSQL: User Interface, Services (Caching, Search, Transactions, Batch; Key Value Stores, Lucene, MapReduce), Data]




Central vs. Application Databases

Central Database
•    Using SQL as the integration mechanism between applications
•    Applications store data in a common DB
•    Improves communication, all applications operate on a consistent set of data
•    Structure ends up being more complex
•    Changes need to be coordinated with all other applications using the database
•    Side effects (e.g. adding a database index)

Application Database
•    Only accessed by a single application
•    Only the application using the database needs to know about the structure
•    Easier to maintain and evolve the schema
•    More freedom to choose the database
•    Applicable to SOA (i.e. Data Service/Entity Service) with good Service Autonomy
•    Ready for the cloud

[Figure: Central database: Application 1, Application 2 and Application 3 share one DB. Application database: each of the three applications has its own DB]


Relational vs. Aggregate Data Models


Relational
•   The relational model takes the information and divides it into tuples (rows)
•   A tuple is a limited data structure
      - no nesting of tuples
      - no list of values

Aggregate
•   Aggregate is a term that comes from Domain-Driven Design (Evans)
•   An aggregate is a collection of related objects that should be treated as a unit
•   Unit for data manipulation and management of consistency
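
A minimal sketch of what "treated as a unit" can mean in code: the order is the aggregate root, its items are only changed through it, and the whole aggregate is written and read as one value. The class names, the price rule and the dict standing in for an aggregate-oriented store are assumptions for illustration.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class OrderItem:
    product_id: int
    price: float


@dataclass
class Order:
    """Aggregate root: items are only added through the order itself."""
    id: int
    customer_id: int
    items: list = field(default_factory=list)

    def add_item(self, product_id, price):
        # Consistency rule checked at the aggregate boundary (illustrative rule).
        if price < 0:
            raise ValueError("price must not be negative")
        self.items.append(OrderItem(product_id, price))

    def total(self):
        return sum(item.price for item in self.items)


store = {}  # stand-in for an aggregate-oriented database


def save(order):
    # The whole aggregate is written as one value under one key.
    store[order.id] = json.dumps(asdict(order))


def load(order_id):
    raw = json.loads(store[order_id])
    items = [OrderItem(**item) for item in raw.pop("items")]
    return Order(items=items, **raw)


order = Order(id=90, customer_id=1)
order.add_item(1000, 250.55)
order.add_item(1020, 199.55)
save(order)
loaded = load(90)
```

Because the aggregate is the unit of storage, loading it needs a single lookup by key instead of joins across several tables.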




Relational vs. Aggregate Data Model


Relational Instance

CUSTOMER
 ID    NAME
  1    Guido

PRODUCT
 ID      NAME
 1000    iPod Touch
 1020    Monster Beat

BILLING_ADDRESS
 ID    CUSTOMER_ID    ADDRESS_ID
  1         1             55

ADDRESS
 ID    STREET         CITY       POST_CODE
 55    Chaumontweg    Spiegel    3095

ORDER
 ID    CUSTOMER_ID    SHIPPING_ADDRESS_ID
 90         1                 55

ORDER_ITEM
 ID    ORDER_ID    PRODUCT_ID    PRICE
  1       90          1000       250.55
  2       90          1020       199.55

Aggregate Instance

{
  "id": 1,
  "name": "Guido",
  "billingAddress": [{"street": "Chaumontweg", "city": "Spiegel", "postCode": "3095"}]
}

{
  "id": 90,
  "customerId": 1,
  "orderItems": [
    {"productId": 1000, "price": 250.55, "productName": "iPod Touch"},
    {"productId": 1020, "price": 199.55, "productName": "Monster Beat"}
  ],
  "shippingAddress": [{"street": "Chaumontweg", "city": "Spiegel", "postCode": "3095"}]
}




Agenda


1. Why NoSQL and what is it?
2. NoSQL Database Types
3. Polyglot Persistence
4. Data Virtualization Layer
5. Summary




NoSQL Database Types

•   Key/Value Stores
    - Ordered Key-Value Stores
    - Big Table Stores (map-of-maps-of-maps)
•   Document Stores
•   Graph Databases

                   Key/Value          Column Family         Document             Graph
Design             Collections of     Columns and column    Key/value pairs,     Focus on the
                   key/value pairs    families; accesses    but the value is     connections between
                                      the column values     interpreted by       data and on fast
                                      directly              the database         navigation
Scalability/       +++                +++                   ++                   ++
Performance
Aggregate-         Yes                Yes                   Yes                  No
oriented
Complexity         +                  ++                    ++                   +++
Inspiration and    Berkeley DB,       SAP Sybase IQ,        Lotus Notes          Graph Theory
Relation           Memcached,         BigTable
                   Distributed
                   Hashmaps
NoSQL              Voldemort          HBase                 CouchDB              Sones
Products           Redis              Cassandra             MongoDB              Neo4J
                   Riak               Hypertable            OrientDB             InfoGrid
                                      Amazon SimpleDB       RavenDB              FlockDB
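NoSQL Database Types

[Figure: data size (vertical axis) plotted against complexity (horizontal axis), positioning key-value stores, column family stores, document databases, graph databases and relational databases]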

Key Value Databases

•   A key-value store is a simple hash table
•   Primarily used when all access to the database is via primary key
•   Simplest NoSQL data stores to use (from an API perspective)
    - PUT, GET, DELETE (matches REST)
•   Value is a blob, with the data store not caring or knowing what is inside
•   Aggregate-Oriented


Suitable Use Cases
•        Storing Session Information
•        User Profiles, Preferences
•        Shopping Cart Data


Column-Family Stores

•   Store data in column families as rows that have many columns associated with a
    row key
•   Column families are groups of related data, often accessed together
•   Aggregate-Oriented

Suitable Use Cases
•        Event Logging
•        Content Management Systems
•        Counters
•        Expiring Usage

Source: NoSQL Distilled
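
The row key / column family / column structure can be pictured as a map of maps of maps. The following toy model is an assumption-laden sketch (class name, family names and sample data are invented for the example) and ignores everything a real column-family store adds, such as sorting, timestamps and distribution.

```python
from collections import defaultdict


class ColumnFamilyStore:
    """Toy model: row key -> column family -> column name -> value."""

    def __init__(self, families):
        self._families = set(families)
        self._rows = defaultdict(lambda: defaultdict(dict))

    def put(self, row_key, family, column, value):
        # Column families are fixed up front; columns inside them are not.
        if family not in self._families:
            raise KeyError(f"unknown column family: {family}")
        self._rows[row_key][family][column] = value

    def get_family(self, row_key, family):
        """Return all columns of one family for a row (empty dict if none)."""
        row = self._rows.get(row_key)
        if row is None:
            return {}
        return dict(row.get(family, {}))


store = ColumnFamilyStore(["profile", "counters"])
store.put("customer:1", "profile", "name", "Guido")
store.put("customer:1", "profile", "city", "Spiegel")
store.put("customer:1", "counters", "page_views", 3)
store.put("customer:2", "profile", "name", "Peter")  # rows may have different columns

profile = store.get_family("customer:1", "profile")
```

Reading one family returns the related columns together without touching the other families of the same row.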
Document Databases

•   Documents are the main concept
•   Stores and retrieves documents, which can be XML, JSON, BSON, ...
•   Documents are self-describing, hierarchical tree data structures which can consist
    of maps, collections and scalar values
•   Documents stored are similar to each other but do not have to be exactly the
    same
•   Aggregate-Oriented

Suitable Use Cases
•        Event Logging
•        Content Management Systems
•        Web Analytics or Real-Time Analytics
•        Product Catalog
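
Unlike a key-value store, a document database can look inside the value. The sketch below imitates that with plain Python dictionaries: the documents are similar but not identical, and a query can address a nested field. The helper names get_path and find, and the sample attributes, are assumptions and not the query API of any product.

```python
# Product catalog documents: similar to each other, but not identical in shape.
documents = [
    {"_id": 1, "name": "iPod Touch", "price": 220.95, "specs": {"capacity_gb": 32}},
    {"_id": 2, "name": "Monster Beat", "price": 190.00, "tags": ["audio"]},
    {"_id": 3, "name": "Apple Mouse", "price": 69.90},
]


def get_path(doc, path):
    """Follow a dotted path such as 'specs.capacity_gb'; return None if absent."""
    current = doc
    for part in path.split("."):
        if not isinstance(current, dict) or part not in current:
            return None
        current = current[part]
    return current


def find(docs, path, predicate):
    """Return the documents whose value at `path` exists and satisfies `predicate`."""
    matches = []
    for doc in docs:
        value = get_path(doc, path)
        if value is not None and predicate(value):
            matches.append(doc)
    return matches


cheap = find(documents, "price", lambda price: price < 200)
large = find(documents, "specs.capacity_gb", lambda gb: gb >= 32)
```

Documents without the queried field are simply skipped, which is how differing document shapes can coexist in one collection.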
Document Database - MongoDB




Graph Databases

•   Allow storing entities and the relationships between these entities
•   Entities are known as nodes, which have properties
•   Relations are known as edges, which also have properties
•   A query on the graph is also known as traversing the graph
•   Traversing the relationships is very fast

Suitable Use Cases
•        Connected Data
•        Routing, Dispatch and Location-Based Services
•        Recommendation Engines

[Figure: example graph with the nodes Customer, Order, Product, Address, Country and Tag, connected by relationships such as RATED, TAG, LINE_ITEM, BILLING_ADDRESS, DELIVERY_ADDRESS, ADDRESS and COUNTRY]
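
Nodes with properties, typed edges with properties and a traversal can be sketched with two dictionaries. The Graph class, the node ids and the recommend function below are assumptions for illustration; a real graph database would follow incoming edges directly instead of scanning all nodes.

```python
from collections import defaultdict


class Graph:
    """Tiny property graph: nodes and directed, typed edges carry properties."""

    def __init__(self):
        self.nodes = {}                 # node id -> properties
        self.edges = defaultdict(list)  # source id -> [(type, target id, properties)]

    def add_node(self, node_id, **props):
        self.nodes[node_id] = props

    def add_edge(self, source, rel_type, target, **props):
        self.edges[source].append((rel_type, target, props))

    def neighbours(self, source, rel_type):
        return [t for r, t, _ in self.edges.get(source, []) if r == rel_type]


g = Graph()
g.add_node("c1", label="Customer", name="Peter")
g.add_node("c2", label="Customer", name="Guido")
g.add_node("p1", label="Product", name="iPod Touch")
g.add_node("p2", label="Product", name="Monster Beat")
g.add_edge("c1", "RATED", "p1", stars=5)
g.add_edge("c2", "RATED", "p1", stars=4)
g.add_edge("c2", "RATED", "p2", stars=5)


def recommend(graph, customer):
    """Products rated by customers who rated at least one of the same products."""
    mine = set(graph.neighbours(customer, "RATED"))
    result = set()
    for other in graph.nodes:
        if other == customer:
            continue
        theirs = set(graph.neighbours(other, "RATED"))
        if mine & theirs:  # the other node shares a rated product
            result |= theirs - mine
    return sorted(result)


recommended = recommend(g, "c1")
```

The recommendation is a two-hop traversal (customer -> product -> other customer -> other product), the kind of query the slide lists under Recommendation Engines.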




Graph Database – Neo4J




Query through Cypher
 START    customer=node:Customer(email = "david@dmband.com")
 MATCH    customer-[:ORDERED]->order-[item:LINEITEM]->product
 WHERE    order.date > 20120101
 RETURN   product.name, sum(item.amount) AS products
 ORDER BY products DESC
 LIMIT    20
Agenda


1. Why NoSQL and what is it?
2. NoSQL Database Types
3. Polyglot Persistence
4. Data Virtualization Layer
5. Summary




Polyglot Persistence


In 2006, Neal Ford coined the term Polyglot Programming
•   Applications should be written in a mix of languages to take advantage
    of the fact that different languages are suitable for tackling different
    problems
Polyglot Persistence defines a hybrid approach to persistence
•   Using multiple data storage technologies
•   Selected based on the way data is being used by individual applications
      - Why store binary images in relational databases, when there are better
        storage systems?
•   Can occur both across the enterprise and within a single application



Polyglot Persistence

Today we use the same database for all kinds of data
•    Business transactions, session management data, reporting,
     logging information, content information, ...
Not all of this data needs the same availability, consistency or
backup guarantees
Polyglot data storage allows mixing and matching relational and
NoSQL data stores

[Figure: "Traditional" persistence model: an e-commerce application keeps shopping cart data, user sessions, completed orders, product catalog and recommendations in a single RDBMS. Polyglot persistence model: the same application spreads this data over Key-Value, RDBMS, Document and Graph stores]




Polyglot Persistence – Challenges

• Decisions
     •     Have to decide what data storage technology to use
     •     Today it's easier to go with relational
• New Data Access APIs
     •     Each data store has its own mechanisms for accessing the data
     •     Different APIs
     •     Solution: Wrap the data access code into services (Data/Entity
           Service) exposed to applications
     •     Will enforce a contract/schema on a schemaless database

[Figure: service-oriented polyglot persistence model: the e-commerce application reaches shopping cart data and user sessions (Key-Value), completed orders (RDBMS), product catalog (Document) and recommendations (Graph) through a Shopping Cart Service, User Session Service, Order Service, Product Catalog Service and Recommendation Service]
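
How a data service can put a contract in front of a schemaless store is sketched below. The service validates every item before it reaches the store, so consumers see a fixed shape even though the store itself would accept anything. The class name, the required fields and the dict used as backend are assumptions for this example.

```python
class ShoppingCartService:
    """Data service that hides a schemaless store behind a fixed contract."""

    # The contract: required fields and their types (illustrative choice).
    REQUIRED = {"product_id": int, "quantity": int}

    def __init__(self, store):
        self._store = store  # any dict-like key-value store

    def add_item(self, cart_id, item):
        for field_name, field_type in self.REQUIRED.items():
            if not isinstance(item.get(field_name), field_type):
                raise ValueError(f"{field_name} must be of type {field_type.__name__}")
        unexpected = set(item) - set(self.REQUIRED)
        if unexpected:
            raise ValueError(f"unexpected fields: {sorted(unexpected)}")
        self._store.setdefault(cart_id, []).append(dict(item))

    def get_cart(self, cart_id):
        return list(self._store.get(cart_id, []))


backend = {}  # stand-in for the key-value store behind the service
carts = ShoppingCartService(backend)
carts.add_item("cart:1", {"product_id": 1000, "quantity": 1})

try:
    # Wrong type and missing quantity: rejected by the service, not by the store.
    carts.add_item("cart:1", {"product_id": "1000"})
    rejected = False
except ValueError:
    rejected = True
```

Applications only see add_item and get_cart, so the store behind the service can be exchanged without touching them.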




Polyglot Persistence – Challenges


•    Immaturity
     • NoSQL tools are still young and full of the rough edges that new tools have
     • Not much experience, we don't know how to use them well
     • No patterns and best practices exist yet


•    Organizational Change
     • How will the different data groups in an enterprise react to this new
       technology?


•    Dealing with the eventual consistency paradigm
     • Reaction of different stakeholders to the fact that data could be stale
     • How to enforce rules to sync data across systems
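
What "data could be stale" means can be shown with a toy model of asynchronous replication: writes go to a primary copy, a replica catches up later, and a read from the replica in between returns the old value. The ReplicatedStore class and the stock example are assumptions; real systems differ widely in how and when they replicate.

```python
from collections import deque


class ReplicatedStore:
    """Toy model of a primary copy with one asynchronously updated replica."""

    def __init__(self):
        self.primary = {}
        self.replica = {}
        self._pending = deque()  # writes not yet applied to the replica

    def write(self, key, value):
        self.primary[key] = value
        self._pending.append((key, value))

    def read_replica(self, key):
        return self.replica.get(key)

    def replicate(self):
        """Apply all pending writes to the replica, in write order."""
        while self._pending:
            key, value = self._pending.popleft()
            self.replica[key] = value


store = ReplicatedStore()
store.write("stock:1000", 5)
store.replicate()

store.write("stock:1000", 4)             # an item was sold
stale = store.read_replica("stock:1000")  # replica has not caught up yet
store.replicate()
fresh = store.read_replica("stock:1000")  # replica has converged
```

Between the second write and the next replication run the replica still reports 5 items in stock, which is the situation stakeholders have to accept or design around.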

Agenda


1. Why NoSQL and what is it?
2. NoSQL Database Types
3. Polyglot Persistence
4. Data Virtualization Layer
5. Summary




Data Access Architecture for Polyglot Persistence


Well-known design patterns are still valid!
Some best practices we know from data access are still valid!

[Figure: three access stacks.
 1. Consumer -> REST/SOAP -> Service -> Repository/DAO -> O/R Mapping -> SQL -> RDBMS
 2. Consumer -> REST -> REST API of the NoSQL store
 3. Consumer -> REST/SOAP -> Service -> Repository/DAO -> ??? -> NoSQL]
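
That the Repository/DAO pattern carries over can be sketched as one repository interface with two interchangeable implementations, one on SQL and one on a key-value stand-in. All class names are assumptions for illustration; the "???" layer of the figure is represented here by plain JSON serialization.

```python
import json
import sqlite3
from abc import ABC, abstractmethod


class CustomerRepository(ABC):
    """What the service layer sees, regardless of the store behind it."""

    @abstractmethod
    def save(self, customer): ...

    @abstractmethod
    def find_by_id(self, customer_id): ...


class SqlCustomerRepository(CustomerRepository):
    def __init__(self):
        self._conn = sqlite3.connect(":memory:")
        self._conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT)")

    def save(self, customer):
        with self._conn:  # commit on success, roll back on error
            self._conn.execute(
                "INSERT OR REPLACE INTO customer (id, name) VALUES (?, ?)",
                (customer["id"], customer["name"]),
            )

    def find_by_id(self, customer_id):
        row = self._conn.execute(
            "SELECT id, name FROM customer WHERE id = ?", (customer_id,)
        ).fetchone()
        return None if row is None else {"id": row[0], "name": row[1]}


class KeyValueCustomerRepository(CustomerRepository):
    def __init__(self):
        self._store = {}  # stand-in for a key-value store

    def save(self, customer):
        self._store[f"customer:{customer['id']}"] = json.dumps(customer)

    def find_by_id(self, customer_id):
        raw = self._store.get(f"customer:{customer_id}")
        return None if raw is None else json.loads(raw)


class CustomerService:
    """Service layer: depends only on the repository interface."""

    def __init__(self, repository):
        self._repository = repository

    def register(self, customer_id, name):
        self._repository.save({"id": customer_id, "name": name})
        return self._repository.find_by_id(customer_id)


results = [
    CustomerService(repo).register(1, "Guido")
    for repo in (SqlCustomerRepository(), KeyValueCustomerRepository())
]
```

The service code is identical for both stores; only the repository implementation knows whether SQL or a key-value API sits underneath.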



Middle Tier Architecture for Polyglot Persistence

[Figure: layers from consumer to resource tier.
 Consumer: Composite Application
 Middle Tier, Integration: REST, SOAP
 Middle Tier, Service: Web Service Exporter
 Middle Tier, Application: Application Service Bean, Data Transfer Objects (DTO)
 Middle Tier, Domain: Domain Service Bean, Factory Bean, Domain Objects, Repository Bean, Aggregate, DAO Bean
 Middle Tier, Integration: O/R Mapping, NoSQL API, SQL API
 Resource Tier]




Polyglot Persistence with Spring Data


Makes it easier to build Spring-powered applications that use new data
access technologies
Provides improved support for relational database technologies
Commons project supports Polyglot Persistence
Currently supports:
•    JPA and JDBC (relational)
•    Apache Hadoop
•    GemFire
•    REST
•    Redis
•    MongoDB
•    Neo4J
•    HBase

[Figure: Consumer -> REST/SOAP -> Service -> Repository/DAO -> ??? -> NoSQL]




Spring Data – Mapping to Relational Database (using JPA)
Annotations define the mapping, for example: @Entity, @Id, @Column, @OneToOne, @OneToMany, @JoinColumn


Spring Data – Mapping to Relational Database
Repository interface:

public interface CustomerRepository extends Repository<Customer, Long> {

   // save(Customer) and other methods are omitted on the slide
   Customer findByEmailAddress(EmailAddress emailAddress);
}

JPA implementation, active in the "jpa" profile:

@Repository
@Profile("jpa")
class JpaCustomerRepository implements CustomerRepository {

   // Entity manager injected by the container
   @PersistenceContext
   private EntityManager em;

   @Override
   public Customer findByEmailAddress(EmailAddress emailAddress) {
      TypedQuery<Customer> query = em.createQuery(
            "select c from Customer c where c.emailAddress = :email", Customer.class);
      query.setParameter("email", emailAddress);
      return query.getSingleResult();
   }
}

Spring configuration:

<jpa:repositories base-package="com.oreilly.springdata.jpa" />

<bean class="org.springframework.orm.jpa.JpaTransactionManager">
   <property name="entityManagerFactory" ref="entityManagerFactory" />
</bean>

<bean id="entityManagerFactory" class="org.springframework.orm.jpa.LocalContainerEntityManagerFactoryBean">
   <property name="dataSource" ref="dataSource" />
   <property name="packagesToScan" value="com.oreilly.springdata.jpa" />
</bean>

Usage:

Customer guido = repository.findByEmailAddress(new EmailAddress("guido@hotmail.com"));

Customer anotherCust = new Customer("Peter", "Sample");
anotherCust.setEmailAddress(guido.getEmailAddress());

repository.save(anotherCust);

Spring Data – Mapping to MongoDB
Annotations define the mapping: @Document, @Id, @Indexed, @PersistenceConstructor, @CompoundIndex, @DBRef, @GeoSpatialIndexed, @Value


Spring Data – Generic Repositories for MongoDB
Repository interface (same as for JPA):

public interface CustomerRepository extends Repository<Customer, Long> {

   Customer findByEmailAddress(EmailAddress emailAddress);
}

MongoDB implementation, active in the "mongodb" profile:

@Repository
@Profile("mongodb")
class MongoDbCustomerRepository implements CustomerRepository {

   // Implemented by the mongoTemplate bean configured below
   @Autowired
   private MongoOperations operations;

   @Override
   public Customer findByEmailAddress(EmailAddress emailAddress) {
      // query(...) and where(...) are static imports from Query and Criteria
      Query query = query(where("emailAddress").is(emailAddress));
      return operations.findOne(query, Customer.class);
   }
}

Spring configuration:

<mongo:db-factory id="mongoDbFactory" dbname="e-store" />

<mongo:mapping-converter id="mongoConverter" base-package="com.oreilly.springdata.mongodb">
   <mongo:custom-converters base-package="com.oreilly.springdata.mongodb" />
</mongo:mapping-converter>

<bean id="mongoTemplate" class="org.springframework.data.mongodb.core.MongoTemplate">
   <constructor-arg ref="mongoDbFactory" />
   <constructor-arg ref="mongoConverter" />
   <property name="writeConcern" value="SAFE" />
</bean>

<mongo:repositories base-package="com.oreilly.springdata.mongodb" />

Usage:

Customer guido = repository.findByEmailAddress(new EmailAddress("guido@hotmail.com"));

Customer anotherCust = new Customer("Peter", "Sample");
anotherCust.setEmailAddress(guido.getEmailAddress());

repository.save(anotherCust);

Spring Data – Mapping to Neo4J
Annotations define the mapping, for example: @NodeEntity, @RelationshipEntity, @GraphId, @RelatedTo, @RelatedToVia, @EndNode, @Fetch

[Graph model diagram. Nodes: Customer, Address, Country, Order, Product, Tag. Relationships: ADDRESS, BILLING_ADDRESS, DELIVERY_ADDRESS, COUNTRY, LINE_ITEM, RATED, TAG]


Spring Data – Generic Repositories for Neo4J
Repository interface:

public interface CustomerRepository extends GraphRepository<Customer> {

   Customer findByEmailAddress(EmailAddress emailAddress);
}

Spring configuration:

<neo4j:config graphDatabaseService="graphDatabaseService" />
<neo4j:repositories base-package="com.oreilly.springdata.neo4j" />

<bean id="graphDatabaseService" class="org.neo4j.kernel.EmbeddedGraphDatabase" destroy-method="shutdown">
   <constructor-arg value="target/graph.db" />
</bean>

Usage:

Customer guido = repository.findByEmailAddress(new EmailAddress("guido@hotmail.com"));

Customer anotherCust = new Customer("Peter", "Sample");
anotherCust.setEmailAddress(guido.getEmailAddress());

repository.save(anotherCust);
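
The repository slides keep the client code identical while the store behind it changes. A minimal sketch of that idea in plain Java, without Spring Data, might look as follows. The simplified Customer class (e-mail as a plain String), the save method on the interface, and the InMemoryCustomerRepository and CustomerService classes are assumptions for illustration; they are not taken from the slides or from any Spring Data API.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Simplified stand-in for the Customer domain object (hypothetical)
class Customer {
    final String firstName;
    final String lastName;
    String emailAddress;

    Customer(String firstName, String lastName) {
        this.firstName = firstName;
        this.lastName = lastName;
    }
}

// The contract the service layer depends on; it says nothing about the store behind it
interface CustomerRepository {
    Customer findByEmailAddress(String emailAddress);
    Customer save(Customer customer);
}

// Hypothetical in-memory implementation standing in for a JPA, MongoDB or Neo4J variant
class InMemoryCustomerRepository implements CustomerRepository {
    private final Map<String, Customer> byEmail = new LinkedHashMap<>();

    @Override
    public Customer findByEmailAddress(String emailAddress) {
        return byEmail.get(emailAddress); // null if no customer has this address
    }

    @Override
    public Customer save(Customer customer) {
        byEmail.put(customer.emailAddress, customer);
        return customer;
    }
}

// Client code only sees the interface, so the implementation can be swapped
class CustomerService {
    private final CustomerRepository repository;

    CustomerService(CustomerRepository repository) {
        this.repository = repository;
    }

    Customer register(String firstName, String lastName, String email) {
        Customer customer = new Customer(firstName, lastName);
        customer.emailAddress = email;
        return repository.save(customer);
    }

    Customer lookup(String email) {
        return repository.findByEmailAddress(email);
    }
}
```

Switching the database then means passing a different CustomerRepository implementation to CustomerService, which is the role the Spring profiles ("jpa", "mongodb") play on the slides.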




Expose contract-first Web service
Use any Java Web service framework that supports the contract-first approach
•   Can be SOAP or REST

The Web service
•   Maps the data contract to the schemaless database
•   Uses the different Repository implementations
•   Must handle data migration issues together with the Repository
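
As a rough sketch of the mapping step: a fixed data contract class on the service side is translated to and from a schemaless document, modelled here as a plain Map. The names CustomerContract and CustomerContractMapper and the document field names are assumptions for illustration; in a real contract-first service the contract class would normally be generated from the service contract.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Fixed data contract exposed by the service (hypothetical)
final class CustomerContract {
    final String firstName;
    final String lastName;
    final String postalCode;

    CustomerContract(String firstName, String lastName, String postalCode) {
        this.firstName = firstName;
        this.lastName = lastName;
        this.postalCode = postalCode;
    }
}

// Translates between the contract and a schemaless document (a nested Map stands in for it)
final class CustomerContractMapper {

    static Map<String, Object> toDocument(CustomerContract contract) {
        Map<String, Object> billingAddress = new LinkedHashMap<>();
        billingAddress.put("postalCode", contract.postalCode);

        Map<String, Object> document = new LinkedHashMap<>();
        document.put("firstName", contract.firstName);
        document.put("lastName", contract.lastName);
        document.put("billingAddress", billingAddress);
        return document;
    }

    static CustomerContract fromDocument(Map<String, Object> document) {
        // The database does not enforce a schema, so missing parts must be handled here
        Object address = document.get("billingAddress");
        String postalCode = null;
        if (address instanceof Map) {
            Object value = ((Map<?, ?>) address).get("postalCode");
            postalCode = value == null ? null : value.toString();
        }
        return new CustomerContract(
                (String) document.get("firstName"),
                (String) document.get("lastName"),
                postalCode);
    }
}
```

The contract stays stable for consumers, while the mapper is the one place that has to know how documents are actually shaped in the store.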




Schemaless – We still have to migrate the data!


With an RDBMS we are used to keeping DDL scripts together with DML scripts for every single data model change
     •   They have to be in sync with the data access code

The RDBMS has to be changed before the application is changed => possible application downtime
     •   This is what the schemaless approach of most NoSQL DBs tries to avoid

Schemaless DBs still need careful migration, due to the implicit schema in any data access code

But a more "on-demand" approach is possible
     •   Code can read data in a way that is tolerant to changes in the data's implicit schema and migrate the data on the next update
     •   Similar to service versioning => gradual change

Example: Customer document across versions

Version 1.0
   Customer
      Name: Peter Sample
      Billing Address
         Street: Somestreet 10
         City: Somewhere
         Postal Code: 55901

Transition Version 1.0 => 2.0
   Customer
      Name: Peter Sample
      First Name: Peter
      Last Name: Sample
      Billing Address
         Street: Somestreet 10
         City: Somewhere
         Postal Code: 55901

Version 2.0
   Customer
      First Name: Peter
      Last Name: Sample
      Billing Address
         Street: Somestreet 10
         City: Somewhere
         Postal Code: 55901
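
The on-demand approach can be sketched in plain Java, with a Map standing in for a schemaless customer document. The field names (name, firstName, lastName, schemaVersion), the class name CustomerDocuments, and the rule of splitting the name at the first space are assumptions for illustration, loosely following the Version 1.0 and Version 2.0 customer documents on this slide.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class CustomerDocuments {

    // Tolerant read: accepts both the old shape (single "name") and the new shape
    static String[] readFirstAndLastName(Map<String, Object> document) {
        if (document.containsKey("firstName") || document.containsKey("lastName")) {
            return new String[] {
                (String) document.get("firstName"),
                (String) document.get("lastName")
            };
        }
        // Old shape: split "Peter Sample" at the first space (assumed rule)
        Object raw = document.get("name");
        String name = raw == null ? "" : raw.toString();
        int space = name.indexOf(' ');
        if (space < 0) {
            return new String[] { name, "" };
        }
        return new String[] { name.substring(0, space), name.substring(space + 1) };
    }

    // Migrate on update: whenever a document is written, it is written in the new shape
    static Map<String, Object> prepareForUpdate(Map<String, Object> document) {
        String[] names = readFirstAndLastName(document);
        Map<String, Object> migrated = new LinkedHashMap<>(document);
        migrated.remove("name");
        migrated.put("firstName", names[0]);
        migrated.put("lastName", names[1]);
        migrated.put("schemaVersion", "2.0");
        return migrated;
    }
}
```

Documents that are only read keep their old shape; they are converted the next time they are written, so no bulk migration is needed before the new application version goes live.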


Agenda


1. What is NoSQL and Big Data
2. NoSQL Database Types
3. Polyglot Persistence
4. Data Virtualization Layer and Data Architecture
5. Summary




Pros & Cons of NoSQL compared to RDBMS


Pros
• No O/R impedance mismatch
• Can easily evolve schemas
• Can represent semi-structured info
• Can represent graphs/networks (with performance)

Cons
• Lacks tool and framework support
• Few other implementations => potential lock-in
• No support for ad-hoc queries
• Another/a new database in production to take care of




Summary


• Relational databases are here to stay, but NoSQL offers new persistence models
• Polyglot Persistence will be the future
• Schemaless does not mean there is no data migration! => but a more on-demand model might be possible
• Encapsulate data access code to be able to switch databases
• Service-orientation provides the data contract to a NoSQL database => to make information reusable
• Don't commit to a NoSQL database until you have done a significant PoC
• Make sure that Operations people (DBAs) are on board early enough
• Non-relational is not new in an enterprise (OLTP vs. OLAP)

Possible Use Cases

• NoSQL for parallel ETL?
• NoSQL for modern BI
• NoSQL for a stateful middle tier (e.g. shopping cart)
• NoSQL for aggregated master data (e.g. through REST for Web apps)
• NoSQL for a CMS store, directly accessible through REST
• NoSQL as a local store for mobile applications
• NoSQL for Event Sourcing and CQRS architectures




Further Information




THANK YOU.
                                                                                 Trivadis

                                                                                 Guido Schmutz
                                                                                 guido.schmutz@trivadis.com

                                                                                 info@trivadis.com
                                                                                 www.trivadis.com





ksqlDB - Stream Processing simplified!Guido Schmutz
 
Kafka as your Data Lake - is it Feasible?
Kafka as your Data Lake - is it Feasible?Kafka as your Data Lake - is it Feasible?
Kafka as your Data Lake - is it Feasible?Guido Schmutz
 
Event Hub (i.e. Kafka) in Modern Data Architecture
Event Hub (i.e. Kafka) in Modern Data ArchitectureEvent Hub (i.e. Kafka) in Modern Data Architecture
Event Hub (i.e. Kafka) in Modern Data ArchitectureGuido Schmutz
 
Solutions for bi-directional integration between Oracle RDBMS & Apache Kafka
Solutions for bi-directional integration between Oracle RDBMS & Apache KafkaSolutions for bi-directional integration between Oracle RDBMS & Apache Kafka
Solutions for bi-directional integration between Oracle RDBMS & Apache KafkaGuido Schmutz
 
Event Hub (i.e. Kafka) in Modern Data (Analytics) Architecture
Event Hub (i.e. Kafka) in Modern Data (Analytics) ArchitectureEvent Hub (i.e. Kafka) in Modern Data (Analytics) Architecture
Event Hub (i.e. Kafka) in Modern Data (Analytics) ArchitectureGuido Schmutz
 
Building Event Driven (Micro)services with Apache Kafka
Building Event Driven (Micro)services with Apache KafkaBuilding Event Driven (Micro)services with Apache Kafka
Building Event Driven (Micro)services with Apache KafkaGuido Schmutz
 
Location Analytics - Real-Time Geofencing using Apache Kafka
Location Analytics - Real-Time Geofencing using Apache KafkaLocation Analytics - Real-Time Geofencing using Apache Kafka
Location Analytics - Real-Time Geofencing using Apache KafkaGuido Schmutz
 
Solutions for bi-directional integration between Oracle RDBMS and Apache Kafka
Solutions for bi-directional integration between Oracle RDBMS and Apache KafkaSolutions for bi-directional integration between Oracle RDBMS and Apache Kafka
Solutions for bi-directional integration between Oracle RDBMS and Apache KafkaGuido Schmutz
 
What is Apache Kafka? Why is it so popular? Should I use it?
What is Apache Kafka? Why is it so popular? Should I use it?What is Apache Kafka? Why is it so popular? Should I use it?
What is Apache Kafka? Why is it so popular? Should I use it?Guido Schmutz
 
Solutions for bi-directional integration between Oracle RDBMS & Apache Kafka
Solutions for bi-directional integration between Oracle RDBMS & Apache KafkaSolutions for bi-directional integration between Oracle RDBMS & Apache Kafka
Solutions for bi-directional integration between Oracle RDBMS & Apache KafkaGuido Schmutz
 
Location Analytics Real-Time Geofencing using Kafka
Location Analytics Real-Time Geofencing using KafkaLocation Analytics Real-Time Geofencing using Kafka
Location Analytics Real-Time Geofencing using KafkaGuido Schmutz
 
Streaming Visualisation
Streaming VisualisationStreaming Visualisation
Streaming VisualisationGuido Schmutz
 
Kafka as an event store - is it good enough?
Kafka as an event store - is it good enough?Kafka as an event store - is it good enough?
Kafka as an event store - is it good enough?Guido Schmutz
 
Solutions for bi-directional Integration between Oracle RDMBS & Apache Kafka
Solutions for bi-directional Integration between Oracle RDMBS & Apache KafkaSolutions for bi-directional Integration between Oracle RDMBS & Apache Kafka
Solutions for bi-directional Integration between Oracle RDMBS & Apache KafkaGuido Schmutz
 
Fundamentals Big Data and AI Architecture
Fundamentals Big Data and AI ArchitectureFundamentals Big Data and AI Architecture
Fundamentals Big Data and AI ArchitectureGuido Schmutz
 
Location Analytics - Real-Time Geofencing using Kafka
Location Analytics - Real-Time Geofencing using Kafka Location Analytics - Real-Time Geofencing using Kafka
Location Analytics - Real-Time Geofencing using Kafka Guido Schmutz
 
Streaming Visualization
Streaming VisualizationStreaming Visualization
Streaming VisualizationGuido Schmutz
 

More from Guido Schmutz (20)

30 Minutes to the Analytics Platform with Infrastructure as Code
30 Minutes to the Analytics Platform with Infrastructure as Code30 Minutes to the Analytics Platform with Infrastructure as Code
30 Minutes to the Analytics Platform with Infrastructure as Code
 
Event Broker (Kafka) in a Modern Data Architecture
Event Broker (Kafka) in a Modern Data ArchitectureEvent Broker (Kafka) in a Modern Data Architecture
Event Broker (Kafka) in a Modern Data Architecture
 
Big Data, Data Lake, Fast Data - Dataserialiation-Formats
Big Data, Data Lake, Fast Data - Dataserialiation-FormatsBig Data, Data Lake, Fast Data - Dataserialiation-Formats
Big Data, Data Lake, Fast Data - Dataserialiation-Formats
 
ksqlDB - Stream Processing simplified!
ksqlDB - Stream Processing simplified!ksqlDB - Stream Processing simplified!
ksqlDB - Stream Processing simplified!
 
Kafka as your Data Lake - is it Feasible?
Kafka as your Data Lake - is it Feasible?Kafka as your Data Lake - is it Feasible?
Kafka as your Data Lake - is it Feasible?
 
Event Hub (i.e. Kafka) in Modern Data Architecture
Event Hub (i.e. Kafka) in Modern Data ArchitectureEvent Hub (i.e. Kafka) in Modern Data Architecture
Event Hub (i.e. Kafka) in Modern Data Architecture
 
Solutions for bi-directional integration between Oracle RDBMS & Apache Kafka
Solutions for bi-directional integration between Oracle RDBMS & Apache KafkaSolutions for bi-directional integration between Oracle RDBMS & Apache Kafka
Solutions for bi-directional integration between Oracle RDBMS & Apache Kafka
 
Event Hub (i.e. Kafka) in Modern Data (Analytics) Architecture
Event Hub (i.e. Kafka) in Modern Data (Analytics) ArchitectureEvent Hub (i.e. Kafka) in Modern Data (Analytics) Architecture
Event Hub (i.e. Kafka) in Modern Data (Analytics) Architecture
 
Building Event Driven (Micro)services with Apache Kafka
Building Event Driven (Micro)services with Apache KafkaBuilding Event Driven (Micro)services with Apache Kafka
Building Event Driven (Micro)services with Apache Kafka
 
Location Analytics - Real-Time Geofencing using Apache Kafka
Location Analytics - Real-Time Geofencing using Apache KafkaLocation Analytics - Real-Time Geofencing using Apache Kafka
Location Analytics - Real-Time Geofencing using Apache Kafka
 
Solutions for bi-directional integration between Oracle RDBMS and Apache Kafka
Solutions for bi-directional integration between Oracle RDBMS and Apache KafkaSolutions for bi-directional integration between Oracle RDBMS and Apache Kafka
Solutions for bi-directional integration between Oracle RDBMS and Apache Kafka
 
What is Apache Kafka? Why is it so popular? Should I use it?
What is Apache Kafka? Why is it so popular? Should I use it?What is Apache Kafka? Why is it so popular? Should I use it?
What is Apache Kafka? Why is it so popular? Should I use it?
 
Solutions for bi-directional integration between Oracle RDBMS & Apache Kafka
Solutions for bi-directional integration between Oracle RDBMS & Apache KafkaSolutions for bi-directional integration between Oracle RDBMS & Apache Kafka
Solutions for bi-directional integration between Oracle RDBMS & Apache Kafka
 
Location Analytics Real-Time Geofencing using Kafka
Location Analytics Real-Time Geofencing using KafkaLocation Analytics Real-Time Geofencing using Kafka
Location Analytics Real-Time Geofencing using Kafka
 
Streaming Visualisation
Streaming VisualisationStreaming Visualisation
Streaming Visualisation
 
Kafka as an event store - is it good enough?
Kafka as an event store - is it good enough?Kafka as an event store - is it good enough?
Kafka as an event store - is it good enough?
 
Solutions for bi-directional Integration between Oracle RDMBS & Apache Kafka
Solutions for bi-directional Integration between Oracle RDMBS & Apache KafkaSolutions for bi-directional Integration between Oracle RDMBS & Apache Kafka
Solutions for bi-directional Integration between Oracle RDMBS & Apache Kafka
 
Fundamentals Big Data and AI Architecture
Fundamentals Big Data and AI ArchitectureFundamentals Big Data and AI Architecture
Fundamentals Big Data and AI Architecture
 
Location Analytics - Real-Time Geofencing using Kafka
Location Analytics - Real-Time Geofencing using Kafka Location Analytics - Real-Time Geofencing using Kafka
Location Analytics - Real-Time Geofencing using Kafka
 
Streaming Visualization
Streaming VisualizationStreaming Visualization
Streaming Visualization
 

Recently uploaded

Artificial intelligence in the post-deep learning era
Artificial intelligence in the post-deep learning eraArtificial intelligence in the post-deep learning era
Artificial intelligence in the post-deep learning eraDeakin University
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfAlex Barbosa Coqueiro
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Patryk Bandurski
 
My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024The Digital Insurer
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfAddepto
 
Pigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024Scott Keck-Warren
 
New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024BookNet Canada
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsMiki Katsuragi
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr BaganFwdays
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebUiPathCommunity
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):comworks
 
Science&tech:THE INFORMATION AGE STS.pdf
Science&tech:THE INFORMATION AGE STS.pdfScience&tech:THE INFORMATION AGE STS.pdf
Science&tech:THE INFORMATION AGE STS.pdfjimielynbastida
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 3652toLead Limited
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsMemoori
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubKalema Edgar
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationSlibray Presentation
 
APIForce Zurich 5 April Automation LPDG
APIForce Zurich 5 April  Automation LPDGAPIForce Zurich 5 April  Automation LPDG
APIForce Zurich 5 April Automation LPDGMarianaLemus7
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupFlorian Wilhelm
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brandgvaughan
 

Recently uploaded (20)

Artificial intelligence in the post-deep learning era
Artificial intelligence in the post-deep learning eraArtificial intelligence in the post-deep learning era
Artificial intelligence in the post-deep learning era
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
 
My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
Pigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food Manufacturing
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024
 
New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering Tips
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):
 
Science&tech:THE INFORMATION AGE STS.pdf
Science&tech:THE INFORMATION AGE STS.pdfScience&tech:THE INFORMATION AGE STS.pdf
Science&tech:THE INFORMATION AGE STS.pdf
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial Buildings
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck Presentation
 
APIForce Zurich 5 April Automation LPDG
APIForce Zurich 5 April  Automation LPDGAPIForce Zurich 5 April  Automation LPDG
APIForce Zurich 5 April Automation LPDG
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project Setup
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 

NoSQL for Data Services, Data Virtualization & Big Data

  • 1. Welcome NoSQL for Data Services, Data Virtualization & Big Data Guido Schmutz 25.9.2012 BASEL BERN LAUSANNE ZÜRICH DÜSSELDORF FRANKFURT A.M. FREIBURG I.BR. HAMBURG MÜNCHEN STUTTGART WIEN 2012 © Trivadis 1 NoSQL for Data Services, Data Virtualization & Big Data 25.9.2012
  • 2. Guido Schmutz
      • Working for Trivadis for more than 15 years
      • Oracle ACE Director for Fusion Middleware and SOA
      • Co-author of several books
      • Consultant, trainer and software architect for Java, Oracle, SOA and EDA
      • Member of the Trivadis Architecture Board
      • Technology Manager @ Trivadis
      • More than 20 years of software development experience
      • Contact: guido.schmutz@trivadis.com
      • Blog: http://guidoschmutz.wordpress.com
      • Twitter: gschmutz
  • 3. Agenda
      1. Why NoSQL and what is it?
      2. NoSQL Database Types
      3. Polyglot Persistence
      4. Data Virtualization Layer
      5. Summary
  • 4. History of Databases
      1960s         File-based, network (CODASYL) and hierarchical databases
      1970s         Relational databases
      1980s         SQL became the standard query language
      Early 1990s   Object databases
      Late 1990s    XML databases
      2004          NoSQL databases
  • 5. What's wrong with Relational Databases? They are great ...
      • SQL provides a rich, declarative query language
      • Databases enforce referential integrity
      • ACID semantics
      • Well understood by developers and database administrators
      • Well supported by different languages, frameworks and tools
          - Hibernate, JPA, JDBC, iBATIS, Entity Framework
      • Well understood and accepted by operations people (DBAs)
          - Configuration
          - Monitoring
          - Backup and Recovery
          - Tuning
          - Design
  • 6. Relational Databases are great ... But!
      Problem: Complex object graphs
      • Object/relational impedance mismatch
      • Complicated to map a rich domain model to a relational schema
      • Performance issues
          - Many rows in many tables
          - Many joins
          - Eager vs. lazy loading
      Problem: Schema evolution
      • Adding attributes to an object means adding columns to a table
      • Expensive if the table holds a lot of data
          - Locks are held on the tables for a long time
          - Application downtime ...
      Diagram: an Order object mapped to the tables ORDER, CUSTOMER, ADDRESS and ORDER_LINES
          Order ID: 1001, Order Date: 15.9.2012
          Customer: First Name Peter, Last Name Sample
          Billing Address: Street Somestreet 10, City Somewhere, Postal Code 55901
          Line Items:
              Name            Quantity   Price
              Ipod Touch      1          220.95
              Monster Beat    2          190.00
              Apple Mouse     1          69.90
  • 7. Relational Databases are great ... But!
      Diagram: the Order object (ID 1001, Order Date 15.9.2012, Customer Peter Sample, Billing Address Somestreet 10, Somewhere, 55901) with its line items, mapped to the tables ORDER, CUSTOMER, ADDRESS and ORDER_LINES
          Name            Quantity   Price
          Ipod Touch      1          220.95
          Monster Beat    2          190.00
          Apple Mouse     1          69.90
      Access layers shown: Consumer, REST/SOAP, Service, Repository/DAO, O/R Mapping, SQL, RDBMS
  • 8. Relational Databases are great ... But!
      Problem: Semi-structured data
      • A relational schema doesn't easily handle semi-structured data
      • Common solutions
          - Name/value table: poor performance, lack of constraints
          - Serialize as BLOB: fewer joins, but no query capabilities
      Problem: Scaling
      • Scaling writes is difficult, expensive or impossible => Big Data
      • Vertical scaling is limited and expensive
      • Horizontal scaling is limited and expensive
  • 9. Solution: NoSQL?
      No standard definition of what NoSQL means
      • "Not Only SQL"
      • The term originated at a workshop organized in 2009
      But NoSQL databases share some common characteristics
      • They don't use the relational data model and thus don't use SQL
      • Tend to be designed to run on a cluster
      • Tend to be open source
      • Schema-less: no fixed schema, so any data can be stored in any record
      • Different APIs
      Diagram: RDBMS vs. NoSQL stacks across Presentation Tier, Middle Tier and Database Tier (labels: User Interface, Services, Caching, Search, Object-Relational, Relational-Object, Lucene, Transactions, Batch, MapReduce, Key Value Stores, Blobs, Data, Triggers)
  • 10. Central vs. Application Databases
      Central Database
      • Uses SQL as the integration mechanism between applications
      • Applications store data in a common DB
      • Improves communication: all applications operate on a consistent set of data
      • Structure ends up being more complex
      • Changes need to be coordinated with all other applications using the database
      • Side effects (e.g. adding a database index)
      Application Database
      • Only accessed by a single application
      • Only the application using the database needs to know about the structure
      • Easier to maintain and evolve the schema
      • More freedom to choose the database
      • Applicable to SOA (i.e. Data Service/Entity Service) with good service autonomy
      • Ready for the cloud
      Diagram: Applications 1 to 3 sharing one DB vs. Applications 1 to 3 each with their own DB
  • 11. Relational vs. Aggregate Data Models
      Relational
      • The relational model takes the information and divides it into tuples (rows)
      • A tuple is a limited data structure
          - No nesting of tuples
          - No list of values
      Aggregate
      • Aggregate is a term that comes from Domain-Driven Design (Evans)
      • An aggregate is a collection of related objects that should be treated as a unit
      • Unit for data manipulation and management of consistency
  • 12. Relational vs. Aggregate Data Model
      Relational Instance
          CUSTOMER
              ID   NAME
              1    Guido
          PRODUCT
              ID     NAME
              1000   IPod Touch
              1020   Monster Beat
          BILLING_ADDRESS
              ID   CUSTOMER_ID   ADDRESS_ID
              1    1             55
          ADDRESS
              ID   STREET        CITY      POST_CODE
              55   Chaumontweg   Spiegel   3095
          ORDER
              ID   CUSTOMER_ID   SHIPPING_ADDRESS_ID
              90   1             55
          ORDER_ITEM
              ID   ORDER_ID   PRODUCT_ID   PRICE
              1    90         1000         250.55
              1    90         1020         199.55
      Aggregate Instance
          {
            "id": 1,
            "name": "Guido",
            "billingAddress": [{"street": "Chaumontweg", "city": "Spiegel", "postCode": "3095"}]
          }
          {
            "id": 90,
            "customerId": 1,
            "orderItems": [
              {"productId": 1000, "price": 250.55, "productName": "iPod Touch"},
              {"productId": 1020, "price": 199.55, "productName": "Monster Beat"}
            ],
            "shippingAddress": [{"street": "Chaumontweg", "city": "Spiegel", "postCode": "3095"}]
          }
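The difference between the two instances on slide 12 can be sketched in a few lines of Python. This is only an illustration: the in-memory dictionaries stand in for the relational tables, and the function name `load_order_aggregate` is an assumption, not something from the talk. It shows how several rows have to be joined to produce the single nested order document that an aggregate-oriented store would keep as one unit.

```python
# Hypothetical in-memory "tables" mirroring the relational instance on the slide.
addresses = {55: {"street": "Chaumontweg", "city": "Spiegel", "postCode": "3095"}}
products = {1000: "iPod Touch", 1020: "Monster Beat"}
orders = {90: {"customerId": 1, "shippingAddressId": 55}}
order_items = [
    {"orderId": 90, "productId": 1000, "price": 250.55},
    {"orderId": 90, "productId": 1020, "price": 199.55},
]


def load_order_aggregate(order_id):
    """Join the rows that belong to one order into a single nested document."""
    order = orders[order_id]
    items = [
        {
            "productId": row["productId"],
            "price": row["price"],
            "productName": products[row["productId"]],
        }
        for row in order_items
        if row["orderId"] == order_id
    ]
    return {
        "id": order_id,
        "customerId": order["customerId"],
        "orderItems": items,
        "shippingAddress": [addresses[order["shippingAddressId"]]],
    }


aggregate = load_order_aggregate(90)
```

In the relational form every read pays for these lookups; in the aggregate form the document is stored and retrieved in one piece, at the cost of duplicating the product name and address inside it.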
  • 13. Agenda
      1. Why NoSQL and what is it?
      2. NoSQL Database Types
      3. Polyglot Persistence
      4. Data Virtualization Layer
      5. Summary
  • 14. NoSQL Database Types
      • Key/Value Stores
      • Ordered Key-Value Stores
      • Big Table Stores (map-of-maps-of-maps)
      • Document Stores
      • Graph Databases
      Key/Value
          Design: collections of key/value pairs
          Scalability/Performance: +++
          Aggregate-oriented: Yes
          Complexity: +
          Inspiration and relation: Berkeley DB, Memcached, distributed hashmaps
          NoSQL products: Voldemort, Redis, Riak
      Column Family
          Design: columns and column families; accesses the column values directly
          Scalability/Performance: +++
          Aggregate-oriented: Yes
          Complexity: ++
          Inspiration and relation: SAP Sybase IQ, BigTable
          NoSQL products: HBase, Cassandra, Hypertable, Amazon SimpleDB
      Document
          Design: key/value pairs, but the value is interpreted by the database
          Scalability/Performance: ++
          Aggregate-oriented: Yes
          Complexity: ++
          Inspiration and relation: Lotus Notes
          NoSQL products: CouchDB, MongoDB, OrientDB, RavenDB
      Graph
          Design: focus on the connections between data and fast navigation
          Scalability/Performance: ++
          Aggregate-oriented: No
          Complexity: +++
          Inspiration and relation: graph theory
          NoSQL products: Sones, Neo4J, InfoGrid, FlockDB
  • 15. NoSQL Database Types
      Diagram: Size plotted against Complexity, positioning Key-value stores, Column Family, Document, Graph and Relational databases
  • 16. Key Value Databases
      • A key-value store is a simple hash table
      • Primarily used when all access to the database is via primary key
      • Simplest NoSQL data stores to use (from an API perspective)
          - PUT, GET, DELETE (matches REST)
      • Value is a blob, with the data store not caring or knowing what is inside
      • Aggregate-oriented
      Suitable Use Cases
      • Storing Session Information
      • User Profiles, Preferences
      • Shopping Cart Data
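The PUT/GET/DELETE interface from slide 16 can be sketched as a toy in-memory store. The class `KeyValueStore` and the key `cart:42` are made up for illustration and do not correspond to any particular product's API; the point is that the store only ever sees opaque bytes, so serialization is the application's job.

```python
import json


class KeyValueStore:
    """Toy in-memory key-value store; values are opaque bytes to the store."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        if not isinstance(value, bytes):
            raise TypeError("value must be bytes")
        self._data[key] = value

    def get(self, key):
        # Returns None when the key is unknown.
        return self._data.get(key)

    def delete(self, key):
        self._data.pop(key, None)


store = KeyValueStore()
cart = {"items": [{"productId": 1000, "quantity": 1}]}

# The application serializes the aggregate; the store never looks inside.
store.put("cart:42", json.dumps(cart).encode("utf-8"))
loaded = json.loads(store.get("cart:42").decode("utf-8"))
store.delete("cart:42")
```

Because the value is a blob, the only possible lookup is by key, which is why shopping carts and sessions (always fetched by their ID) fit this model well.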
  • 17. Column-Family Stores
      • Store data in column families as rows that have many columns associated with a row key
      • Column families are groups of related data, often accessed together
      • Aggregate-oriented
      Suitable Use Cases
      • Event Logging
      • Content Management Systems
      • Counters
      • Expiring Usage
      Source: NoSQL Distilled
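One way to picture the row key / column family / column structure from slide 17 is as a map of maps of maps. The sketch below uses plain Python dictionaries; the helper names `put` and `get_family` and the sample families `profile` and `counters` are assumptions chosen for illustration, not the API of HBase or Cassandra.

```python
from collections import defaultdict

# row key -> column family -> column name -> value
table = defaultdict(lambda: defaultdict(dict))


def put(row_key, family, column, value):
    """Write a single column value under a row key and column family."""
    table[row_key][family][column] = value


def get_family(row_key, family):
    """Read all columns of one family for a row, as they are usually accessed together."""
    return dict(table[row_key][family])


put("customer:1", "profile", "name", "Guido")
put("customer:1", "profile", "city", "Spiegel")
put("customer:1", "counters", "logins", 3)

profile = get_family("customer:1", "profile")
```

Rows do not need the same columns, and reading one family leaves the others untouched.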
  • 18. Document Databases
      • Documents are the main concept
      • Stores and retrieves documents, which can be XML, JSON, BSON, ...
      • Documents are self-describing, hierarchical tree data structures which can consist of maps, collections and scalar values
      • Documents stored are similar to each other but do not have to be exactly the same
      • Aggregate-oriented
      Suitable Use Cases
      • Event Logging
      • Content Management Systems
      • Web Analytics or Real-Time Analytics
      • Product Catalog
  • 19. Document Database - MongoDB
  • 20. Graph Databases
      • Allow storing entities and the relationships between these entities
      • Entities are known as nodes, which have properties
      • Relations are known as edges, which also have properties
      • A query on the graph is also known as traversing the graph
      • Traversing the relationships is very fast
      Suitable Use Cases
      • Connected Data
      • Routing, Dispatch and Location-Based Services
      • Recommendation Engines
      Diagram: nodes Tag, Customer, Country, Product, Address and Order, connected by the relationships RATED, TAG, ADDRESS, COUNTRY, BILLING_ADDRESS, DELIVERY_ADDRESS and LINE_ITEM
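The node/edge model from slide 20 and the idea of a query as a traversal can be sketched with plain Python structures. Everything here is illustrative: the node IDs, the `amount` properties and the `traverse` helper are assumptions, and a real graph database would use indexed adjacency rather than scanning an edge list.

```python
# Nodes carry properties.
nodes = {
    "customer:1": {"label": "Customer", "name": "Guido"},
    "order:90": {"label": "Order"},
    "product:1000": {"label": "Product", "name": "iPod Touch"},
    "product:1020": {"label": "Product", "name": "Monster Beat"},
}

# Edges carry a relationship type and properties: (source, type, target, properties).
edges = [
    ("customer:1", "ORDERED", "order:90", {}),
    ("order:90", "LINE_ITEM", "product:1000", {"amount": 1}),
    ("order:90", "LINE_ITEM", "product:1020", {"amount": 2}),
]


def traverse(start, rel_types):
    """Follow the given relationship types in order, one hop per type."""
    frontier = [start]
    for rel in rel_types:
        frontier = [
            target
            for source, rel_type, target, _props in edges
            if rel_type == rel and source in frontier
        ]
    return frontier


reached = traverse("customer:1", ["ORDERED", "LINE_ITEM"])
product_names = [nodes[node_id]["name"] for node_id in reached]
```

The traversal starts at one node and only touches what is connected to it, which is the property that makes relationship-heavy queries cheap in this model.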
  • 21. Graph Database – Neo4J
      Query through Cypher
          START    customer=node:Customer(email = "david@dmband.com")
          MATCH    customer-[:ORDERED]->order-[item:LINEITEM]->product
          WHERE    order.date > 20120101
          RETURN   product.name, sum(item.amount) AS products
          ORDER BY products DESC
          LIMIT    20
  • 22. Agenda
      1. Why NoSQL and what is it?
      2. NoSQL Database Types
      3. Polyglot Persistence
      4. Data Virtualization Layer
      5. Summary
  • 23. Polyglot Persistence
      In 2006, Neal Ford coined the term Polyglot Programming
      • Applications should be written in a mix of languages to take advantage of the fact that different languages are suitable for tackling different problems
      Polyglot Persistence defines a hybrid approach to persistence
      • Using multiple data storage technologies
      • Selected based on the way data is being used by individual applications
      • Why store binary images in relational databases when there are better storage systems?
      • Can occur across the enterprise as well as within a single application
  • 24. Polyglot Persistence
      Today we use the same database for all kinds of data
      • Business transactions, session management data, reporting, logging information, content information, ...
      Not all of this data needs the same availability, consistency or backup properties
      Polyglot data storage allows mixing and matching relational and NoSQL data stores
      Diagram, "Traditional" Persistence Model: an e-commerce application stores shopping cart data, user sessions, completed orders, product catalog and recommendations in a single RDBMS
      Diagram, Polyglot Persistence Model: the same e-commerce application spreads this data over Key-Value, RDBMS, Document and Graph stores
  • 25. Polyglot Persistence – Challenges
      • Decisions
          - Have to decide which data storage technology to use
          - Today it's easier to go with relational
      • New data access APIs
          - Each data store has its own mechanisms for accessing the data
          - Different APIs
      • Solution: wrap the data access code into services (Data/Entity Service) exposed to applications
          - Will enforce a contract/schema on a schemaless database
      Diagram, Service-Oriented Polyglot Persistence Model: the e-commerce application reaches shopping cart data, user sessions, completed orders, product catalog and recommendations through a Shopping Cart Service, User Session Service, Order Service, Product Catalog Service and Recommendation Service, backed by Key-Value, Graph, RDBMS and Document stores
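The proposed solution on slide 25, a data service that enforces a contract on top of a schemaless store, might look roughly like the following. This is a minimal sketch under stated assumptions: `ShoppingCartService`, its `REQUIRED_FIELDS` contract and the dictionary standing in for the store are all hypothetical names chosen for illustration.

```python
class ShoppingCartService:
    """Hypothetical data service: callers see a fixed contract, not the store."""

    # The contract the service enforces, since the store itself has no schema.
    REQUIRED_FIELDS = {"productId": int, "quantity": int}

    def __init__(self, store):
        self._store = store  # any dict-like schemaless store

    def add_item(self, cart_id, item):
        for field, expected in self.REQUIRED_FIELDS.items():
            if not isinstance(item.get(field), expected):
                raise ValueError(f"{field} must be of type {expected.__name__}")
        self._store.setdefault(cart_id, []).append(dict(item))

    def get_cart(self, cart_id):
        return list(self._store.get(cart_id, []))


backing_store = {}  # stands in for a key-value or document database
service = ShoppingCartService(backing_store)
service.add_item("cart:42", {"productId": 1000, "quantity": 1})

try:
    service.add_item("cart:42", {"productId": "not-a-number"})
    rejected = False
except ValueError:
    rejected = True
```

Applications depend only on `add_item` and `get_cart`, so the backing store can be swapped without touching them, and malformed records never reach the database.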
  • 26. Polyglot Persistence – Challenges • Immaturity • NoSQL tools are still young and have the rough edges typical of new tools • Not much experience yet; we don't know how to use them well • No patterns and best practices exist yet • Organizational Change • How will the different data groups in an enterprise react to this new technology? • Dealing with the eventual consistency paradigm • How will the different stakeholders react to the fact that data could be stale? • How to enforce rules to sync data across systems?
  • 27. Agenda 1. What is NoSQL and Big Data 2. NoSQL Database Types 3. Polyglot Persistence 4. Data Virtualization Layer 5. Summary
  • 28. Data Access Architecture for Polyglot Persistence • Well-known design patterns are still valid! • Some best practices we know from data access are still valid! [Diagram: three access stacks. (1) Consumer, REST/SOAP, Service, Repository/DAO, O/R Mapping, SQL, RDBMS. (2) Consumer, REST/SOAP, Service, Repository/DAO, ???, NoSQL. (3) Consumer, REST, REST API, NoSQL]
  • 29. Middle Tier Architecture for Polyglot Persistence [Diagram labels: Consumer, Middle Tier, Resource Tier; Integration, Service, Application, Domain; REST, SOAP, Web Service Exporter, Composite Application; Application Service Bean, Domain Service Bean, Factory Bean, Repository Bean, DAO Bean; Domain Objects, Aggregate, Data Transfer Objects (DTO); O/R Mapping, NoSQL API, SQL API]
  • 30. Polyglot Persistence with Spring Data • Spring Data makes it easier to build Spring-powered applications that use new data access technologies • Provides improved support for relational database technologies • The Commons project supports Polyglot Persistence • Currently supports: JPA and JDBC (relational), Apache Hadoop, GemFire, REST, Redis, MongoDB, Neo4j, HBase
  • 31. Spring Data – Mapping to Relational Database (using JPA) • Annotations define the mapping, including: @Entity, @Id, @Column, @OneToOne, @OneToMany, @JoinColumn
  • 32. Spring Data – Mapping to Relational Database
    Repository interface and JPA implementation:
        public interface CustomerRepository extends Repository<Customer, Long> {
            Customer findByEmailAddress(EmailAddress emailAddress);
        }

        @Repository
        @Profile("jpa")
        class JpaCustomerRepository implements CustomerRepository {
            // JPA EntityManager; its declaration is not shown on the slide
            @PersistenceContext
            private EntityManager em;

            @Override
            public Customer findByEmailAddress(EmailAddress emailAddress) {
                TypedQuery<Customer> query = em.createQuery(
                    "select c from Customer c where c.emailAddress = :email", Customer.class);
                query.setParameter("email", emailAddress);
                return query.getSingleResult();
            }
        }
    Usage:
        Customer guido = repository.findByEmailAddress(new EmailAddress("guido@hotmail.com"));
        Customer anotherCust = new Customer("Peter", "Sample");
        anotherCust.setEmailAddress(guido.getEmailAddress());
        repository.save(anotherCust);
    Spring configuration:
        <jpa:repositories base-package="com.oreilly.springdata.jpa" />
        <bean class="org.springframework.orm.jpa.JpaTransactionManager">
            <property name="entityManagerFactory" ref="entityManagerFactory" />
        </bean>
        <bean id="entityManagerFactory" class="org.springframework.orm.jpa.LocalContainerEntityManagerFactoryBean">
            <property name="dataSource" ref="dataSource" />
            <property name="packagesToScan" value="com.oreilly.springdata.jpa" />
        </bean>
  • 33. Spring Data – Mapping to MongoDB • Annotations define the mapping: @Document, @Id, @Indexed, @PersistenceConstructor, @CompoundIndex, @DBRef, @GeoSpatialIndex, @Value
  • 34. Spring Data – Generic Repositories for MongoDB
    Repository interface and MongoDB implementation:
        public interface CustomerRepository extends Repository<Customer, Long> {
            Customer findByEmailAddress(EmailAddress emailAddress);
        }

        @Repository
        @Profile("mongodb")
        class MongoDbCustomerRepository implements CustomerRepository {
            // "operations" is a field whose declaration is not shown on the slide
            @Override
            public Customer findByEmailAddress(EmailAddress emailAddress) {
                Query query = query(where("emailAddress").is(emailAddress));
                return operations.findOne(query, Customer.class);
            }
        }
    Usage:
        Customer guido = repository.findByEmailAddress(new EmailAddress("guido@hotmail.com"));
        Customer anotherCust = new Customer("Peter", "Sample");
        anotherCust.setEmailAddress(guido.getEmailAddress());
        repository.save(anotherCust);
    Spring configuration:
        <mongo:db-factory id="mongoDbFactory" dbname="e-store" />
        <mongo:mapping-converter id="mongoConverter" base-package="com.oreilly.springdata.mongodb">
            <mongo:custom-converters base-package="com.oreilly.springdata.mongodb" />
        </mongo:mapping-converter>
        <bean id="mongoTemplate" class="org.springframework.data.mongodb.core.MongoTemplate">
            <constructor-arg ref="mongoDbFactory" />
            <constructor-arg ref="mongoConverter" />
            <property name="writeConcern" value="SAFE" />
        </bean>
        <mongo:repositories base-package="com.oreilly.springdata.mongodb" />
  • 35. Spring Data – Mapping to Neo4j • Annotations define the mapping, including: @NodeEntity, @RelationshipEntity, @GraphId, @RelatedTo, @RelatedToVia, @EndNode, @Fetch [Diagram: graph model with the nodes Customer, Product, Tag, Country, Address and Order, connected by the relationships RATED, TAG, ADDRESS, COUNTRY, BILLING_ADDRESS, DELIVERY_ADDRESS and LINE_ITEM]
  • 36. Spring Data – Generic Repositories for Neo4j
    Repository interface:
        public interface CustomerRepository extends GraphRepository<Customer> {
            Customer findByEmailAddress(EmailAddress emailAddress);
        }
    Usage:
        Customer guido = repository.findByEmailAddress(new EmailAddress("guido@hotmail.com"));
        Customer anotherCust = new Customer("Peter", "Sample");
        anotherCust.setEmailAddress(guido.getEmailAddress());
        repository.save(anotherCust);
    Spring configuration:
        <neo4j:config graphDatabaseService="graphDatabaseService" />
        <neo4j:repositories base-package="com.oreilly.springdata.neo4j" />
        <bean id="graphDatabaseService" class="org.neo4j.kernel.EmbeddedGraphDatabase" destroy-method="shutdown">
            <constructor-arg value="target/graph.db" />
        </bean>
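The repository slides keep one CustomerRepository contract and swap the implementation per data store. The sketch below shows that shape without Spring, so it runs on a plain JDK. Everything here is an assumption made for illustration: the two in-memory classes only imitate a row-oriented and a document-oriented store, the e-mail address is simplified to a String, and `Repositories.forProfile` is a hypothetical stand-in for what `@Profile` selects in the slides.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Simplified domain object (the slides use a richer Customer and EmailAddress). */
class Customer {
    final String firstName;
    final String lastName;
    final String email;

    Customer(String firstName, String lastName, String email) {
        this.firstName = firstName;
        this.lastName = lastName;
        this.email = email;
    }
}

/** The contract that consumers program against, independent of the store. */
interface CustomerRepository {
    Customer save(Customer customer);

    Customer findByEmailAddress(String email); // null if no match
}

/** Row-oriented stand-in for a relational implementation. */
class RowCustomerRepository implements CustomerRepository {
    private final List<Customer> rows = new ArrayList<>();

    public Customer save(Customer customer) {
        rows.add(customer);
        return customer;
    }

    public Customer findByEmailAddress(String email) {
        for (Customer c : rows) {
            if (c.email.equals(email)) {
                return c;
            }
        }
        return null;
    }
}

/** Document-oriented stand-in: each customer is stored as a map of fields. */
class DocumentCustomerRepository implements CustomerRepository {
    private final List<Map<String, String>> docs = new ArrayList<>();

    public Customer save(Customer customer) {
        Map<String, String> doc = new HashMap<>();
        doc.put("firstName", customer.firstName);
        doc.put("lastName", customer.lastName);
        doc.put("emailAddress", customer.email);
        docs.add(doc);
        return customer;
    }

    public Customer findByEmailAddress(String email) {
        for (Map<String, String> doc : docs) {
            if (email.equals(doc.get("emailAddress"))) {
                return new Customer(doc.get("firstName"), doc.get("lastName"),
                        doc.get("emailAddress"));
            }
        }
        return null;
    }
}

/** Hypothetical selector playing the role that @Profile plays in the slides. */
class Repositories {
    static CustomerRepository forProfile(String profile) {
        switch (profile) {
            case "jpa":
                return new RowCustomerRepository();
            case "mongodb":
                return new DocumentCustomerRepository();
            default:
                throw new IllegalArgumentException("unknown profile: " + profile);
        }
    }
}
```

Calling code is identical for both profiles, for example `Repositories.forProfile("mongodb").findByEmailAddress("peter@example.com")`; only the selector knows which store is behind the interface.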
  • 37. Expose contract-first Web service • Use any Java Web Service framework that supports a contract-first approach • Can be SOAP or REST • Maps the data contract to the schemaless database • Uses the different Repository implementations • Must handle data migration issues together with the Repository
  • 38. Schemaless – We still have to migrate the data! • With an RDBMS we are used to keeping DDL scripts together with DML scripts for each single data model change • Has to be in sync with the data access code • The RDBMS has to be changed before the application is changed => possible application downtime • This is what the schemaless approach of most NoSQL DBs tries to avoid • Schemaless DBs still need careful migration, due to the implicit schema in any data access code • But a more "on-demand" approach is possible • Code can read data in a way that is tolerant to changes in the data's implicit schema and migrate the data on the next update • Similar to service versioning => gradual change [Diagram: a Customer document evolving from Version 1.0 (Name: Peter Sample) through a transition state 1.0 => 2.0 (Name: Peter Sample plus FirstName: Peter and LastName: Sample) to Version 2.0 (First Name: Peter, Last Name: Sample); the Billing Address (Street: Somestreet 10, City: Somewhere, PostalCode: 55901) stays the same]
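The "on-demand" migration described on this slide can be sketched as a tolerant read plus a write-back on the next update. This is a minimal sketch under assumptions: documents are plain maps, the field names Name, FirstName and LastName follow the slide's diagram, the City field is flattened to the top level for brevity, and splitting the name at the last space is a simplification that real names will not always satisfy. `CustomerStore` and its methods are hypothetical names.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical document store that migrates version 1.0 documents lazily. */
class CustomerStore {
    private final Map<String, Map<String, Object>> docs = new HashMap<>();

    /** Stores a document exactly as given, in whatever version it is. */
    void putRaw(String id, Map<String, Object> doc) {
        docs.put(id, new HashMap<>(doc));
    }

    /** Returns the document as currently stored, without any conversion. */
    Map<String, Object> getRaw(String id) {
        return docs.get(id);
    }

    /** Tolerant read: returns a version 2.0 view and leaves the stored copy alone. */
    Map<String, Object> read(String id) {
        Map<String, Object> stored = docs.get(id);
        return stored == null ? null : toVersion2(stored);
    }

    /** Update: the document is written back in version 2.0 form as a side effect. */
    void updateCity(String id, String city) {
        Map<String, Object> current = read(id);
        if (current == null) {
            throw new IllegalArgumentException("unknown customer: " + id);
        }
        current.put("City", city);
        docs.put(id, current);
    }

    /** Converts a 1.0 document (single Name field) to 2.0 (FirstName, LastName). */
    static Map<String, Object> toVersion2(Map<String, Object> doc) {
        Map<String, Object> result = new HashMap<>(doc);
        if (!result.containsKey("FirstName") && result.containsKey("Name")) {
            String name = (String) result.get("Name");
            // Assumption: the last space separates first name and last name.
            int split = name.lastIndexOf(' ');
            result.put("FirstName", split < 0 ? name : name.substring(0, split));
            result.put("LastName", split < 0 ? "" : name.substring(split + 1));
        }
        result.remove("Name"); // version 2.0 no longer carries the combined field
        return result;
    }
}
```

Old and new documents can coexist in the store: readers always see the 2.0 shape, and each document is rewritten only when it is next updated, which gives the gradual change the slide compares to service versioning.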
  • 39. Agenda 1. What is NoSQL and Big Data 2. NoSQL Database Types 3. Polyglot Persistence 4. Data Virtualization Layer and Data Architecture 5. Summary
  • 40. Pros & Cons of NoSQL compared to RDBMS Pros: • No O/R impedance mismatch • Can easily evolve schemas • Can represent semi-structured info • Can represent graphs/networks (with performance) Cons: • Lacks tool and framework support • Few other implementations => potential lock-in • No support for ad-hoc queries • Another/a new database in production to take care of
  • 41. Summary • Relational databases are here to stay, but NoSQL offers a new persistence model • Polyglot Persistence will be the future • Schemaless does not mean there is no data migration! => but a more on-demand model might be possible • Encapsulate data access code to be able to switch databases • Service-orientation provides the data contract to a NoSQL database => makes information reusable • Don't commit to a NoSQL database until you have done a significant PoC • Make sure that Operations people (DBAs) are on board early enough • Non-relational is not new in an enterprise (OLTP vs. OLAP)
  • 42. Possible Use Cases • NoSQL for parallel ETL? • NoSQL for modern BI • NoSQL for a stateful middle tier (e.g. shopping cart) • NoSQL for aggregated master data (e.g. through REST for Web apps) • NoSQL for a CMS store, directly accessible through REST • NoSQL as a local store for mobile applications • NoSQL for Event Sourcing and CQRS architectures
  • 43. Further Information
  • 44. THANK YOU. Trivadis, Guido Schmutz, guido.schmutz@trivadis.com, info@trivadis.com, www.trivadis.com