Mow2012 data services

Data services in a Service Oriented
Architecture

Syed M Shaaf
Red Hat

@sshaaf
20th April 2012

Problem: Data Challenges

Tremendous value in existing information assets, but...

Time consuming and costly to implement new applications
that leverage this information

Challenges
● Different physical structure
● Different terminology and meaning
● Different interfaces
● May need to federate/integrate
● May be “locked in” to database
● Must ensure performance
● Maintain/Improve security

[Diagram labels: Data Gap; Operational Data Stores; Packaged Applications; Data Warehouse]

EDS – Open source solution

EDS is a data virtualization
system that allows applications
to use data from multiple,
heterogeneous data stores.

What is EDS?
● EDS is an open source solution for scalable
information integration through a relational
abstraction.
● EDS focuses on:
● Real-time integration performance
● Full-featured integration via SQL/Procedures/XQuery
● Providing JDBC access
● EDS enables:
● Data Services / SOA
● Legacy / JPA integration

Turn the data you have into the
information you need

Query Plan

● Parsing
● Resolving
● Validating
● Rewriting
● Logical plan optimization
● Processing plan conversion

Query Plan
SELECT e.title, e.lastname FROM Employees AS e JOIN
Departments AS d ON e.dept_id = d.dept_id WHERE year(e.birthday) >= 1970
AND d.dept_name = 'Engineering'
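
To make the query concrete, here is a minimal sketch that runs it against an in-memory SQLite database. SQLite only stands in for the virtual database here; the table layouts and sample rows are invented for illustration, and the `year()` function is registered by hand because SQLite does not ship one.

```python
import sqlite3

# Assumption: SQLite stands in for the virtual database; rows are made up.
conn = sqlite3.connect(":memory:")

# SQLite has no year() function, so register one for ISO date strings.
conn.create_function("year", 1, lambda d: int(d[:4]))

conn.executescript("""
    CREATE TABLE Departments (dept_id INTEGER PRIMARY KEY, dept_name TEXT);
    CREATE TABLE Employees (title TEXT, lastname TEXT,
                            birthday TEXT, dept_id INTEGER);
    INSERT INTO Departments VALUES (1, 'Engineering'), (2, 'Sales');
    INSERT INTO Employees VALUES
        ('Developer', 'Smith', '1975-03-14', 1),
        ('Architect', 'Jones', '1965-07-01', 1),
        ('Account Manager', 'Brown', '1980-11-30', 2);
""")

QUERY = """
    SELECT e.title, e.lastname
    FROM Employees AS e
    JOIN Departments AS d ON e.dept_id = d.dept_id
    WHERE year(e.birthday) >= 1970
      AND d.dept_name = 'Engineering'
"""

rows = conn.execute(QUERY).fetchall()
print(rows)
```

Only the engineer born in or after 1970 survives both criteria.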

Optimizations

Access patterns handle required criteria,
and pushdown leverages the source
database
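
As a rough illustration of why pushdown matters, the sketch below compares filtering in the integration layer with sending the criteria to the source. SQLite stands in for a source database, and the helper names `make_source`, `without_pushdown`, and `with_pushdown` are hypothetical, not part of EDS.

```python
import sqlite3

def make_source():
    """Build an in-memory SQLite database standing in for one source."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Departments (dept_id INTEGER, dept_name TEXT)")
    conn.executemany(
        "INSERT INTO Departments VALUES (?, ?)",
        [(1, "Engineering"), (2, "Sales"), (3, "Support")],
    )
    return conn

def without_pushdown(source, name):
    # Fetch every row, then filter in the integration layer.
    fetched = source.execute(
        "SELECT dept_id, dept_name FROM Departments"
    ).fetchall()
    result = [row for row in fetched if row[1] == name]
    return result, len(fetched)

def with_pushdown(source, name):
    # Send the criteria to the source so it returns only matching rows.
    fetched = source.execute(
        "SELECT dept_id, dept_name FROM Departments WHERE dept_name = ?",
        (name,),
    ).fetchall()
    return fetched, len(fetched)
```

Both functions return the same result rows plus the number of rows that crossed the source boundary; with pushdown that count shrinks to the matching rows only.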

Optimizations

Making use of dependent joins and
optional joins
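
A dependent join can be sketched as follows: query the small, independent side first, then feed its key values into the other source's query as criteria. This is a simplified illustration, not the EDS implementation; the two in-memory SQLite databases, the sample rows, and the function name `dependent_join` are assumptions.

```python
import sqlite3

# Assumption: two separate in-memory databases act as independent sources.
dept_source = sqlite3.connect(":memory:")
dept_source.execute("CREATE TABLE Departments (dept_id INTEGER, dept_name TEXT)")
dept_source.executemany(
    "INSERT INTO Departments VALUES (?, ?)",
    [(1, "Engineering"), (2, "Sales")],
)

emp_source = sqlite3.connect(":memory:")
emp_source.execute("CREATE TABLE Employees (lastname TEXT, dept_id INTEGER)")
emp_source.executemany(
    "INSERT INTO Employees VALUES (?, ?)",
    [("Smith", 1), ("Jones", 1), ("Brown", 2)],
)

def dependent_join(dept_name):
    # Step 1: query the independent (small) side.
    depts = dept_source.execute(
        "SELECT dept_id, dept_name FROM Departments WHERE dept_name = ?",
        (dept_name,),
    ).fetchall()
    if not depts:
        return []

    # Step 2: push the key values to the dependent side as IN criteria.
    keys = sorted({dept_id for dept_id, _ in depts})
    placeholders = ", ".join("?" for _ in keys)
    emps = emp_source.execute(
        "SELECT lastname, dept_id FROM Employees "
        f"WHERE dept_id IN ({placeholders}) ORDER BY lastname",
        keys,
    ).fetchall()

    # Step 3: combine the two row sets in the integration layer.
    names = dict(depts)
    return [(lastname, names[dept_id]) for lastname, dept_id in emps]
```

The employee source never returns rows for departments that the first query already ruled out.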

Cursors and batching

By default, all results are cursored and
returned in batches. Tune the
processor batch size, the connector
batch size, or the fetch size via JDBC
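
In JDBC the client-side hint is `Statement.setFetchSize`. The sketch below shows the same idea with Python's DB-API `fetchmany`, using SQLite as a stand-in source; the table, the row count, and the helper `fetch_in_batches` are illustrative assumptions.

```python
import sqlite3

def fetch_in_batches(cursor, batch_size):
    """Yield lists of at most batch_size rows until the cursor is exhausted."""
    while True:
        batch = cursor.fetchmany(batch_size)
        if not batch:
            return
        yield batch

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE numbers (n INTEGER)")
conn.executemany("INSERT INTO numbers VALUES (?)", [(i,) for i in range(10)])

cursor = conn.execute("SELECT n FROM numbers ORDER BY n")
batch_sizes = [len(batch) for batch in fetch_in_batches(cursor, 4)]
print(batch_sizes)
```

Ten rows with a batch size of four arrive as two full batches and one partial batch, so the consumer never holds the whole result at once.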

Buffer Management

Buffers are stored in memory and/or
on disk
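
The memory-and/or-disk behavior can be pictured with a toy buffer that keeps a bounded number of batches in memory and writes the oldest ones to disk. This is only a sketch under simple assumptions (oldest-first eviction, pickle files in a temp directory); the class `TinyBufferManager` is hypothetical and does not mirror the real BufferManager.

```python
import os
import pickle
import tempfile

class TinyBufferManager:
    """Keeps at most max_in_memory batches in memory; older ones go to disk."""

    def __init__(self, max_in_memory):
        self.max_in_memory = max_in_memory
        self.memory = {}    # batch_id -> rows, in insertion (age) order
        self.on_disk = {}   # batch_id -> path of the pickled batch
        self.directory = tempfile.mkdtemp()

    def add(self, batch_id, rows):
        self.memory[batch_id] = rows
        # Evict the oldest batches until the memory limit is respected.
        while len(self.memory) > self.max_in_memory:
            self._passivate(next(iter(self.memory)))

    def _passivate(self, batch_id):
        rows = self.memory.pop(batch_id)
        path = os.path.join(self.directory, f"batch_{batch_id}.pkl")
        with open(path, "wb") as handle:
            pickle.dump(rows, handle)
        self.on_disk[batch_id] = path

    def get(self, batch_id):
        if batch_id in self.memory:
            return self.memory[batch_id]
        # Read the passivated batch back from disk.
        with open(self.on_disk[batch_id], "rb") as handle:
            return pickle.load(handle)
```

Callers use `get` the same way whether a batch is still in memory or has been written out.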

Processing

By default, joins are performed as merge-sort
joins, and the sorting algorithm is a
multi-pass merge sort
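
The two techniques named here can be sketched in a few lines. This is a generic textbook version, not the engine's code: `multi_pass_merge_sort` builds small sorted runs and merges a limited number of them per pass, and `merge_join` walks two inputs already sorted on the join key. The parameter names `run_size` and `fan_in` are assumptions.

```python
import heapq

def multi_pass_merge_sort(values, run_size, fan_in=2):
    """Sort by building sorted runs, then merging fan_in runs per pass."""
    if run_size < 1 or fan_in < 2:
        raise ValueError("run_size must be >= 1 and fan_in must be >= 2")
    runs = [sorted(values[i:i + run_size])
            for i in range(0, len(values), run_size)]
    # Each loop iteration is one merge pass over all current runs.
    while len(runs) > 1:
        runs = [list(heapq.merge(*runs[i:i + fan_in]))
                for i in range(0, len(runs), fan_in)]
    return runs[0] if runs else []

def merge_join(left, right, key_left, key_right):
    """Join two inputs that are already sorted on their join keys."""
    result, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        lk, rk = key_left(left[i]), key_right(right[j])
        if lk < rk:
            i += 1
        elif lk > rk:
            j += 1
        else:
            # Pair the current left row with every right row sharing the key.
            k = j
            while k < len(right) and key_right(right[k]) == lk:
                result.append((left[i], right[k]))
                k += 1
            i += 1
    return result
```

With a small `run_size` and `fan_in`, the sort needs several passes, which is the trade made when the whole input cannot be sorted in memory at once.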

Handling Load
● Memory Usage – the BufferManager acts as a memory
manager for batches (with passivation) to ensure that
memory will not be exhausted.
● Non-blocking source queries – rather than waiting for
source query results, processor threads detach from the
plan and pick up a plan that has work.
● Time slicing – plans produce batches for a time slice
before re-queuing and allowing their thread to do other
work (preemptive control only between batches)
● Caching – ResultSets, processing plans, internal
materialized views, etc.
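
Time slicing can be illustrated with a small round-robin scheduler. To keep the sketch deterministic, a slice is counted in batches rather than in wall-clock time, which is a simplification; the names `plan` and `run_time_sliced` are hypothetical.

```python
from collections import deque

def plan(name, batch_count):
    """A hypothetical plan that yields one batch label at a time."""
    for index in range(batch_count):
        yield f"{name}-batch{index}"

def run_time_sliced(plans, batches_per_slice):
    """Round-robin over plans, giving up control only between batches."""
    queue = deque(plans)
    produced = []
    while queue:
        current = queue.popleft()
        finished = False
        for _ in range(batches_per_slice):
            try:
                produced.append(next(current))
            except StopIteration:
                finished = True
                break
        if not finished:
            # Re-queue the plan so the worker can serve other plans first.
            queue.append(current)
    return produced

order = run_time_sliced([plan("A", 3), plan("B", 1)], batches_per_slice=2)
print(order)
```

Plan A produces two batches, yields to plan B, and only then finishes, so a long-running plan cannot monopolize the worker.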

More on Caching

● See the caching guide and
http://community.jboss.org/wiki/AHowToGuideForMaterializationcachingViewsInTeiid

● Admins can primarily control prepared plan and result
set caching. Procedure plans are also automatically
cached in the plan cache.
● Scoping of cache entries is determined automatically
● Internal materialization leverages EDS temp tables,
which are in turn backed by the buffer manager.
● Canonical value caching is dynamically used to cut
down on the memory profile – can be disabled.
● Internal caching of metadata at various levels.

Transactions
● Three scopes
● Global (through XAResource)
● Local (autocommit = false)
● Command (autocommit = true)
● All scopes are handled by JBoss Transactions
JTA
● Command scope behavior is handled through
txnAutoWrap={ON|OFF|DETECT}
● Isolation level is set on a per-connector basis.
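
The difference between command scope and local scope can be shown by analogy with SQLite's autocommit mode. This sketch does not use EDS or JDBC (where the switch is `Connection.setAutoCommit`), and it leaves out the global XA scope; the `audit` table and its rows are invented.

```python
import sqlite3

# isolation_level=None puts the sqlite3 module in autocommit mode, so
# transactions start only when BEGIN is issued explicitly.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE audit (message TEXT)")

# Command scope analogue: the statement commits on its own.
conn.execute("INSERT INTO audit VALUES ('committed immediately')")

# Local scope analogue: statements are grouped until commit or rollback.
conn.execute("BEGIN")
conn.execute("INSERT INTO audit VALUES ('part of a local transaction')")
conn.execute("ROLLBACK")

messages = [row[0] for row in conn.execute("SELECT message FROM audit")]
print(messages)
```

Only the autocommitted row remains after the rollback.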

Where to find it

● http://access.redhat.com
