Mow2012 data services


MOW2012 presentation about data virtualization using Teiid: a technical deep dive.


  1. Data services in a Service-Oriented Architecture. Syed M Shaaf, Red Hat, @sshaaf. 20th April 2012
  2. Problem: Data Challenges. Tremendous value in existing information assets, but it is time-consuming and costly to implement new applications that leverage this information. Challenges: ● Different physical structure ● Different terminology and meaning ● Different interfaces ● May need to federate/integrate ● May be “locked in” to database ● Must ensure performance ● Maintain/improve security. Diagram labels: Data Gap; Data Warehouse; Operational Data Stores; Packaged Applications
  3. EDS – Open source solution. EDS is a data virtualization system that allows applications to use data from multiple, heterogeneous data stores.
  4. What is EDS? ● EDS is an open source solution for scalable information integration through a relational abstraction. ● EDS focuses on: ● Real-time integration performance ● Full-featured integration via SQL/Procedures/XQuery ● Providing JDBC access ● EDS enables: ● Data Services / SOA ● Legacy / JPA integration
  5. Turn the data you have into the information you need
  6. Architecture
  7. Architecture
  8. Query Plan ● Parsing ● Resolving ● Validating ● Rewriting ● Logical plan optimization ● Processing plan conversion
  9. Query Plan: SELECT e.title, e.lastname FROM Employees AS e JOIN Departments AS d ON e.dept_id = d.dept_id WHERE year(e.birthday) >= 1970 AND d.dept_name = 'Engineering'
  10. Optimizations: access patterns for handling criteria, and pushdown, which leverages the source database.
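As a rough illustration of pushdown for the query on slide 9, the sketch below uses two in-memory SQLite databases as stand-ins for two separate sources. Each criterion is sent to the source that owns the column, so each source returns only matching rows, and the join itself runs in the integrating layer. The sample rows, the helper names `make_sources` and `federated_query`, and the rewrite of `year(e.birthday) >= 1970` into a date-string comparison are assumptions made for illustration; this is not EDS code.

```python
import sqlite3

def make_sources():
    """Build two independent in-memory databases standing in for two sources."""
    hr = sqlite3.connect(":memory:")
    hr.execute(
        "CREATE TABLE Employees (title TEXT, lastname TEXT, birthday TEXT, dept_id INTEGER)"
    )
    hr.executemany(
        "INSERT INTO Employees VALUES (?, ?, ?, ?)",
        [
            ("Engineer", "Jones", "1975-03-02", 1),
            ("Manager", "Smith", "1965-07-19", 1),
            ("Analyst", "Brown", "1980-11-30", 2),
        ],
    )
    org = sqlite3.connect(":memory:")
    org.execute("CREATE TABLE Departments (dept_id INTEGER, dept_name TEXT)")
    org.executemany(
        "INSERT INTO Departments VALUES (?, ?)",
        [(1, "Engineering"), (2, "Finance")],
    )
    return hr, org

def federated_query(hr, org):
    # Pushed down to the Departments source: the dept_name criterion.
    depts = {
        row[0]
        for row in org.execute(
            "SELECT dept_id FROM Departments WHERE dept_name = ?", ("Engineering",)
        )
    }
    # Pushed down to the Employees source: the birthday criterion, rewritten
    # (hypothetically) from year(birthday) >= 1970 into a comparison on
    # ISO-formatted date strings.
    emps = hr.execute(
        "SELECT title, lastname, dept_id FROM Employees WHERE birthday >= ?",
        ("1970-01-01",),
    )
    # The join on dept_id is performed in the integrating layer.
    return [(title, lastname) for title, lastname, dept_id in emps if dept_id in depts]

hr, org = make_sources()
result = federated_query(hr, org)
print(result)
```

Only one employee row and one department id cross the "network" that actually matter to the join; without pushdown, both full tables would have to be read before filtering.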
  11. Optimizations: making use of dependent joins and optional joins.
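A dependent join, in general terms, feeds the join-key values obtained from one side into the query sent to the other side, so that the second source returns only rows that can match. The sketch below is a minimal illustration of that idea under stated assumptions: SQLite stands in for the dependent source, and the function name `dependent_join` and the sample data are hypothetical. It does not show how EDS decides when to use this strategy.

```python
import sqlite3

def dependent_join(independent_rows, dependent_conn):
    """Join (dept_id, dept_name) rows against Employees held in another source."""
    # Collect the distinct join keys from the independent (already fetched) side.
    keys = sorted({dept_id for dept_id, _ in independent_rows})
    if not keys:
        return []
    # Feed those keys to the dependent source as an IN criterion.
    placeholders = ", ".join("?" for _ in keys)
    sql = f"SELECT lastname, dept_id FROM Employees WHERE dept_id IN ({placeholders})"
    names = dict(independent_rows)
    return [
        (lastname, names[dept_id])
        for lastname, dept_id in dependent_conn.execute(sql, keys)
    ]

employees = sqlite3.connect(":memory:")
employees.execute("CREATE TABLE Employees (lastname TEXT, dept_id INTEGER)")
employees.executemany(
    "INSERT INTO Employees VALUES (?, ?)",
    [("Jones", 1), ("Brown", 2), ("Smith", 1), ("Lee", 3)],
)

dept_rows = [(1, "Engineering")]
joined = dependent_join(dept_rows, employees)
print(sorted(joined))
```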
  12. Cursors and batching: by default all results are cursored and delivered in batches. Set the processor batch size, the connector batch size, or the fetch size via JDBC.
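The client-side effect of a fetch size can be sketched with Python's DB-API, where `cursor.fetchmany(n)` plays a role comparable to the fetch-size hint in JDBC (`Statement.setFetchSize`). This is an analogy only: SQLite is a stand-in source, the helper `iter_batches` is hypothetical, and the EDS processor and connector batch-size settings are not shown.

```python
import sqlite3

def iter_batches(cursor, batch_size):
    """Yield lists of at most batch_size rows until the cursor is exhausted."""
    while True:
        batch = cursor.fetchmany(batch_size)
        if not batch:
            return
        yield batch

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (n INTEGER)")
conn.executemany("INSERT INTO t VALUES (?)", [(i,) for i in range(10)])

cursor = conn.execute("SELECT n FROM t ORDER BY n")
batch_sizes = [len(batch) for batch in iter_batches(cursor, 4)]
print(batch_sizes)
```

The consumer never holds more than one batch at a time, which is the point of cursoring results instead of materializing them in full.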
  13. Buffer Management: buffers are stored in memory and/or on disk.
  14. Processing: joins are done by default as merge-sort joins, and the sorting algorithm is a multi-pass merge sort.
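For readers unfamiliar with the technique, a sort-merge join sorts both inputs on the join key and then walks them in step. The sketch below is a generic textbook version, not the EDS implementation: it sorts in memory with Python's built-in `sorted` instead of a multi-pass merge sort that can spill to disk, and the function name `merge_join` and the sample rows are hypothetical.

```python
def merge_join(left, right, key=lambda row: row[0]):
    """Inner-join two lists of tuples on key using a sort-merge strategy."""
    left = sorted(left, key=key)
    right = sorted(right, key=key)
    out = []
    i = j = 0
    while i < len(left) and j < len(right):
        lk, rk = key(left[i]), key(right[j])
        if lk < rk:
            i += 1
        elif lk > rk:
            j += 1
        else:
            # Find the run of equal keys on the right, then pair every left
            # row carrying that key with every row in the run.
            j_end = j
            while j_end < len(right) and key(right[j_end]) == lk:
                j_end += 1
            while i < len(left) and key(left[i]) == lk:
                out.extend(left[i] + r for r in right[j:j_end])
                i += 1
            j = j_end
    return out

employees = [(2, "Brown"), (1, "Jones"), (1, "Smith"), (3, "Lee")]
departments = [(1, "Engineering"), (2, "Finance")]
rows = merge_join(employees, departments)
print(rows)
```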
  15. Handling Load ● Memory usage – the BufferManager acts as a memory manager for batches (with passivation) to ensure that memory will not be exhausted. ● Non-blocking source queries – rather than waiting for source query results, processor threads detach from the plan and pick up a plan that has work. ● Time slicing – plans produce batches for a time slice before re-queuing and allowing their thread to do other work (preemptive control only between batches). ● Caching – ResultSets, processing plans, internal materialized views, etc.
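The time-slicing idea can be sketched with generators: each plan yields batches, and a single worker runs a plan for one slice before putting it back on the queue, so control changes hands only between batches. To keep the example deterministic, the slice here is measured in batches instead of elapsed time; that substitution, along with the names `plan` and `run`, is an assumption for illustration and does not reflect the actual EDS scheduler.

```python
from collections import deque

def plan(name, total_rows, batch_size):
    """A toy plan: yields (name, batch) pairs until its rows are exhausted."""
    for start in range(0, total_rows, batch_size):
        yield name, list(range(start, min(start + batch_size, total_rows)))

def run(plans, batches_per_slice):
    """Run plans on one worker, re-queuing each after its slice is used up."""
    queue = deque(plans)
    order = []
    while queue:
        current = queue.popleft()
        finished = False
        for _ in range(batches_per_slice):
            try:
                name, batch = next(current)
            except StopIteration:
                finished = True
                break
            order.append((name, len(batch)))
        if not finished:
            # Slice used up: let other plans have the worker.
            queue.append(current)
    return order

order = run([plan("A", 5, 2), plan("B", 2, 2)], batches_per_slice=2)
print(order)
```

Plan A produces two batches, yields the worker to plan B, and finishes its last batch only after B has run.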
  16. More on Caching ● See the caching guide and ● Admins can primarily control prepared plan and result set caching. Procedure plans are also automatically cached in the plan cache. ● Scoping of cache entries is determined automatically. ● Internal materialization leverages EDS temp tables, which are in turn backed by the buffer manager. ● Canonical value caching is dynamically used to cut down on the memory profile – can be disabled. ● Internal caching of metadata at various levels.
  17. Transactions ● Three scopes: ● Global (through XAResource) ● Local (autocommit = false) ● Command (autocommit = true) ● All scopes are handled by JBoss Transactions JTA. ● Command scope behavior is handled through txnAutoWrap={ON|OFF|DETECT}. ● Isolation level is set on a per-connector basis.
  18. Demo
  19. Where to find it ●