Big Data and Data Virtualization
 

  • Today the collaboration between Red Hat and SAP continues. Engineers from both companies are working toward a common target: enhancing the interoperability of JBoss Enterprise middleware with the existing SAP landscape. Specifically, Red Hat and SAP are collaborating on tools designed to simplify the integration of SAP data and business processes with other enterprise data and applications. The aim of such integration is a more intelligent enterprise, one that can maximize the value of your data assets to accelerate business decisions.
  • To remember the pragmatic definition of big data, think SPA, the three questions of big data:
    Store: Can you capture and store the data?
    Process: Can you cleanse, enrich, and analyze the data?
    Access: Can you retrieve, search, integrate, and visualize the data?
  • Attributes of a data virtualization solution:
    Easy data accessibility through standard interfaces, e.g. SQL and web services
    Exposes non-relational sources as relational
    Reads and writes data in place
    Real-time access
    No data replication or duplication required
    First, a data virtualization product virtualizes the data, regardless of where it resides: it makes the data look as if it were in one place, so applications do not need to know where the data actually is.
    Second, it federates the data. A query may span multiple databases or data warehouses, and it should run with optimum performance. That requires techniques such as caching and pushdown optimization, along with knowledge of the source databases, so the whole environment runs as smoothly and efficiently as possible.
    Third, it abstracts the data into the format of choice, conforming it to a consistent format regardless of its native structure or syntax. The tool should not impose a particular format on you; it needs to be agile and flexible enough to provide a format that suits your business.
    Fourth, it presents the data in a consistent fashion. Whether the consumer is a business intelligence application, a mash-up, or a regular application, the data is presented in a consistent format to the business and to participating applications.
    Imagine if all the up-to-date data you need to take informed action were available on demand as one unified source. This is the capability provided by data virtualization software.
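The attributes above can be sketched in miniature: two disparate sources (a relational table and a flat CSV file) are exposed behind one SQL view, so the consumer issues a single query and never sees where the data lives. This is a toy illustration using Python's `sqlite3`, not how JBoss Data Virtualization is implemented; the table and column names are invented for the example.

```python
# Sketch: one unified SQL view over two disparate sources.
import csv
import io
import sqlite3

conn = sqlite3.connect(":memory:")

# Source 1: a "relational database" of customers.
conn.execute("CREATE TABLE customers (id INTEGER, name TEXT)")
conn.executemany("INSERT INTO customers VALUES (?, ?)",
                 [(1, "Acme"), (2, "Globex")])

# Source 2: a flat CSV file of orders, a non-relational source that
# the virtualization layer exposes as a relational table.
orders_csv = io.StringIO("order_id,customer_id,total\n10,1,250\n11,2,900\n")
conn.execute("CREATE TABLE orders (order_id INTEGER, customer_id INTEGER, total REAL)")
for row in csv.DictReader(orders_csv):
    conn.execute("INSERT INTO orders VALUES (?, ?, ?)",
                 (int(row["order_id"]), int(row["customer_id"]), float(row["total"])))

# The "virtual view": one unified, consistent format over both sources.
conn.execute("""CREATE VIEW customer_orders AS
                SELECT c.name, o.order_id, o.total
                FROM customers c JOIN orders o ON o.customer_id = c.id""")

# The consumer queries the view with plain SQL, unaware of the sources.
rows = conn.execute("SELECT name, total FROM customer_orders ORDER BY order_id").fetchall()
print(rows)  # [('Acme', 250.0), ('Globex', 900.0)]
```

In a real deployment the join would be federated across live systems with caching and pushdown optimization rather than copied into one engine; the point here is only the consumer's view: one query, one schema, no knowledge of source locations.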
  • The data virtualization software provides a three-step process to connect data sources and data consumers:
    Connect: fast access to data from disparate systems (databases, files, services, applications, etc.) with disparate access methods and storage models.
    Compose: easily create a reusable, unified common data model and virtual data views by combining and transforming data from multiple sources.
    Consume: seamlessly expose the unified virtual data model and views in real time through a variety of open-standard data access methods, supporting different tools and applications.
    JBoss Data Virtualization implements all three steps internally while isolating the complexity of data access methods, transformation, and data-merge logic from information consumers. This enables an organization to acquire actionable, unified information when it wants it and the way it wants it; that is, at the speed of business.
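The three steps can be mirrored in a small sketch. The class and method names below are illustrative assumptions for this example, not the JBoss Data Virtualization API; they only show how connect, compose, and consume divide the work between source registration, view definition, and on-demand evaluation.

```python
# Toy sketch of the Connect / Compose / Consume pipeline.

class VirtualDataLayer:
    def __init__(self):
        self.sources = {}   # logical name -> list of record dicts
        self.views = {}     # view name -> callable producing records

    def connect(self, name, records):
        """Connect: register a disparate source under a logical name."""
        self.sources[name] = list(records)

    def compose(self, view_name, transform):
        """Compose: define a reusable virtual view over the sources."""
        self.views[view_name] = transform

    def consume(self, view_name):
        """Consume: evaluate the view on demand, in real time."""
        return self.views[view_name](self.sources)


layer = VirtualDataLayer()
layer.connect("crm", [{"id": 1, "name": "Acme"}])
layer.connect("erp", [{"customer_id": 1, "total": 250}])

# A virtual view joining both sources into one unified shape.
layer.compose("customer_totals", lambda src: [
    {"name": c["name"], "total": o["total"]}
    for c in src["crm"] for o in src["erp"]
    if o["customer_id"] == c["id"]
])

result = layer.consume("customer_totals")
print(result)  # [{'name': 'Acme', 'total': 250}]
```

Note that the view is stored as a transformation, not as copied data: each `consume` call re-evaluates it against the current sources, which is the "no replication, real-time access" property the slides describe.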