This document reviews data virtualization techniques to improve research analytics. It was presented by Jack A Richter, who authored it with Lela McFarland and Christine Bredfeldt. The presentation defines data virtualization, describes different data virtualization techniques such as enterprise data warehouses, data marts, and distributed database links and views, and diagrams a data virtualization system. It also briefly discusses cloud databases and thanks contributors from Kaiser Permanente and Teradata.
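Of the techniques the presentation names, distributed database links and views are the easiest to illustrate in miniature. The following is a minimal, hypothetical sketch using Python's built-in sqlite3 module: it attaches a second database file (standing in for a remote source reached over a database link) and exposes a federated view, so consumers query one virtual layer rather than two physical stores. The table and file names are illustrative and not taken from the presentation.

```python
import sqlite3

# Two physical stores: a local research mart and a "remote" claims source.
# In a real deployment the second source would sit behind a database link or
# a virtualization server; here a second SQLite file stands in for it.
claims = sqlite3.connect("claims.db")
claims.execute("CREATE TABLE IF NOT EXISTS claims (member_id INTEGER, dx_code TEXT)")
claims.execute("INSERT INTO claims VALUES (1, 'E11.9')")
claims.commit()
claims.close()

mart = sqlite3.connect("research_mart.db")
mart.execute("CREATE TABLE IF NOT EXISTS members (member_id INTEGER, birth_year INTEGER)")
mart.execute("INSERT INTO members VALUES (1, 1975)")

# "Database link": make the external source visible inside this connection.
mart.execute("ATTACH DATABASE 'claims.db' AS claims_src")

# Virtual layer: a view that federates both sources without copying the data.
mart.execute("""
    CREATE TEMP VIEW member_dx AS
    SELECT m.member_id, m.birth_year, c.dx_code
    FROM members m
    JOIN claims_src.claims c ON c.member_id = m.member_id
""")

for row in mart.execute("SELECT * FROM member_dx"):
    print(row)   # (1, 1975, 'E11.9')

mart.close()
```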
The document discusses the emergence of big data and new data architectures needed to handle large, diverse datasets. It notes that internet companies built their own data systems like Hadoop to process massive amounts of unstructured data across thousands of servers in a fault-tolerant, scalable way. These systems use a map-reduce programming model and distributed file systems like HDFS to store and process data in a parallel, distributed manner.
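As a rough illustration of the map-reduce programming model that summary refers to, the toy sketch below runs the classic word-count pattern in plain Python: a map step emits (key, value) pairs, a shuffle groups them by key, and a reduce step aggregates each group. Hadoop distributes these same steps across many machines and HDFS blocks; this single-process version only shows the data flow.

```python
from collections import defaultdict

lines = ["big data needs new data architectures",
         "hadoop stores data on hdfs"]

# Map: emit a (word, 1) pair for every word in every input line.
mapped = [(word, 1) for line in lines for word in line.split()]

# Shuffle: group intermediate pairs by key (the framework does this in Hadoop).
groups = defaultdict(list)
for word, count in mapped:
    groups[word].append(count)

# Reduce: aggregate the values for each key.
counts = {word: sum(values) for word, values in groups.items()}
print(counts)   # e.g. {'big': 1, 'data': 3, ...}
```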
This document introduces SQL-H, which enables SQL analytics on Hadoop. It provides a primer on HCatalog and Aster, defines SQL-H, and provides examples of SQL-H usage. SQL-H allows direct access to HCatalog tables from within AsterDB, providing full SQL support and integration with BI tools on data stored in Hadoop. It performs reads from HCatalog in a distributed, native manner without using MapReduce.
Identity 2.0, Web services and SOA in Health Care (Oliver Pfaff)
Buzzwords such as Identity 2.0, Web services and SOA characterize the architectures of novel IT-systems. Concerning these recent trends, the stakeholders of eHealth systems might ask a number of questions, including:
• Users: does this help us provide better care?
• Owners: how does it change the suite of applications and services we provide?
• Suppliers: what is the footprint on our software architecture?
This presentation will discuss the relevance of Identity 2.0, Web services and SOA for IT-systems in health-care. It will identify and assess the value that can be added through ideas and technologies behind these trends. Regarding the fundamental concept of identity, architectural blueprints for Web services and SOA-based eHealth systems will also be investigated.
The document describes several potential metadata use cases, including reporting/analytics, desktop accessibility of metadata definitions, and governance workflows. It provides examples of actors, system interactions, and sample data for each use case. The use cases are presented to demonstrate how they can address common challenges with metadata solutions projects.
Wallchart - Continuous Data Quality Process (David Walker)
This document outlines a continuous data quality process for data warehousing and ETL. It involves profiling and cleansing data from operational systems and staging areas. It also includes auditing results, applying new rules or manual fixes if needed, and using data quality tools to help data stewards maintain and improve input data quality over time.
IRJET- Data Analysis and Solution Prediction using Elasticsearch in Healt... (IRJET Journal)
This document discusses using Elasticsearch and OData services to analyze large log files from medical devices for remote health monitoring. OData allows RESTful APIs to access and consolidate data from different sources in JSON format. Elasticsearch indexes this data for fast, full-text searching. It can identify issues by searching log files and provide predictive solutions. Kibana visualizes indexed Elasticsearch data to facilitate analysis. The combination of these technologies efficiently manages large medical log data and identifies potential problems for prognostic solutions, improving health care services.
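For readers unfamiliar with how such a pipeline looks in practice, here is a minimal, hypothetical sketch of the Elasticsearch side using its plain REST interface via the requests library. It assumes an Elasticsearch node is already running at localhost:9200 without authentication; the index name, log fields, and search term are placeholders, and the OData consolidation and Kibana visualization steps are not shown.

```python
import requests

ES = "http://localhost:9200"   # assumes a local, unsecured Elasticsearch node

# Index one device log entry as a JSON document (full-text searchable).
log_entry = {
    "device_id": "infusion-pump-42",
    "timestamp": "2023-05-01T10:15:00Z",
    "message": "battery temperature error, threshold exceeded",
}
requests.put(f"{ES}/device-logs/_doc/1",
             params={"refresh": "true"},   # make the doc searchable immediately
             json=log_entry).raise_for_status()

# Full-text search for problem indicators across all indexed logs.
query = {"query": {"match": {"message": "error"}}}
resp = requests.get(f"{ES}/device-logs/_search", json=query)
resp.raise_for_status()

for hit in resp.json()["hits"]["hits"]:
    print(hit["_source"]["device_id"], "->", hit["_source"]["message"])
```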
This presentation is part of the course "184.742 Advanced Services Engineering" at The Vienna University of Technology, in Winter Semester 2012. Check the course at: http://www.infosys.tuwien.ac.at/teaching/courses/ase/
A set of slides that summarize how MarkLogic is being used in the healthcare industry including case studies on M*Modal and Informatics Corporation of America
The METL process extracts meta data from a source system, transforms it to describe a different database structure, and loads it into a target system. This allows data to be accessed across systems with different structures without changing the source. Specifically, it extracts meta data from a legacy banking system, transforms it to work with a business intelligence tool, and loads it so the tool can query the legacy data through an SQL interface without knowing the source's structure. The METL process facilitates data sharing across different systems in a non-invasive way to prolong legacy systems' lives and provide access to production data.
This document provides an overview of Oracle Database 10g SQL Fundamentals I course. It discusses the goals of the course which are to understand Oracle 10g features, relational and object relational database concepts, and learn SQL skills. It also covers topics like Oracle 10g and Internet Platform, system development life cycle, data storage, relational database concepts, and entity relationship modeling.
Paperless registration: Higher satisfaction for staff and patients (ABBYY)
ABBYY FlexiCapture can streamline patient registration by eliminating manual data entry. It uses optical character recognition to extract data from paper documents like IDs, forms, and referrals during registration. This improves accuracy and availability of registration information while reducing costs and frustration compared to paper-based processes. It ensures compliance and allows staff to focus on patient care rather than tedious tasks.
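ABBYY FlexiCapture is a commercial product and its SDK is not shown here; as a rough stand-in for the idea of OCR-driven form capture, the sketch below uses the open-source Tesseract engine through the pytesseract package. It assumes Tesseract, pytesseract, and Pillow are installed; the file name and keyword scan are placeholders.

```python
from PIL import Image
import pytesseract

# Extract raw text from a scanned registration form (placeholder file name).
page = Image.open("registration_form.png")
text = pytesseract.image_to_string(page)

# A real capture product maps recognized text onto structured fields;
# here a naive keyword scan only illustrates the idea.
for line in text.splitlines():
    if "name" in line.lower() or "date of birth" in line.lower():
        print(line.strip())
```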
Sophia Wang has extensive experience in business intelligence and data analytics using Microsoft SQL Server and related technologies. She has expertise in ETL processes using SSIS, data modeling, reporting with SSRS and SSAS, and working with relational databases. She has worked on projects involving building data warehouses, reporting solutions, and developing complex SQL queries, stored procedures and other database objects. Her skills include MS SQL Server, SSIS, SSRS, SSAS, TSQL, data visualization and analytics using tools like R and SAS.
In light of Cloud Computing System CDA Generation and Integration for Health ... (IJAEMSJORNAL)
Successful deployment of Electronic Health Records improves patient safety and quality of care, but it requires interoperability of Health Information Exchange between hospitals. The Clinical Document Architecture (CDA) developed by HL7 is a core document standard for ensuring such interoperability, and adoption of this document format is essential. Unfortunately, hospitals are reluctant to adopt interoperable health information systems because of the deployment cost, except in a handful of countries. A further problem arises even when more hospitals start using the CDA document format, because data scattered across many separate documents are hard to manage. This paper describes a CDA document generation and integration Open API service based on cloud computing, through which hospitals can conveniently generate CDA documents without purchasing proprietary software. The CDA document integration system merges multiple CDA documents per patient into a single CDA document, so physicians and patients can read the clinical data in chronological order. Because the generation and integration service is cloud-based and offered as an Open API, developers on different platforms can use it to improve interoperability.
IRJET- A Review on Secured CDA Generation Based on Cloud Computing System (IRJET Journal)
This document discusses a proposed system for generating and integrating Clinical Document Architecture (CDA) documents based on cloud computing. The key points are:
1. The system would allow hospitals to conveniently generate standard CDA documents without needing proprietary software, by utilizing an open API CDA generation service hosted on the cloud.
2. It would also integrate multiple CDA documents for a single patient from different hospitals into a single document, allowing medical personnel to easily view a patient's clinical history chronologically.
3. The proposed system is designed to enhance healthcare data interoperability and encourage greater CDA adoption among hospitals by offering the CDA generation and integration services through a free cloud-based platform.
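As a very rough illustration of what a generated CDA document looks like, the sketch below builds a skeletal HL7 CDA ClinicalDocument with Python's standard xml.etree module and posts it to a hypothetical cloud generation endpoint. The element set is drastically reduced (a conformant CDA has a mandated header and coded sections), and the URL is invented purely for illustration.

```python
import xml.etree.ElementTree as ET
import requests

NS = "urn:hl7-org:v3"   # HL7 CDA R2 namespace
ET.register_namespace("", NS)

# Skeletal ClinicalDocument; a conformant CDA needs many more header elements.
doc = ET.Element(f"{{{NS}}}ClinicalDocument")
ET.SubElement(doc, f"{{{NS}}}title").text = "Discharge Summary"
record_target = ET.SubElement(doc, f"{{{NS}}}recordTarget")
ET.SubElement(record_target, f"{{{NS}}}patientRole").set("classCode", "PAT")

payload = ET.tostring(doc, encoding="unicode")

# Hypothetical open-API endpoint for the cloud CDA generation/integration service.
resp = requests.post("https://example.org/cda/generate",
                     data=payload,
                     headers={"Content-Type": "application/xml"})
print(resp.status_code)
```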
Information and Integration Management Vision (Colin Bell)
The vision of the Information and Integration Management team at the University of Waterloo captured on a single 'poster' page. Covers: Data Management Environment, Mission + Vision, Information Asset Base, Information Lifecycle, Document Management, Metadata/Meaning, Integration Platform, and Innovation Platform.
Creating Value Through Computable Clinical Data (Robert Bond)
Computable clinical data has numerous high value applications including comparative effectiveness research, personalized medicine, and claims adjudication. I discuss some of the technologies involved in making patient data computable
This document proposes an Intelligent Data Entry and Acquisition (IDEA) system to help with on-site highway maintenance and construction. It describes an architecture using wearable computers and sensors to collect asset data in the field, process it using pattern recognition, and upload it to centralized databases. Field workers could use tools like digital notepads, cameras, and GPS to gather location-tagged images, notes and condition reports on assets, which the IDEA system would then analyze and integrate into maintenance planning databases back at the office. The goal is to streamline data collection and improve safety, productivity and data quality for tasks like infrastructure inspections.
6. real time integration with odi 11g & golden gate 11g & dq 11g 20101103 -... (Doina Draganescu)
Oracle Data Integration provides high performance, real-time data integration across heterogeneous systems and data sources. It uses a declarative design approach to simplify and accelerate data integration projects. Oracle GoldenGate enables low-impact change data capture and real-time data replication between databases for real-time data warehousing. Using Oracle Data Integration and GoldenGate together provides a best-in-class solution for real-time data integration and warehousing.
The document discusses challenges with preserving data from databases for long-term use as records. It defines key concepts like records, metadata, databases, and data warehouses. The document recommends creating a policy to identify which records and metadata must be retained long-term. It also suggests using application layers to export identified records and metadata to a recordkeeping system like a data warehouse or ERMDS. This would help ensure usable records are available over time even after source systems are decommissioned.
Document Classification for Microsoft Office (joseph978)
Titus Labs Document Classification is a classification and policy enforcement tool that ensures all Microsoft Office documents are classified before they can be saved, printed, or sent via email. Labels are fully customizable to meet internal and regulatory marking standards, and can be used to indicate any type of information, including data sensitivity (Public, Confidential) and department (HR, Finance).
IBM WebSphere DataStage 8.0 includes several new features and enhancements, including a new WebSphere Metadata Server Foundation that better integrates IBM Information Server products and supports enterprise metadata services. DataStage now provides graphical impact analysis and job difference capabilities directly in the designer. Data quality is now fully integrated through QualityStage stages. Usability has also been improved through enhanced search capabilities and easier export/import of metadata. Performance monitoring and system resource estimation has also been improved.
CER HUB An Informatics Platform for Conducting Comparative Effectiveness with ... (HMO Research Network)
The document describes the CER Hub, an informatics platform for conducting comparative effectiveness research using electronic medical record data. The CER Hub allows researchers to develop standardized processors to generate research datasets from heterogeneous EMR systems. It facilitates collaborative projects to address questions like evaluating asthma control and smoking cessation treatments. Initial projects through the CER Hub involve developing measures of asthma control and comparing the effectiveness of treatment intensification options using EMR data from six health systems.
This document discusses leveraging Oracle's engineered systems for deploying and running Oracle Identity Management. It introduces Oracle's engineered systems, including Oracle Exalogic and Exadata, which are designed to improve performance, scalability, and simplify deployment for mission-critical applications like Oracle Identity Management. The document also summarizes the benefits of using Oracle Exalogic and Exadata, such as reduced costs, risk, and ability to consolidate hundreds of servers into a single system. It provides examples of large customers that have achieved significant performance and scalability running Oracle Identity Management on Oracle's engineered systems.
DBArtisan XE6 is a database administration tool that helps DBAs manage databases across platforms more efficiently. It streamlines common tasks, reduces errors, and provides comprehensive capabilities for data management. As data volumes grow, the role of the DBA is evolving to handle multiple concurrent responsibilities. DBArtisan facilitates this role by providing performance monitoring, space and data management tools, and security management in a single interface.
Sharing of PHR on Cloud Using Attribute Based Encryption and Access by QR-Code (IRJET Journal)
This document proposes a framework for securely sharing personal health records (PHRs) stored in the cloud. It leverages attribute-based encryption to encrypt PHRs before outsourcing them to cloud servers. Access to the encrypted PHRs is granted through attributes assigned to users, and a QR code linked to the patient. The framework aims to give patients control over their medical records while enabling flexible, fine-grained access for authorized users like healthcare providers. It addresses challenges around privacy, scalability, access management and revocation of access rights for PHRs stored remotely in the cloud.
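Real attribute-based encryption (for example CP-ABE) relies on pairing-based cryptography libraries that are not shown here; the toy sketch below only mimics the access-control idea with a symmetric Fernet key that is released when a user's attribute set satisfies the record's policy, and it encodes a record locator in a QR code. It is an assumption-laden illustration, not the paper's scheme; the cryptography and qrcode packages are required, and the URL is hypothetical.

```python
from cryptography.fernet import Fernet
import qrcode

# Encrypt the PHR before it leaves the patient's control.
key = Fernet.generate_key()
box = Fernet(key)
ciphertext = box.encrypt(b"dx: hypertension; rx: lisinopril 10mg")

# Toy stand-in for an ABE policy: release the key only to users whose
# attribute set satisfies the record's required attributes.
policy = {"role:physician", "org:clinic-a"}

def release_key(user_attributes):
    return key if policy.issubset(user_attributes) else None

granted = release_key({"role:physician", "org:clinic-a", "dept:cardiology"})
if granted:
    print(Fernet(granted).decrypt(ciphertext).decode())

# QR code carrying the (hypothetical) cloud locator of the encrypted record.
qrcode.make("https://example.org/phr/records/12345").save("phr_record_qr.png")
```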
Analysis of Bayes, Neural Network and Tree Classifier of Classification Techn... (cscpconf)
In today's world, a gigantic amount of data is available in science, industry, business and many other areas. This data can provide valuable information that management can use to make important decisions, but the problem is how to find that valuable information. The answer is data mining, a popular topic among researchers, with much work still left to explore. This paper focuses on a fundamental concept of data mining: classification techniques. BayesNet, NaiveBayes, NaiveBayes Updateable, Multilayer Perceptron, Voted Perceptron, and J48 classifiers are used to classify the data set. Their performance is analyzed using Mean Absolute Error, Root Mean Squared Error, and the time taken to build the model, with results shown statistically as well as graphically. The WEKA data mining tool is used for this purpose.
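The paper runs its comparison in WEKA; the hedged sketch below reproduces the same kind of evaluation in Python with scikit-learn stand-ins for three of the classifier families mentioned (naive Bayes, a multilayer perceptron, and a J48-style decision tree), reporting mean absolute error, root mean squared error, and the time taken to build each model on a bundled toy dataset rather than the paper's data.

```python
import time
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neural_network import MLPClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import mean_absolute_error, mean_squared_error

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

models = {
    "NaiveBayes": GaussianNB(),
    "MultilayerPerceptron": MLPClassifier(max_iter=1000, random_state=0),
    "J48-style tree": DecisionTreeClassifier(random_state=0),
}

for name, model in models.items():
    start = time.perf_counter()
    model.fit(X_train, y_train)          # "time taken to build the model"
    build_time = time.perf_counter() - start
    pred = model.predict(X_test)
    mae = mean_absolute_error(y_test, pred)
    rmse = np.sqrt(mean_squared_error(y_test, pred))
    print(f"{name:22s} MAE={mae:.3f} RMSE={rmse:.3f} build={build_time:.3f}s")
```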
Engineering the Healthcare Technology Transition (Greenway Health)
The document discusses how healthcare organizations will adopt new technologies required by healthcare mandates, categorizing them as Pragmatists, Collaborators, or Innovators. It also explains the differences between dual database and single database solutions for integrating ambulatory workflows like financial, operational, and clinical systems. A single database solution allows real-time updates across workflows while dual databases require manual updates between separate systems.
This document discusses maximizing returns from a data warehouse. It covers the need for real-time data integration to power business intelligence and enable timely, trusted decisions. It outlines challenges with traditional batch-based approaches and how Oracle's data integration solutions address these through products that enable real-time data capture and delivery, bulk data movement, and data quality profiling to build an enterprise data warehouse.
Data resource management involves applying information systems technology to manage data resources. It includes activities like creating, storing, organizing, and retrieving data using database management systems. There are different types of databases like operational, distributed, data warehouses, data marts, and end user databases. Data warehouses store historical data from various operational databases to help identify trends. Data mining techniques are used to better understand data through analysis, sorting, extracting patterns and relationships to gain insights. Common applications of data mining include banking, customer relationship management, targeted marketing, fraud detection, and scientific data analysis.
Trending use cases have pointed out the complementary nature of Hadoop and existing data management systems—emphasizing the importance of leveraging SQL, engineering, and operational skills, as well as incorporating novel uses of MapReduce to improve distributed analytic processing. Many vendors have provided interfaces between SQL systems and Hadoop but have not been able to semantically integrate these technologies while Hive, Pig and SQL processing islands proliferate. This session will discuss how Teradata is working with Hortonworks to optimize the use of Hadoop within the Teradata Analytical Ecosystem to ingest, store, and refine new data types, as well as exciting new developments to bridge the gap between Hadoop and SQL to unlock deeper insights from data in Hadoop. The use of Teradata Aster as a tightly integrated SQL-MapReduce® Discovery Platform for Hadoop environments will also be discussed.
TUW- 184.742 Data as a Service – Concepts, Design & Implementation, and Ecosy... (Hong-Linh Truong)
This presentation is part of the course "184.742 Advanced Services Engineering" at The Vienna University of Technology, in Winter Semester 2012. Check the course at: http://www.infosys.tuwien.ac.at/teaching/courses/ase/
This document discusses how bioprocessing companies have leveraged information technology (IT) solutions to help bring new therapies to market, but that maintaining complex IT infrastructure can become costly and inefficient. It argues that organizations need an enterprise-level informatics platform that integrates diverse data sources and systems to extract maximum value from research and development (R&D) data without overspending on IT. The document provides an example of how one pharmaceutical company benefited from a more holistic and integrated informatics approach based on the Accelrys Pipeline Pilot platform.
The document discusses a unified data architecture that enables any user to access and analyze any data type from data capture through analysis. It describes using a discovery platform to enable interactive data discovery on structured and unstructured data without extensive modeling. It also describes using an integrated data warehouse for cross-functional analysis, shared analytics, and lowest total cost of ownership. Finally, it provides examples of using the architecture for IPTV quality of service analysis, including predictive models using decision trees and naive Bayes.
Microsoft SQL Server 2012 Master Data Services (Mark Ginnebaugh)
Mark Gschwind, VP of Business Intelligence at DesignMind, gave a presentation on Master Data Services (MDS) in SQL Server 2012. He began with an overview of master data and its importance for central curation, quality management, and ease of access for business users. He then reviewed the key capabilities of MDS, including modeling, validation, stewardship, and integration. Gschwind demonstrated creating an MDS model, using the new Excel interface, business rules, and exposing MDS data to a data warehouse. He concluded with tips for successful MDS implementations such as starting small, engaging business users, and using the development environment.
Oracle's BigData solutions consist of a number of new products and solutions to support customers looking to gain maximum business value from data sets such as weblogs, social media feeds, smart meters, sensors and other devices that generate massive volumes of data (commonly defined as ‘Big Data’) that isn’t readily accessible in enterprise data warehouses and business intelligence applications today.
Creating Data Hubs to Enhance Information Sharing (InnoTech)
Data Bridge Management creates data hubs to enhance information sharing between organizations. It integrates data sharing and analytic systems through a centralized "Data Bridge" information sharing hub. This reduces losses from threats while saving lives and resources by enabling accurate information access and analysis. However, information sharing faces challenges around communication, access to data across different platforms and formats, and disconnected data "stovepipes". Data Bridge Management's solution is a brokered system that enforces agreements to enable secure access to various structured and unstructured data sources in a centralized way. It has implemented regional resiliency and emergency response hubs as well as proposed technology transfer and supply chain security hubs.
Introduction to Microsoft’s Master Data Services (MDS) (James Serra)
Master Data Services is bundled with SQL Server 2012 to help resolve many of the Master Data Management issues that companies are faced with when integrating data. In this session, James will show an overview of Master Data Services 2012, including the out of the box Web UI, the highly developed Excel Add-in, and how to get started with loading MDS with your data.
Analytic Platforms in the Real World with 451Research and Calpont_July 2012 (Calpont Corporation)
Matt Aslett of 451 Research discussed the rise of analytic platforms and their role in enabling exploratory analytics on large datasets. Bob Wilkinson from Calpont then presented on InfiniDB, Calpont's columnar analytic platform that provides scalable and fast performance for complex queries. InfiniDB was shown to accelerate analytics for telecommunications customer experience data and online advertising attribution. The discussion highlighted how InfiniDB supports flexible schemas and a spectrum of analytic approaches to enable exploratory analysis on structured data.
Database Architechs has been a database-focused consulting company for 17 years, bringing you the most skilled and experienced data and database experts with a wide variety of service offerings covering all database and data-related aspects.
Aras Vision and Roadmap with Aras Innovator PLM Software (Aras)
The document discusses Aras's product roadmap and vision. It outlines Aras's goals of developing a premier, connected, open, and community-driven PLM platform. The roadmap includes expanding client support, improving the user interface, integrating social features, and offering packaged and cloud-based solutions. It also details plans to enhance existing capabilities in areas like requirements management, ERP integration, and application lifecycle management.
An information architecture (IA) provides a holistic view of how information flows through an enterprise, including data architecture, master data management, metadata management, business intelligence, data quality, and data governance. Sybase PowerDesigner is an integrated information architecture tool that addresses issues with traditional fragmented data modeling approaches. It links all architectural layers and perspectives, including data models, business processes, and technical components. This allows organizations like BBVA and MedStar Health to improve integration, reduce costs, and accelerate projects through a unified information architecture approach.
Business intelligence technologies provide finance departments with historical, current, and predictive views of business operations based on integrated data from across the organization. This allows finance to move beyond reporting and into advising the business by revealing trends, patterns, and insights. For example, AT&T Mobility's finance department leverages a Teradata enterprise data warehouse to calculate the profitability of 80 million subscribers daily and identify any subscribers who churned so they can work to win them back. Implementing BI requires getting sponsorship, understanding needs, building partnerships, standardizing data, and delivering self-service tools and timely detailed data to support strategic financial analysis and decision making.
The document discusses Oracle Optimized Storage and its advantages. It highlights how Oracle Optimized Storage simplifies IT environments by tightly integrating Oracle software and storage hardware. It also notes how Oracle Optimized Storage can increase functionality, permanently shrink storage capacity, consolidate systems to reduce complexity, improve performance, and lower costs.
This document describes the design of an online medical diagnosis system called Doctors-Online. The system allows patients to have video consultations with doctors by transmitting vital health data like temperature via Bluetooth. It uses a 3-tier architecture with presentation, business, and data layers. Doctors can view patient records, diagnose issues, and send prescriptions during secure video conferences. The system aims to provide convenient remote healthcare.
This document provides an overview and agenda for a presentation on big data landscape and implementation strategies. It defines big data, describes its key characteristics of volume, velocity and variety. It outlines the big data technology landscape including data acquisition, storage, organization and analysis tools. Finally it discusses an integrated big data architecture and considerations for implementation.
The Comprehensive Approach: A Unified Information Architecture (Inside Analysis)
The Briefing Room with Richard Hackathorn and Teradata
Slides from the Live Webcast on May 29, 2012
The worlds of Business Intelligence (BI) and Big Data Analytics can seem at odds, but only because we have yet to fully experience a comprehensive approach to managing big data – a Unified Big Data Architecture. The dynamics continue to change as vendors begin to emphasize the importance of leveraging SQL, engineering and operational skills, as well as incorporating novel uses of MapReduce to improve distributed analytic processing.
Register for this episode of The Briefing Room to learn the value of taking a strategic approach for managing big data from veteran BI and data warehouse consultant Richard Hackathorn. He'll be briefed by Chris Twogood of Teradata, who will outline his company's recent advances in bridging the gap between Hadoop and SQL to unlock deeper insights and explain the role of Teradata Aster and SQL-MapReduce as a Discovery Platform for Hadoop environments.
For more information visit: http://www.insideanalysis.com
Watch us on YouTube: http://www.youtube.com/playlist?list=PL5EE76E2EEEC8CF9E
Similar to An Eye on the Future A Review of Data Virtualization Techniques to Improve Research Analytics RICHTER (20)
New Rules Dealing with Conflicts of Interest in Public Health Service Funded ... (HMO Research Network)
The new federal regulations regarding financial conflicts of interest in research expand disclosure requirements for researchers. Key differences from the 1995 regulations include lowering the financial disclosure threshold, requiring disclosure of all reimbursed travel, and mandating public access to conflict of interest information. Institutions must now determine if researchers' financial interests constitute actual conflicts and develop management plans if needed. Complying with the new rules requires informing stakeholders, revising policies and procedures, and addressing implementation challenges around identifying researchers, assessing financial interest relationships to research, and policy infrastructure.
The document discusses the mission and goals of the Patient-Centered Outcomes Research Institute (PCORI). PCORI was established through the Affordable Care Act to help patients make informed healthcare decisions through comparative clinical effectiveness research. The organization aims to produce high-quality evidence by involving patients and caregivers throughout the research process, from developing research questions to disseminating results. PCORI also seeks to address treatment heterogeneity and improve outcomes for various patient subgroups through its research.
Evaluation of the Validity of the Gestational Length Assumptions Based Upon A... (HMO Research Network)
The study evaluated the validity of an algorithm to estimate gestational length and determine prenatal medication exposure based on electronic health plan data. The algorithm underestimated gestational length by an average of 5.5 days and underestimated the prevalence of preterm births compared to measures from linked birth certificates. The algorithm correctly classified exposure status for most women taking antidepressants but had poorer performance for antibiotics due to their sporadic use. While the algorithm provided reasonable estimates, its accuracy may vary for other medications beyond the two antidepressants and two antibiotics evaluated in this study.
Comparative Safety of Infliximab and Etanercept on the Risk of Serious Infecti... (HMO Research Network)
This study assessed whether the relative risk of serious infections from infliximab versus etanercept varies by patient characteristics. The study used data from a large US claims database to compare rates of serious infection requiring hospitalization in patients initiating infliximab or etanercept. Cox models adjusting for potential confounders were used to estimate hazard ratios for serious infection associated with each treatment overall and stratified by age, sex, race, BMI, and smoking status. The results suggested the increased risk of serious infection with infliximab compared to etanercept may be modified by age, with a higher risk seen only in patients under 65, but not other characteristics examined. Limitations included potential residual confounding and exposure
A Multi State Markov Model for Analyzing Patterns of Use of Opioid Treatments ... (HMO Research Network)
This document describes a study that used a multi-state Markov model to analyze patterns of use of three opioid dependence treatments: methadone, buprenorphine, and other drug-free outpatient treatments. The study followed 30,080 Medicaid patients diagnosed with opioid dependence from 2003 to 2007. The model estimated transition probabilities between the three treatment states and discontinuation over time. Results found the probability of discontinuing methadone treatment was 20% after 6 months, while survival probabilities on each treatment after 30 months were 30% for methadone, 9% for buprenorphine, and 5% for other treatments. The study concluded multi-state Markov models are useful for investigating treatment switching and compliance over long
A Descriptive Study of Vaccinations Occurring During Pregnancy HENNINGER (HMO Research Network)
This study analyzed vaccination rates during pregnancy within the Vaccine Safety Datalink population between 2002-2006. It found that the influenza vaccine was the most commonly administered vaccine as recommended, with a rate of 98 doses per 1,000 pregnancies. Vaccines considered conditionally recommended if otherwise indicated, such as tetanus and hepatitis B, were administered at lower rates. Some contraindicated vaccines like MMR and varicella were also administered, more often in the first trimester likely due to initial unawareness of pregnancy. The study concludes clearer recommendations and provider education could help reduce unintended vaccination during pregnancy.
The Use of Administrative Data and Natural Language Processing to Estimate th... (HMO Research Network)
This study aimed to evaluate the use of administrative data to identify cases of statin-related rhabdomyolysis and determine the risk associated with high-dose simvastatin use. The researchers identified 29 validated cases of rhabdomyolysis among statin users using multiple methods including ICD-9 codes, lab data, and natural language processing of medical records. They found a higher incidence with simvastatin use, especially doses of 80mg or more daily, compared to other statins. While administrative data can provide estimates of rare adverse events, medical record review is still needed to validate potential cases.
Patient Views of KRAS Testing for Treatment of Metastatic Colorectal Cancer L... (HMO Research Network)
This document summarizes findings from a study examining patient understanding of KRAS testing and its relationship to treatment decisions for metastatic colorectal cancer (mCRC). Most participants did not recall having the KRAS test but generally had positive views of genetic testing. While the KRAS test provided guidance on anti-EGFR therapy eligibility, imaging scans and physicians' treatment recommendations seemed more salient to patients' experiences. Patients prioritized physician expertise over test results and preferred physicians make treatment decisions.
Comparative Effectiveness of Chemotherapy Regimens for Advanced Lung Cancer C... (HMO Research Network)
1) The study examined the comparative effectiveness of chemotherapy regimens for advanced lung cancer patients in terms of survival, number of hospitalizations, and total hospital days.
2) It found that patients receiving triplet therapy (combination of 3 drugs) had longer survival and fewer hospitalizations and hospital days than those receiving doublet (2 drugs) or singlet (1 drug) therapies.
3) The distribution of chemotherapy regimens changed over time, with more patients receiving triplet and targeted therapies like erlotinib in later years, which may impact future comparative effectiveness findings.
This study applied doubly robust estimation to assess the causal effect of angiotensin converting enzyme inhibitors (ACEIs) versus angiotensin receptor blockers (ARBs) on follow-up hemoglobin (Hgb) levels. The study found evidence of confounding factors like heart failure status and sex that differed between treatment groups and affected Hgb levels. Doubly robust estimation was used to estimate average causal effects while addressing confounding. The results suggested that average follow-up Hgb levels may be higher when ARBs rather than ACEIs are prescribed, though the mean difference was small and not clearly clinically significant. Further analysis was recommended to refine the models.
Risk Factors for Short Term Virologic Outcomes Among HIV Infected Patients Un... (HMO Research Network)
This study investigated risk factors for virologic outcomes among HIV patients who switched combination antiretroviral therapy regimens. The study found that about 24% of patients failed to achieve maximal viral suppression 6 months after switching regimens. Younger age, lower CD4 counts, heterosexual transmission risk, NRTI-only regimens, and previous virologic failure were associated with increased risk of advanced virologic failure. New class-based regimens were protective against low-level viremia. Rates of treatment failure decreased in more recent calendar years.
Expanding SEER Reporting with Comorbidity Data Colorectal Cancer HORNBROOK (HMO Research Network)
This document discusses adding comorbidity data from Kaiser Permanente Hawai'i members to the Hawai'i SEER cancer registry. It achieved a high match rate between KPH tumor records and those in the SEER registry. Analysis found that 37% of cancers had comorbidities, with diabetes, COPD, and complications of diabetes being most common. Adding comorbidity data would provide useful context for understanding cancer treatment and outcomes. Addressing privacy issues is important for full data sharing between organizations.
Drug Characteristics Associated with Medication Adherence Across Eight Diseas... (HMO Research Network)
This study analyzed medication adherence across 8 disease states based on drug characteristics. Researchers found that adherence varied significantly based on whether drugs were generic or brand name, with generic drugs generally having higher adherence rates. Adherence also decreased as out-of-pocket costs for patients increased. When examining specific conditions, adherence was highest for hypertension drugs and lowest for asthma/COPD medications. Certain drug classes within conditions demonstrated different adherence levels as well. Overall, the study identified drug characteristics and costs as important factors influencing medication adherence across multiple diseases.
Feasibility of Implementing Screening Brief Intervention and Referral to Trea... (HMO Research Network)
This document summarizes plans to implement SBIRT (Screening, Brief Intervention, and Referral to Treatment) screening for risky alcohol and substance use at Kaiser Permanente Colorado (KPCO). It describes two pilot programs - the first involved stakeholder input and the second aims to implement SBIRT at one clinic. Challenges include engaging busy providers, developing workflows, and integrating SBIRT into existing initiatives and roles. Evaluation of the second pilot will assess screening rates and compare outcomes between clinics with and without behavioral health support. Addressing stakeholder concerns is key to successful implementation.
eCare for Heart Wellness A Trial to Test the Feasibility of Web Based Dietici... (HMO Research Network)
The e-Care for Heart Wellness study tested a web-based dietitian intervention to improve blood pressure control. The study randomized 101 patients to either usual care or web-dietitian care. Patients in the web-dietitian group received home monitoring equipment, an in-person visit with a dietitian, and weekly web communications with a dietitian. Compared to usual care, the web-dietitian intervention led to greater weight loss, increased fruit and vegetable consumption, and higher rates of blood pressure control and weight loss of over 4 kg. Patients found the heart age calculation and collaborative web-based dietitian care helpful for motivating lifestyle changes.
A Telephone Based Diabetes Prevention Program and Social Support for Weight L... (HMO Research Network)
This study evaluated a telephone-based diabetes prevention program called Call-2-Health that provided behavioral counseling and social support for weight loss. The study involved a randomized controlled trial comparing the Call-2-Health intervention to usual care among 47 adults at risk for diabetes. The intervention led to significantly greater weight loss and increased social support for healthy eating compared to usual care. Increased social support, especially from family, was strongly correlated with greater weight loss. The results provide preliminary evidence that a telephone-based program can effectively promote weight loss and diabetes prevention.
Technological Resources & Personnel Costs Required to Implement an Automated ... (HMO Research Network)
This document describes a study to develop an automated system to provide primary care physicians (PCPs) information about their patients' care during transitions from hospitals to home. The system used the electronic medical record (EMR) from an outpatient medical group practice to receive discharge information from local hospitals via an ADT interface and send notifications and care recommendations to PCPs. The study found it was feasible to create such a system but required substantial physician time as well as strong internal informatics expertise and cooperation from external facilities to link records.
Online Patient Access to their Medical Record and Health Providers is Associa...HMO Research Network
This study investigated the impact of patients having online access to their medical records through Kaiser Permanente's MyHealthManager (MHM) system. The results showed that MHM users had an 18% higher rate of office visits and a 9% higher rate of phone visits compared to non-users. Rates of after-hours clinic visits, emergency department visits, and hospitalizations also increased. While increased access appeared to lead to greater use of healthcare services, further research is needed to understand the reasons and outcomes.
Documentations of Advanced Heath Care Directives Where Are They TAI_SEALEHMO Research Network
The document discusses a study examining where patients' advance health care directives (AHCDs) are located within an electronic health record (EHR) system, and how easily accessible and actionable they are. The study found that while many patients had AHCDs documented, they were often only listed in the problem list and not accompanied by scanned documents, making them difficult to find and not truly actionable. Certain patient characteristics like older age, dementia and cancer were associated with having more accessible AHCDs. The implications are that standardizing and improving the accessibility of AHCDs in EHRs could help ensure patients' end-of-life wishes are honored.
A Simulated Diabetes Learning Intervention Improves Provider Knowledge and Co...HMO Research Network
A simulated learning intervention using interactive virtual patient cases improved provider knowledge and confidence in managing diabetes compared to a control group. 341 resident physicians were randomized to either an early intervention group that completed 12 virtual patient cases over 8 months or a late intervention group that served as a control. The early intervention group scored higher on a diabetes knowledge survey and reported greater confidence and knowledge in diabetes management topics compared to the control group. The results provide evidence that simulated learning can help transfer knowledge to actual patient care.
An Eye on the Future A Review of Data Virtualization Techniques to Improve Research Analytics RICHTER
1. An eye on the future: A review of Data Virtualization Techniques to improve research
Presenter: Jack A Richter
Authors: Jack A Richter, Lela McFarland, Christine Bredfeldt, Ph.D.
Contributors:
Rajesh V DevKaran (KPIT), Wayne Little (KPIT), Sriram Thiruvenkatachari (KPIT),
Sean Mikha (Teradata), Steven C Werntz (KPGA)
MID-ATLANTIC PERMANENTE RESEARCH INSTITUTE
Change History v7.1 - added links to slide 14 (vendor info) and corrected spelling errors - added comments to slides 2 and 13
Quote From the "BeyeNETWORK : Comprehensive resources for business intelligence and data warehousing professionals" Web Link : http://www.b-eye-network.com/view/14815 =========================================================================================== Rick van der Lans Rick F. van der Lans is an independent consultant, author and lecturer specializing in XML, data warehousing, application integration, and information modelling. He is Managing Director of R20/Consultancy based in The Netherlands. Rick has advised many large companies worldwide on defining their data warehouse architectures. He is chairman of the Database Systems Show (organized annually in The Netherlands since 1984), and he is columnist for two major newspapers in the Benelux, "Computable" and "DataNews". Additionally, he is advisor for magazines such as Software Release Magazine and Database Magazine. ============================================================================================ Leading DV Vendor reports A top 5 global bank used DV to create SOA data services layer across 200+ sources and 20+ applications. 250% ROI in 3 months elapsed time, 2% revenue increase within the business unit, 50-60% reduction in integration design and development time for new applications and portals, 25% increase in object reuse for downstream BI reporting projects. A top 3 global pharmaceutical firm used DV to quickly prototype, develop and deploy the new information solutions required to support strategic business decision 90% reduction in time to create a new report, 10X faster time-to-business-value for new views and applications, 200% ROI in 3 months elapsed time, 100% increase in key business analyst productivity, and 5% improvement in R&D project on-time delivery. Cable and Home Internet Service Provider – used DV to federate data from different source - data, e-mail data, credential data, etc into one virtual layer Reduced transaction time for processing a request from average of 5 seconds down to 1 second, significantly reduced manpower to manage the systems by 20% (development costs reduced), reduced time-to-market for deploying new applications by 25%. ** Notes & Slide Source: PPT Presentation - Data Virtualization .. Data Execution Strategy & Architecture Presentation 0/27/2011 by: Wayne Little & Raj Devkaran (KPIT)
Some key features of Data Virtualization:
• Multiple data delivery methods to the consuming applications: SOA data services, ODBC/JDBC, REST (see the query sketch below)
• Embedded metadata in the virtual platform
• Availability of re-usable data quality and data transform definitions
• Scheduling & pre-processing
• Data caching
• Data discovery
• Rapid prototyping

DV enables integration of structured (databases), semi-structured (spreadsheets) and unstructured (weblogs, PDF files) data. Data can be cached to pre-fetch static or large data; cache jobs are usually scheduled, while DV jobs are usually real-time. Mobile devices can also "connect" to the virtual schema. Data warehouse extension and virtual data marts are some other uses.
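To make the delivery-method point concrete, here is a minimal sketch of what a consuming query against a DV platform's virtual schema might look like; the schema, table and column names (virtual.patients, virtual.claims, etc.) are hypothetical and not taken from the presentation:

SELECT p.pat_mrn,
       p.pat_name,
       c.claim_id,
       c.claim_amount
FROM   virtual.patients p        -- virtual table mapped to a warehouse source
JOIN   virtual.claims   c        -- virtual table mapped to a cached spreadsheet extract
       ON c.pat_mrn = p.pat_mrn
WHERE  c.claim_date >= DATE '2012-01-01';

The same statement could be issued over ODBC/JDBC, wrapped as an SOA/REST data service, or run from a BI tool; the platform decides whether to answer it from a scheduled cache or federate it against the sources in real time.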
Enterprise Data Warehouse (EDW)
Concerns:
• Timing
• Costs (hardware, programming, support)
• Constant redesign of the database and of data movement (ETL) processes (see the load-step sketch below)
• Data quality issues introduced within the process
Benefits:
• Single source of data: a consistent view of the data for all users
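For context on the ETL concern above, a minimal sketch of one warehouse load step is shown below; the staging and warehouse table names are invented for the example, and in practice such jobs must be redesigned whenever the sources or the warehouse model change:

INSERT INTO edw.encounter_fact (pat_mrn, encounter_date, dx_code, provider_id)
SELECT s.mrn,
       s.enc_dt,
       s.dx,
       s.prov_id
FROM   staging.encounters s      -- data physically moved from staging into the EDW
WHERE  s.load_batch = 20120401;  -- hypothetical batch identifier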
Data Marts
Concerns:
• Same issues as the EDW
• Data duplicated between data marts
• Inconsistent ETL processes/business rules may be applied
• If sourced from the EDW, it may not be possible to trace back to the true data source
Benefits:
• Allows for consistent interpretation of the data sets involved
• ETLs can be subject-area specific (see the sketch below)
• Smaller datasets improve performance
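As a hypothetical illustration of a subject-area-specific ETL, a small diabetes mart could be carved out of the EDW as below; exact CREATE TABLE ... AS syntax varies by vendor, and the table names are invented:

CREATE TABLE diabetes_mart.encounters AS
SELECT e.pat_mrn,
       e.encounter_date,
       e.dx_code
FROM   edw.encounter_fact e
WHERE  e.dx_code LIKE '250%';    -- ICD-9 diabetes codes; the mart duplicates only this subject area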
Distributed Database Links
Concerns:
• Usually requires all databases to be from the same vendor
• May require users to know the relationships of the data between different databases
• Performance issues (network, source system, end user)
Benefits:
• Allows access to data across databases without physical movement of the data (see the sketch below)
• No ETLs and minimal IT involvement
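A hedged sketch in Oracle-style syntax (other vendors differ, and the link, account and table names are hypothetical): once the link is defined, the remote table is queried in place, with no ETL and no physical data movement.

CREATE DATABASE LINK claims_link
  CONNECT TO report_user IDENTIFIED BY report_pw
  USING 'claims_db';                -- TNS alias of the remote database

SELECT c.claim_id,
       c.claim_amount
FROM   claims@claims_link c         -- remote table referenced through the link
WHERE  c.claim_date >= DATE '2012-01-01';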
Views
Concerns:
• Requires all data to be in the same database unless distributed database links or something similar is used
• Usually requires knowledge of the data, the data relationships and the rules
• May introduce performance issues (the view can be "materialized" to resolve this; a materialized-view sketch follows the examples below)
• May require movement of the view dataset
Benefits:
• Insulates the user from physical database changes
• Allows for centralized, consistent application of business rules
• Allows for centralized, consistent creation of derived data elements
• Makes data queries easier to write (fewer joins)
• Fixes need to be applied only once, in the view

Simple example:

create view pat_names as
  SELECT pat_mrn, pat_name
  FROM base.patient
  WHERE patient_type <> 'TEST';

Complex example:

create view VDW_DEATH as
  SELECT smrn.PERSON_ID        as MRN,
         rdd.DEATHDT           as DeathDt,
         rdd.DTIMPUTE          as DtImpute,
         rdd.COD_DX            as UnderCOD,
         rdd.CODETYPE          as Codetype,
         rdd.SOURCE            as Source,
         rdd.SOURCE_CONFIDENCE as Confidence,
         rdd.DEATH_ID          as Local_Death_ID
  FROM RSRCH_DEATH_DETAILS as rdd
  INNER JOIN RSRCH_ID_PAT_PERSON as smrn
    ON smrn.VDW_EXCLUDED_FLAG = 'N'
   AND rdd.PAT_ID = smrn.PAT_ID
   AND rdd.LINE = 1;
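Where the performance concern noted above bites, the same kind of definition can be materialized so the result set is stored and refreshed on a schedule instead of being recomputed per query. This is an Oracle-style sketch; refresh options and syntax vary by vendor, and it reuses the pat_names example rather than anything from the original slides:

CREATE MATERIALIZED VIEW pat_names_mv
  BUILD IMMEDIATE                  -- populate the stored result set now
  REFRESH COMPLETE ON DEMAND       -- rebuild it only when explicitly requested
AS
SELECT pat_mrn, pat_name
FROM   base.patient
WHERE  patient_type <> 'TEST';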
Problems Addressed by Data Virtualization

Data challenges:
• Rapid data proliferation
• Increasing complexity & volume
• Data needs beyond structured data
• Data duplication & inconsistency
• Silo'd data approaches

Dollars & delivery speed:
• Shrinking budgets
• Need for personalized, real-time health care data
• Classic delivery methods too slow
• Increasing regulations
• Lack of agility to meet business needs in a timely manner

Value Proposition for Data Virtualization

Faster, more agile delivery:
• Quicker time to market: virtual rather than physical integration
• Updating data views is easier and faster than the corresponding physical changes to a DB/DW/DM

Single unified source of consistent, on-demand data access:
• Re-use of consistent rules: data cleansing, business rules, security rules, etc. (see the sketch after this list)
• Common "virtual schema" with unified metadata and definitions
• Uniform data integration for multiple workflows and downstream consumers: data services for SOA services (foundational data support for an SOA strategy), ETL, reporting & analytics, web portals

Combine multiple data sources, types & formats into a complete and unified set:
• Structured to unstructured data
• Internal to external data (including cloud)
• Operational systems to DWs to data marts to web applications
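To illustrate the rule re-use point flagged in the list above, a single virtual view can carry the cleansing, business and exclusion rules once, so every downstream consumer (ETL, reports, portals) inherits them. The view, table and column names here are hypothetical, not from the presentation:

CREATE VIEW virtual.member_clean AS
SELECT pat_mrn,
       UPPER(TRIM(pat_name))             AS pat_name,   -- cleansing rule defined once
       CASE WHEN birth_dt > CURRENT_DATE
            THEN NULL ELSE birth_dt
       END                               AS birth_dt,   -- business rule defined once
       region_cd
FROM   source.member
WHERE  test_record_flag = 'N';                          -- exclusion/security rule defined once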
Picture – free usage from http://freebigpictures.com/clouds-pictures/cloud-sea/
AN INTERNET-BASED (i.e., Cloud) DATABASE HOSTING SERVICE: a service which hosts the users' database on a remote system that can be accessed via the internet. Database support, security and management are performed by the service vendor. It may or may not present a virtualized data view. It may require user software, in addition to an internet browser, to access the data.
Some vendors:
• SimpleDB – NoSQL (Amazon)
• ClearDB (CloudDB)
• CouchDB – NoSQL (CouchOne)
• Xeround – Sql*Server (Amazon)
• AppEngine (Google)
• Database.com

AN INTERNET-BASED (i.e., Cloud) HOSTING SERVICE: a database-independent service which presents the users' database(s) to internet users. Database support, security and management are performed by the customer. The tool may or may not present a virtualized data view. It may require user software, in addition to an internet browser, to access the data.
Cloud Database as a Generic Database Access Service
This definition of a cloud database is one where a database access service connects users with databases hosted on the internet (i.e., cloud-based databases). These databases may be hosted by the vendor running the access site or by others. Generally, access follows a format in which the user does not need to know the database format or the method by which the database is implemented.
SaaS – Software as a Service
IaaS – Infrastructure as a Service (or Information as a Service)
PaaS – Platform as a Service
The Forrester Wave™: Data Virtualization, Q1 2012
Informatica, IBM, Composite Software, And Denodo Technologies Lead, With SAP, Microsoft, Oracle, Stone Bond, And Red Hat Close Behind
By Noel Yuhanna and Mike Gilpin, with Adam Knoll
http://www.forrester.com/search?#/The+Forrester+Wave+Data+Virtualization+Q1+2012/quickscan/-/E-RES60746

Article abstract: In Forrester's 53-criteria evaluation of data virtualization (also known as information-as-a-service, IaaS) vendors, we found that Informatica, IBM, Composite Software, and Denodo Technologies lead the pack because of strong enterprise-class data virtualization features and functionality such as real-time integration, data quality, transformation, caching, and modeling. SAP, Microsoft, Oracle, Stone Bond Technologies, and Red Hat are Strong Performers; each offers a viable option to support particular use cases. Although Oracle no longer positions itself in the data virtualization market, Forrester evaluated its solution as a nonparticipating vendor. Red Hat continues as the most substantial vendor supporting an open source data virtualization solution. This market has sufficiently matured in that its Leaders include large, well-established platform vendors, but it continues to exhibit important innovation from smaller players more exclusively focused on information virtualization and federation.