SlideShare a Scribd company logo
1 of 4
Download to read offline
Data Warehouse Optimization
using Hadoop
Today’s escalating data volumes can prevent Online Transaction Processing (OLTP)
systems from generating and processing transactions efficiently which can cause
Data Warehouse (DW) performance issues when querying data. Total Cost of
Ownership (TCO), too, escalates rapidly because of the need for upgrades to DW
hardware and for additional licenses since organizations often pay more to store
information simply because they might need it.
Radically new capabilities are needed to enable DW and OLTP to function cost-
effectively in this new environment. First, organizations need to cope with data
volumes that traditional DW platforms (whether RDBMSs1
or appliances) were
never designed for. Second, they must deal with unstructured, semi-structured and
structured data.
Big data technologies such as Apache Hadoop excel at managing large volumes of
unstructured data. However, this data needs to be pulled together with structured
data for analytical purposes. As Big Data technologies and Apache Hadoop are
coming into mainstream use, being able to integrate these new technologies with
existing legacy Data Warehouse platforms to get the best of both worlds is key.
A new solution from
Capgemini,
implemented in
partnership with
Informatica, Cloudera
and Appfluent,
optimizes the ratio
between the value of
data and storage
costs, making it easy
to take advantage of
new big data
technologies.
1	Relational Database Management System
Our solution: seamlessly combine the DW with
big data technologies
In partnership with Informatica, Cloudera and Appfluent, Capgemini has developed
an integrated solution that allows OLTP systems and DWs to serve their primary
functions efficiently and cost-effectively. Data Warehouse Optimization (DWO) using
Hadoop incorporates Cloudera’s highly available massively-parallel processing
Hadoop Platform, with minimal effort to integrate it in the existing landscape.
Appfluent Visibility is used to identify what data needs to be archived and
ETL/ELT2
processes should be offloaded. Informatica Information Lifecycle
Management (ILM) offloads onto Hadoop the data and processing that is not needed
for day-to-day functions from OLTP and DW systems. When required, archived data
is made available to the primary systems in various ways. Front-line services can
seamlessly retrieve historical information or use Informatica ILM to restore archived
data to production applications.
Elements of our solution
DWO using Hadoop consists of elements from Informatica, Appfluent and Cloudera;
integrated into a single solution delivered by Capgemini. Cloudera provides a
complete, tested and widely deployed open source distribution of Apache Hadoop,
making it available for mainstream adoption.3
Informatica Big Data Edition helps clients to start using Hadoop quickly to manage
and analyze unstructured and semi-structured data. Users are shielded from
complexity so the need for Hadoop-specific skills is limited.
Informatica BigData edition to create ETL/ELT (including complex transformation, DQ
rules, profiling, parsing and matching) framework and push all heavy lifting ETL/ELT
processing to Hadoop environment.
Informatica ILM Archive and Informatica Data Services (IDS) together help to allocate
data optimally between the DW and Hadoop, and then make it available to users in a
rapid and seamless manner, wherever it is stored.
Appfluent Visibility software applies statistical analysis to usage and activity metrics
in order to identify dormant data that is a candidate for archiving by Informatica ILM
Archive. Hadoop data is held in a highly compressed form, avoiding the need for a
regular compression process.
IDS build virtualized data objects that combine data from Hadoop archives and from
DWs residing on RDBMSs or appliances. You can get an instant response to your
immediate data needs, regardless where the data is stored.
2	ETL: Extract, Transform & Load; ELT: Extract, Load & Transform
3	Please see our brochure Capgemini’s Data Optimization for the Enterprise with Cloudera for more
details. http://www.capgemini.com/resources/capgeminis-data-optimization-for-the-enterprise-with-
cloudera
DWO using
Hadoop
incorporates
elements from
Informatica,
Appfluent
and Cloudera;
integrated into
a single solution
delivered by
Capgemini.
the way we do itData Warehouse
Capgemini expertly delivers DWO using Hadoop –
you decide how
As a market leader in big data and business information management, we have the
experts to guide you plus the techniques to ensure a successful transformation into a
high-performing, information-centric business. We have already helped many clients
evolve their legacy information architectures to exploit massive data volumes and
new data types.
Below are a few examples of the ways we have helped clients get started.
Strategic Value Assessment (SVA)
In a few weeks, Capgemini can help clients build a phased plan for implementing ILM
on Hadoop while identifying key opportunities to optimize the information ecosystem
across OLTP, DW and other information assets within the organization. The SVA
can also help build a business value proposition and budget to help obtain approval
and funding. The phased plan often includes an initial proof of concept (POC) or
prototyping phase to help the client understand the approach’s effectiveness and
test the business case with minimal risk.
Figure 1: Seamlessly combine the Data Warehouse with big data technologies
Data Sources Data Warehouse Layer Semantic
Layer
Data Services
Life Cycle
Management
ETL/ELT
ET/ELTL
DataArchive
Business
Intelligence
Layer
Data WarehouseDB Files
Enterprise Data
Hub
EDH
Big Data Edition
Man
aged Op
en
Se
cure
Gove
rned
 Informatica ILM Archive
to archive data on Hadoop
with compression
 Informatica Data
Services to build
virtualized data objects
combining data from
Teradata & Hadoop
 Informatica BigData
edition to execute data
integration
transformations, data
quality rules, profiling,
parsing, and matching all
natively on Hadoop
 Appfluent Visibility to
identify dormant data to be
archived and ETL/ELT
processes to be offloaded
by monitoring data usage
and analyzing activities
 Cloudera’s complete,
tested and widely
deployed open source
distribution of Apache
Hadoop, makes it available
for mainstream adoption
Profile Parse
Cleanse Match
ETL
The information contained in this document is proprietary. ©2014 Capgemini.
All rights reserved. Rightshore®
is a trademark belonging to Capgemini.
About
Capgemini
With more than 130,000 people
in over 40 countries, Capgemini
is one of the world’s foremost
providers of consulting,
technology and outsourcing
services. The Group reported
2013 global revenues of EUR
10.1 billion.
Together with its clients,
Capgemini creates and delivers
business and technology
solutions that fit their needs and
drive the results they want. A
deeply multicultural organization,
Capgemini has developed
its own way of working, the
Collaborative Business
ExperienceTM
, and draws on
Rightshore®
, its worldwide
delivery model.
For further information visit
www.capgemini.com/bim
or contact us at
bim@capgemini.com
MCOS_GI_AH_20140506
POC or prototype
We recommend choosing a low-complexity, high-value POC that can be completed
quickly. The objective is to create a working model of one representative case.
We leverage our relationship with Informatica and Cloudera to conduct POCs at
an optimized cost, and can bundle the total cost into one transaction to avoid the
complexity of dealing with multiple vendors.
Managed implementation
We can deliver your DWO solution out of our Big Data Service Center: a
comprehensive framework leveraging our Rightshore®
approach, and bringing
together big data strategy governance and organization with an industrialized, high-
performance delivery and support capability.
Benefits of DWO using Hadoop
DWO using Hadoop has several key advantages, starting with improved OLTP and
DW performance. Other benefits include:
TCO reduction: DW upgrade costs are avoided and license costs for existing DWs
are reduced because less data needs to be stored there. Commodity hardware and
software is used for archived data, lowering infrastructure costs – and archives can
be compressed by up to 90%.
Better decision support: All the information you need to make decisions is readily
available, together with a full range of analytics. Intelligent archiving combined with
virtualization gives optimum performance. Unstructured and structured data can be
combined for inclusion in any report or dashboard.
Historical data comes to the forefront of analysis: Respond fast to government
and regulatory requests for data.
Increased return on existing investment: You build on the technology you
already have, rather than replacing it - your DW becomes future-proof, with
infrastructure that scales to support the required volumes. Existing ETL skills can be
applied to new technologies.
BI flexibility and stability: A single abstract layer supports any future BI
visualization tools and makes it easy to add information in future. There is no change
to the business definitions and programming logic of the existing BI structure.
Better data security and governance: Access to data is tightly controlled via
rules applied at the semantic layer, and is auditable at a detailed level. Archive files
are available only to authorized users. Security and retention policies can be defined
across the data.
Find out more
Contact us to learn more about how we can provide Data Warehouse Optimization
using Hadoop for your organization to improve OLTP and DW performance.
About Informatica
Informatica Corporation (Nasdaq:INFA)
is the world’s number one independent
provider of data integration software.
Organizations around the world rely on
Informatica to realize their information
potential and drive top business
imperatives. Informatica Vibe, the
industry’s first and only embeddable
virtual data machine (VDM), powers the
unique “Map Once. Deploy Anywhere.”
capabilities of the Informatica Platform.
Worldwide, over 5,000 enterprises
depend on Informatica to fully leverage
their information assets from devices
to mobile to social to big data residing
on premise, in the Cloud and across
social networks.
For more information, visit
www.informatica.com

More Related Content

What's hot

Telecom Provider Case Study | Big Data Adoption | Diyotta
Telecom Provider Case Study | Big Data Adoption | DiyottaTelecom Provider Case Study | Big Data Adoption | Diyotta
Telecom Provider Case Study | Big Data Adoption | Diyottadiyotta
 
Capitalize on Big Data Through Hitachi Innovation
Capitalize on Big Data Through Hitachi InnovationCapitalize on Big Data Through Hitachi Innovation
Capitalize on Big Data Through Hitachi InnovationHitachi Vantara
 
Klarna Tech Talk - Mind the Data!
Klarna Tech Talk - Mind the Data!Klarna Tech Talk - Mind the Data!
Klarna Tech Talk - Mind the Data!Jeffrey T. Pollock
 
The Value of the Modern Data Architecture with Apache Hadoop and Teradata
The Value of the Modern Data Architecture with Apache Hadoop and Teradata The Value of the Modern Data Architecture with Apache Hadoop and Teradata
The Value of the Modern Data Architecture with Apache Hadoop and Teradata Hortonworks
 
Big Data, Hadoop, Hortonworks and Microsoft HDInsight
Big Data, Hadoop, Hortonworks and Microsoft HDInsightBig Data, Hadoop, Hortonworks and Microsoft HDInsight
Big Data, Hadoop, Hortonworks and Microsoft HDInsightHortonworks
 
Hitachi Cloud Solutions Profile
Hitachi Cloud Solutions Profile Hitachi Cloud Solutions Profile
Hitachi Cloud Solutions Profile Hitachi Vantara
 
Big data and apache hadoop adoption
Big data and apache hadoop adoptionBig data and apache hadoop adoption
Big data and apache hadoop adoptionfaizrashid1995
 
Big Data at Oracle - Strata 2015 San Jose
Big Data at Oracle - Strata 2015 San JoseBig Data at Oracle - Strata 2015 San Jose
Big Data at Oracle - Strata 2015 San JoseJeffrey T. Pollock
 
Unifying Big Data Integration | Diyotta India
Unifying Big Data Integration | Diyotta IndiaUnifying Big Data Integration | Diyotta India
Unifying Big Data Integration | Diyotta Indiadiyotta
 
10 Amazing Things To Do With a Hadoop-Based Data Lake
10 Amazing Things To Do With a Hadoop-Based Data Lake10 Amazing Things To Do With a Hadoop-Based Data Lake
10 Amazing Things To Do With a Hadoop-Based Data LakeVMware Tanzu
 
Moving Health Care Analytics to Hadoop to Build a Better Predictive Model
Moving Health Care Analytics to Hadoop to Build a Better Predictive ModelMoving Health Care Analytics to Hadoop to Build a Better Predictive Model
Moving Health Care Analytics to Hadoop to Build a Better Predictive ModelDataWorks Summit
 
Hitachi white-paper-storage-virtualization
Hitachi white-paper-storage-virtualizationHitachi white-paper-storage-virtualization
Hitachi white-paper-storage-virtualizationHitachi Vantara
 
Transform You Business with Big Data and Hortonworks
Transform You Business with Big Data and HortonworksTransform You Business with Big Data and Hortonworks
Transform You Business with Big Data and HortonworksHortonworks
 
Building the Enterprise Data Lake: A look at architecture
Building the Enterprise Data Lake: A look at architectureBuilding the Enterprise Data Lake: A look at architecture
Building the Enterprise Data Lake: A look at architecturemark madsen
 
Accelerating the Value of Big Data Analytics for P&C Insurers with Hortonwork...
Accelerating the Value of Big Data Analytics for P&C Insurers with Hortonwork...Accelerating the Value of Big Data Analytics for P&C Insurers with Hortonwork...
Accelerating the Value of Big Data Analytics for P&C Insurers with Hortonwork...Hortonworks
 
Hortonworks and Clarity Solution Group
Hortonworks and Clarity Solution Group Hortonworks and Clarity Solution Group
Hortonworks and Clarity Solution Group Hortonworks
 
Apache Hadoop and its role in Big Data architecture - Himanshu Bari
Apache Hadoop and its role in Big Data architecture - Himanshu BariApache Hadoop and its role in Big Data architecture - Himanshu Bari
Apache Hadoop and its role in Big Data architecture - Himanshu Barijaxconf
 

What's hot (20)

Telecom Provider Case Study | Big Data Adoption | Diyotta
Telecom Provider Case Study | Big Data Adoption | DiyottaTelecom Provider Case Study | Big Data Adoption | Diyotta
Telecom Provider Case Study | Big Data Adoption | Diyotta
 
Capitalize on Big Data Through Hitachi Innovation
Capitalize on Big Data Through Hitachi InnovationCapitalize on Big Data Through Hitachi Innovation
Capitalize on Big Data Through Hitachi Innovation
 
Klarna Tech Talk - Mind the Data!
Klarna Tech Talk - Mind the Data!Klarna Tech Talk - Mind the Data!
Klarna Tech Talk - Mind the Data!
 
Oracle's BigData solutions
Oracle's BigData solutionsOracle's BigData solutions
Oracle's BigData solutions
 
The Value of the Modern Data Architecture with Apache Hadoop and Teradata
The Value of the Modern Data Architecture with Apache Hadoop and Teradata The Value of the Modern Data Architecture with Apache Hadoop and Teradata
The Value of the Modern Data Architecture with Apache Hadoop and Teradata
 
Big Data, Hadoop, Hortonworks and Microsoft HDInsight
Big Data, Hadoop, Hortonworks and Microsoft HDInsightBig Data, Hadoop, Hortonworks and Microsoft HDInsight
Big Data, Hadoop, Hortonworks and Microsoft HDInsight
 
Hitachi Cloud Solutions Profile
Hitachi Cloud Solutions Profile Hitachi Cloud Solutions Profile
Hitachi Cloud Solutions Profile
 
Big data and apache hadoop adoption
Big data and apache hadoop adoptionBig data and apache hadoop adoption
Big data and apache hadoop adoption
 
Big Data at Oracle - Strata 2015 San Jose
Big Data at Oracle - Strata 2015 San JoseBig Data at Oracle - Strata 2015 San Jose
Big Data at Oracle - Strata 2015 San Jose
 
Unifying Big Data Integration | Diyotta India
Unifying Big Data Integration | Diyotta IndiaUnifying Big Data Integration | Diyotta India
Unifying Big Data Integration | Diyotta India
 
10 Amazing Things To Do With a Hadoop-Based Data Lake
10 Amazing Things To Do With a Hadoop-Based Data Lake10 Amazing Things To Do With a Hadoop-Based Data Lake
10 Amazing Things To Do With a Hadoop-Based Data Lake
 
Moving Health Care Analytics to Hadoop to Build a Better Predictive Model
Moving Health Care Analytics to Hadoop to Build a Better Predictive ModelMoving Health Care Analytics to Hadoop to Build a Better Predictive Model
Moving Health Care Analytics to Hadoop to Build a Better Predictive Model
 
Hitachi white-paper-storage-virtualization
Hitachi white-paper-storage-virtualizationHitachi white-paper-storage-virtualization
Hitachi white-paper-storage-virtualization
 
Transform You Business with Big Data and Hortonworks
Transform You Business with Big Data and HortonworksTransform You Business with Big Data and Hortonworks
Transform You Business with Big Data and Hortonworks
 
Building the Enterprise Data Lake: A look at architecture
Building the Enterprise Data Lake: A look at architectureBuilding the Enterprise Data Lake: A look at architecture
Building the Enterprise Data Lake: A look at architecture
 
Accelerating the Value of Big Data Analytics for P&C Insurers with Hortonwork...
Accelerating the Value of Big Data Analytics for P&C Insurers with Hortonwork...Accelerating the Value of Big Data Analytics for P&C Insurers with Hortonwork...
Accelerating the Value of Big Data Analytics for P&C Insurers with Hortonwork...
 
Hortonworks and Clarity Solution Group
Hortonworks and Clarity Solution Group Hortonworks and Clarity Solution Group
Hortonworks and Clarity Solution Group
 
Hybrid Cloud Strategy for Big Data and Analytics
Hybrid Cloud Strategy for Big Data and Analytics Hybrid Cloud Strategy for Big Data and Analytics
Hybrid Cloud Strategy for Big Data and Analytics
 
451 Research Impact Report
451 Research Impact Report451 Research Impact Report
451 Research Impact Report
 
Apache Hadoop and its role in Big Data architecture - Himanshu Bari
Apache Hadoop and its role in Big Data architecture - Himanshu BariApache Hadoop and its role in Big Data architecture - Himanshu Bari
Apache Hadoop and its role in Big Data architecture - Himanshu Bari
 

Similar to DWO using Hadoop Optimizes Data Warehouse Performance

Appfluent and Cloudera Solution Brief
Appfluent and Cloudera Solution BriefAppfluent and Cloudera Solution Brief
Appfluent and Cloudera Solution BriefAppfluent Technology
 
MapR Data Hub White Paper V2 2014
MapR Data Hub White Paper V2 2014MapR Data Hub White Paper V2 2014
MapR Data Hub White Paper V2 2014Erni Susanti
 
How pig and hadoop fit in data processing architecture
How pig and hadoop fit in data processing architectureHow pig and hadoop fit in data processing architecture
How pig and hadoop fit in data processing architectureKovid Academy
 
Building a Big Data Analytics Platform- Impetus White Paper
Building a Big Data Analytics Platform- Impetus White PaperBuilding a Big Data Analytics Platform- Impetus White Paper
Building a Big Data Analytics Platform- Impetus White PaperImpetus Technologies
 
Big data an elephant business opportunities
Big data an elephant   business opportunitiesBig data an elephant   business opportunities
Big data an elephant business opportunitiesBigdata Meetup Kochi
 
Is your big data journey stalling? Take the Leap with Capgemini and Cloudera
Is your big data journey stalling? Take the Leap with Capgemini and ClouderaIs your big data journey stalling? Take the Leap with Capgemini and Cloudera
Is your big data journey stalling? Take the Leap with Capgemini and ClouderaCloudera, Inc.
 
Enterprise Archiving with Apache Hadoop Featuring the 2015 Gartner Magic Quad...
Enterprise Archiving with Apache Hadoop Featuring the 2015 Gartner Magic Quad...Enterprise Archiving with Apache Hadoop Featuring the 2015 Gartner Magic Quad...
Enterprise Archiving with Apache Hadoop Featuring the 2015 Gartner Magic Quad...LindaWatson19
 
Accenture-Cloud-Data-Migration-POV-Final.pdf
Accenture-Cloud-Data-Migration-POV-Final.pdfAccenture-Cloud-Data-Migration-POV-Final.pdf
Accenture-Cloud-Data-Migration-POV-Final.pdfRajvir Kaushal
 
Big Data Tools: A Deep Dive into Essential Tools
Big Data Tools: A Deep Dive into Essential ToolsBig Data Tools: A Deep Dive into Essential Tools
Big Data Tools: A Deep Dive into Essential ToolsFredReynolds2
 
Hadoop: Extending your Data Warehouse
Hadoop: Extending your Data WarehouseHadoop: Extending your Data Warehouse
Hadoop: Extending your Data WarehouseCloudera, Inc.
 
Hd insight overview
Hd insight overviewHd insight overview
Hd insight overviewvhrocca
 
The New Enterprise Blueprint featuring the Gartner Magic Quadrant
The New Enterprise Blueprint featuring the Gartner Magic QuadrantThe New Enterprise Blueprint featuring the Gartner Magic Quadrant
The New Enterprise Blueprint featuring the Gartner Magic QuadrantLindaWatson19
 
Traditional data word
Traditional data wordTraditional data word
Traditional data wordorcoxsm
 
2015 02 12 talend hortonworks webinar challenges to hadoop adoption
2015 02 12 talend hortonworks webinar challenges to hadoop adoption2015 02 12 talend hortonworks webinar challenges to hadoop adoption
2015 02 12 talend hortonworks webinar challenges to hadoop adoptionHortonworks
 
VILT - Archiving and Decommissioning with OpenText InfoArchive
VILT - Archiving and Decommissioning with OpenText InfoArchiveVILT - Archiving and Decommissioning with OpenText InfoArchive
VILT - Archiving and Decommissioning with OpenText InfoArchiveVILT
 
MetaScale Case Study: Hadoop Extends DataStage ETL Capacity
MetaScale Case Study: Hadoop Extends DataStage ETL CapacityMetaScale Case Study: Hadoop Extends DataStage ETL Capacity
MetaScale Case Study: Hadoop Extends DataStage ETL CapacityMetaScale
 
Cisco Big Data Warehouse Expansion Featuring MapR Distribution
Cisco Big Data Warehouse Expansion Featuring MapR DistributionCisco Big Data Warehouse Expansion Featuring MapR Distribution
Cisco Big Data Warehouse Expansion Featuring MapR DistributionAppfluent Technology
 
6 enriching your data warehouse with big data and hadoop
6 enriching your data warehouse with big data and hadoop6 enriching your data warehouse with big data and hadoop
6 enriching your data warehouse with big data and hadoopDr. Wilfred Lin (Ph.D.)
 

Similar to DWO using Hadoop Optimizes Data Warehouse Performance (20)

Appfluent and Cloudera Solution Brief
Appfluent and Cloudera Solution BriefAppfluent and Cloudera Solution Brief
Appfluent and Cloudera Solution Brief
 
MapR Data Hub White Paper V2 2014
MapR Data Hub White Paper V2 2014MapR Data Hub White Paper V2 2014
MapR Data Hub White Paper V2 2014
 
Benefits of a data lake
Benefits of a data lake Benefits of a data lake
Benefits of a data lake
 
How pig and hadoop fit in data processing architecture
How pig and hadoop fit in data processing architectureHow pig and hadoop fit in data processing architecture
How pig and hadoop fit in data processing architecture
 
Building a Big Data Analytics Platform- Impetus White Paper
Building a Big Data Analytics Platform- Impetus White PaperBuilding a Big Data Analytics Platform- Impetus White Paper
Building a Big Data Analytics Platform- Impetus White Paper
 
Big data an elephant business opportunities
Big data an elephant   business opportunitiesBig data an elephant   business opportunities
Big data an elephant business opportunities
 
Is your big data journey stalling? Take the Leap with Capgemini and Cloudera
Is your big data journey stalling? Take the Leap with Capgemini and ClouderaIs your big data journey stalling? Take the Leap with Capgemini and Cloudera
Is your big data journey stalling? Take the Leap with Capgemini and Cloudera
 
Enterprise Archiving with Apache Hadoop Featuring the 2015 Gartner Magic Quad...
Enterprise Archiving with Apache Hadoop Featuring the 2015 Gartner Magic Quad...Enterprise Archiving with Apache Hadoop Featuring the 2015 Gartner Magic Quad...
Enterprise Archiving with Apache Hadoop Featuring the 2015 Gartner Magic Quad...
 
Accenture-Cloud-Data-Migration-POV-Final.pdf
Accenture-Cloud-Data-Migration-POV-Final.pdfAccenture-Cloud-Data-Migration-POV-Final.pdf
Accenture-Cloud-Data-Migration-POV-Final.pdf
 
Big Data Tools: A Deep Dive into Essential Tools
Big Data Tools: A Deep Dive into Essential ToolsBig Data Tools: A Deep Dive into Essential Tools
Big Data Tools: A Deep Dive into Essential Tools
 
Hadoop: Extending your Data Warehouse
Hadoop: Extending your Data WarehouseHadoop: Extending your Data Warehouse
Hadoop: Extending your Data Warehouse
 
Hd insight overview
Hd insight overviewHd insight overview
Hd insight overview
 
The New Enterprise Blueprint featuring the Gartner Magic Quadrant
The New Enterprise Blueprint featuring the Gartner Magic QuadrantThe New Enterprise Blueprint featuring the Gartner Magic Quadrant
The New Enterprise Blueprint featuring the Gartner Magic Quadrant
 
Traditional data word
Traditional data wordTraditional data word
Traditional data word
 
2015 02 12 talend hortonworks webinar challenges to hadoop adoption
2015 02 12 talend hortonworks webinar challenges to hadoop adoption2015 02 12 talend hortonworks webinar challenges to hadoop adoption
2015 02 12 talend hortonworks webinar challenges to hadoop adoption
 
VILT - Archiving and Decommissioning with OpenText InfoArchive
VILT - Archiving and Decommissioning with OpenText InfoArchiveVILT - Archiving and Decommissioning with OpenText InfoArchive
VILT - Archiving and Decommissioning with OpenText InfoArchive
 
Big data rmoug
Big data rmougBig data rmoug
Big data rmoug
 
MetaScale Case Study: Hadoop Extends DataStage ETL Capacity
MetaScale Case Study: Hadoop Extends DataStage ETL CapacityMetaScale Case Study: Hadoop Extends DataStage ETL Capacity
MetaScale Case Study: Hadoop Extends DataStage ETL Capacity
 
Cisco Big Data Warehouse Expansion Featuring MapR Distribution
Cisco Big Data Warehouse Expansion Featuring MapR DistributionCisco Big Data Warehouse Expansion Featuring MapR Distribution
Cisco Big Data Warehouse Expansion Featuring MapR Distribution
 
6 enriching your data warehouse with big data and hadoop
6 enriching your data warehouse with big data and hadoop6 enriching your data warehouse with big data and hadoop
6 enriching your data warehouse with big data and hadoop
 

Recently uploaded

8447779800, Low rate Call girls in Tughlakabad Delhi NCR
8447779800, Low rate Call girls in Tughlakabad Delhi NCR8447779800, Low rate Call girls in Tughlakabad Delhi NCR
8447779800, Low rate Call girls in Tughlakabad Delhi NCRashishs7044
 
Lean: From Theory to Practice — One City’s (and Library’s) Lean Story… Abridged
Lean: From Theory to Practice — One City’s (and Library’s) Lean Story… AbridgedLean: From Theory to Practice — One City’s (and Library’s) Lean Story… Abridged
Lean: From Theory to Practice — One City’s (and Library’s) Lean Story… AbridgedKaiNexus
 
Youth Involvement in an Innovative Coconut Value Chain by Mwalimu Menza
Youth Involvement in an Innovative Coconut Value Chain by Mwalimu MenzaYouth Involvement in an Innovative Coconut Value Chain by Mwalimu Menza
Youth Involvement in an Innovative Coconut Value Chain by Mwalimu Menzaictsugar
 
The CMO Survey - Highlights and Insights Report - Spring 2024
The CMO Survey - Highlights and Insights Report - Spring 2024The CMO Survey - Highlights and Insights Report - Spring 2024
The CMO Survey - Highlights and Insights Report - Spring 2024christinemoorman
 
MAHA Global and IPR: Do Actions Speak Louder Than Words?
MAHA Global and IPR: Do Actions Speak Louder Than Words?MAHA Global and IPR: Do Actions Speak Louder Than Words?
MAHA Global and IPR: Do Actions Speak Louder Than Words?Olivia Kresic
 
Case study on tata clothing brand zudio in detail
Case study on tata clothing brand zudio in detailCase study on tata clothing brand zudio in detail
Case study on tata clothing brand zudio in detailAriel592675
 
India Consumer 2024 Redacted Sample Report
India Consumer 2024 Redacted Sample ReportIndia Consumer 2024 Redacted Sample Report
India Consumer 2024 Redacted Sample ReportMintel Group
 
Call Girls In Radisson Blu Hotel New Delhi Paschim Vihar ❤️8860477959 Escorts...
Call Girls In Radisson Blu Hotel New Delhi Paschim Vihar ❤️8860477959 Escorts...Call Girls In Radisson Blu Hotel New Delhi Paschim Vihar ❤️8860477959 Escorts...
Call Girls In Radisson Blu Hotel New Delhi Paschim Vihar ❤️8860477959 Escorts...lizamodels9
 
Organizational Structure Running A Successful Business
Organizational Structure Running A Successful BusinessOrganizational Structure Running A Successful Business
Organizational Structure Running A Successful BusinessSeta Wicaksana
 
Digital Transformation in the PLM domain - distrib.pdf
Digital Transformation in the PLM domain - distrib.pdfDigital Transformation in the PLM domain - distrib.pdf
Digital Transformation in the PLM domain - distrib.pdfJos Voskuil
 
Ten Organizational Design Models to align structure and operations to busines...
Ten Organizational Design Models to align structure and operations to busines...Ten Organizational Design Models to align structure and operations to busines...
Ten Organizational Design Models to align structure and operations to busines...Seta Wicaksana
 
Market Sizes Sample Report - 2024 Edition
Market Sizes Sample Report - 2024 EditionMarket Sizes Sample Report - 2024 Edition
Market Sizes Sample Report - 2024 EditionMintel Group
 
Annual General Meeting Presentation Slides
Annual General Meeting Presentation SlidesAnnual General Meeting Presentation Slides
Annual General Meeting Presentation SlidesKeppelCorporation
 
Intro to BCG's Carbon Emissions Benchmark_vF.pdf
Intro to BCG's Carbon Emissions Benchmark_vF.pdfIntro to BCG's Carbon Emissions Benchmark_vF.pdf
Intro to BCG's Carbon Emissions Benchmark_vF.pdfpollardmorgan
 
Marketplace and Quality Assurance Presentation - Vincent Chirchir
Marketplace and Quality Assurance Presentation - Vincent ChirchirMarketplace and Quality Assurance Presentation - Vincent Chirchir
Marketplace and Quality Assurance Presentation - Vincent Chirchirictsugar
 
Contemporary Economic Issues Facing the Filipino Entrepreneur (1).pptx
Contemporary Economic Issues Facing the Filipino Entrepreneur (1).pptxContemporary Economic Issues Facing the Filipino Entrepreneur (1).pptx
Contemporary Economic Issues Facing the Filipino Entrepreneur (1).pptxMarkAnthonyAurellano
 
8447779800, Low rate Call girls in Saket Delhi NCR
8447779800, Low rate Call girls in Saket Delhi NCR8447779800, Low rate Call girls in Saket Delhi NCR
8447779800, Low rate Call girls in Saket Delhi NCRashishs7044
 
8447779800, Low rate Call girls in Shivaji Enclave Delhi NCR
8447779800, Low rate Call girls in Shivaji Enclave Delhi NCR8447779800, Low rate Call girls in Shivaji Enclave Delhi NCR
8447779800, Low rate Call girls in Shivaji Enclave Delhi NCRashishs7044
 
Global Scenario On Sustainable and Resilient Coconut Industry by Dr. Jelfina...
Global Scenario On Sustainable  and Resilient Coconut Industry by Dr. Jelfina...Global Scenario On Sustainable  and Resilient Coconut Industry by Dr. Jelfina...
Global Scenario On Sustainable and Resilient Coconut Industry by Dr. Jelfina...ictsugar
 
NewBase 19 April 2024 Energy News issue - 1717 by Khaled Al Awadi.pdf
NewBase  19 April  2024  Energy News issue - 1717 by Khaled Al Awadi.pdfNewBase  19 April  2024  Energy News issue - 1717 by Khaled Al Awadi.pdf
NewBase 19 April 2024 Energy News issue - 1717 by Khaled Al Awadi.pdfKhaled Al Awadi
 

Recently uploaded (20)

8447779800, Low rate Call girls in Tughlakabad Delhi NCR
8447779800, Low rate Call girls in Tughlakabad Delhi NCR8447779800, Low rate Call girls in Tughlakabad Delhi NCR
8447779800, Low rate Call girls in Tughlakabad Delhi NCR
 
Lean: From Theory to Practice — One City’s (and Library’s) Lean Story… Abridged
Lean: From Theory to Practice — One City’s (and Library’s) Lean Story… AbridgedLean: From Theory to Practice — One City’s (and Library’s) Lean Story… Abridged
Lean: From Theory to Practice — One City’s (and Library’s) Lean Story… Abridged
 
Youth Involvement in an Innovative Coconut Value Chain by Mwalimu Menza
Youth Involvement in an Innovative Coconut Value Chain by Mwalimu MenzaYouth Involvement in an Innovative Coconut Value Chain by Mwalimu Menza
Youth Involvement in an Innovative Coconut Value Chain by Mwalimu Menza
 
The CMO Survey - Highlights and Insights Report - Spring 2024
The CMO Survey - Highlights and Insights Report - Spring 2024The CMO Survey - Highlights and Insights Report - Spring 2024
The CMO Survey - Highlights and Insights Report - Spring 2024
 
MAHA Global and IPR: Do Actions Speak Louder Than Words?
MAHA Global and IPR: Do Actions Speak Louder Than Words?MAHA Global and IPR: Do Actions Speak Louder Than Words?
MAHA Global and IPR: Do Actions Speak Louder Than Words?
 
Case study on tata clothing brand zudio in detail
Case study on tata clothing brand zudio in detailCase study on tata clothing brand zudio in detail
Case study on tata clothing brand zudio in detail
 
India Consumer 2024 Redacted Sample Report
India Consumer 2024 Redacted Sample ReportIndia Consumer 2024 Redacted Sample Report
India Consumer 2024 Redacted Sample Report
 
Call Girls In Radisson Blu Hotel New Delhi Paschim Vihar ❤️8860477959 Escorts...
Call Girls In Radisson Blu Hotel New Delhi Paschim Vihar ❤️8860477959 Escorts...Call Girls In Radisson Blu Hotel New Delhi Paschim Vihar ❤️8860477959 Escorts...
Call Girls In Radisson Blu Hotel New Delhi Paschim Vihar ❤️8860477959 Escorts...
 
Organizational Structure Running A Successful Business
Organizational Structure Running A Successful BusinessOrganizational Structure Running A Successful Business
Organizational Structure Running A Successful Business
 
Digital Transformation in the PLM domain - distrib.pdf
Digital Transformation in the PLM domain - distrib.pdfDigital Transformation in the PLM domain - distrib.pdf
Digital Transformation in the PLM domain - distrib.pdf
 
Ten Organizational Design Models to align structure and operations to busines...
Ten Organizational Design Models to align structure and operations to busines...Ten Organizational Design Models to align structure and operations to busines...
Ten Organizational Design Models to align structure and operations to busines...
 
Market Sizes Sample Report - 2024 Edition
Market Sizes Sample Report - 2024 EditionMarket Sizes Sample Report - 2024 Edition
Market Sizes Sample Report - 2024 Edition
 
Annual General Meeting Presentation Slides
Annual General Meeting Presentation SlidesAnnual General Meeting Presentation Slides
Annual General Meeting Presentation Slides
 
Intro to BCG's Carbon Emissions Benchmark_vF.pdf
Intro to BCG's Carbon Emissions Benchmark_vF.pdfIntro to BCG's Carbon Emissions Benchmark_vF.pdf
Intro to BCG's Carbon Emissions Benchmark_vF.pdf
 
Marketplace and Quality Assurance Presentation - Vincent Chirchir
Marketplace and Quality Assurance Presentation - Vincent ChirchirMarketplace and Quality Assurance Presentation - Vincent Chirchir
Marketplace and Quality Assurance Presentation - Vincent Chirchir
 
Contemporary Economic Issues Facing the Filipino Entrepreneur (1).pptx
Contemporary Economic Issues Facing the Filipino Entrepreneur (1).pptxContemporary Economic Issues Facing the Filipino Entrepreneur (1).pptx
Contemporary Economic Issues Facing the Filipino Entrepreneur (1).pptx
 
8447779800, Low rate Call girls in Saket Delhi NCR
8447779800, Low rate Call girls in Saket Delhi NCR8447779800, Low rate Call girls in Saket Delhi NCR
8447779800, Low rate Call girls in Saket Delhi NCR
 
8447779800, Low rate Call girls in Shivaji Enclave Delhi NCR
8447779800, Low rate Call girls in Shivaji Enclave Delhi NCR8447779800, Low rate Call girls in Shivaji Enclave Delhi NCR
8447779800, Low rate Call girls in Shivaji Enclave Delhi NCR
 
Global Scenario On Sustainable and Resilient Coconut Industry by Dr. Jelfina...
Global Scenario On Sustainable  and Resilient Coconut Industry by Dr. Jelfina...Global Scenario On Sustainable  and Resilient Coconut Industry by Dr. Jelfina...
Global Scenario On Sustainable and Resilient Coconut Industry by Dr. Jelfina...
 
NewBase 19 April 2024 Energy News issue - 1717 by Khaled Al Awadi.pdf
NewBase  19 April  2024  Energy News issue - 1717 by Khaled Al Awadi.pdfNewBase  19 April  2024  Energy News issue - 1717 by Khaled Al Awadi.pdf
NewBase 19 April 2024 Energy News issue - 1717 by Khaled Al Awadi.pdf
 

DWO using Hadoop Optimizes Data Warehouse Performance

  • 1. Data Warehouse Optimization using Hadoop Today’s escalating data volumes can prevent Online Transaction Processing (OLTP) systems from generating and processing transactions efficiently which can cause Data Warehouse (DW) performance issues when querying data. Total Cost of Ownership (TCO), too, escalates rapidly because of the need for upgrades to DW hardware and for additional licenses since organizations often pay more to store information simply because they might need it. Radically new capabilities are needed to enable DW and OLTP to function cost- effectively in this new environment. First, organizations need to cope with data volumes that traditional DW platforms (whether RDBMSs1 or appliances) were never designed for. Second, they must deal with unstructured, semi-structured and structured data. Big data technologies such as Apache Hadoop excel at managing large volumes of unstructured data. However, this data needs to be pulled together with structured data for analytical purposes. As Big Data technologies and Apache Hadoop are coming into mainstream use, being able to integrate these new technologies with existing legacy Data Warehouse platforms to get the best of both worlds is key. A new solution from Capgemini, implemented in partnership with Informatica, Cloudera and Appfluent, optimizes the ratio between the value of data and storage costs, making it easy to take advantage of new big data technologies. 1 Relational Database Management System
  • 2. Our solution: seamlessly combine the DW with big data technologies In partnership with Informatica, Cloudera and Appfluent, Capgemini has developed an integrated solution that allows OLTP systems and DWs to serve their primary functions efficiently and cost-effectively. Data Warehouse Optimization (DWO) using Hadoop incorporates Cloudera’s highly available massively-parallel processing Hadoop Platform, with minimal effort to integrate it in the existing landscape. Appfluent Visibility is used to identify what data needs to be archived and ETL/ELT2 processes should be offloaded. Informatica Information Lifecycle Management (ILM) offloads onto Hadoop the data and processing that is not needed for day-to-day functions from OLTP and DW systems. When required, archived data is made available to the primary systems in various ways. Front-line services can seamlessly retrieve historical information or use Informatica ILM to restore archived data to production applications. Elements of our solution DWO using Hadoop consists of elements from Informatica, Appfluent and Cloudera; integrated into a single solution delivered by Capgemini. Cloudera provides a complete, tested and widely deployed open source distribution of Apache Hadoop, making it available for mainstream adoption.3 Informatica Big Data Edition helps clients to start using Hadoop quickly to manage and analyze unstructured and semi-structured data. Users are shielded from complexity so the need for Hadoop-specific skills is limited. Informatica BigData edition to create ETL/ELT (including complex transformation, DQ rules, profiling, parsing and matching) framework and push all heavy lifting ETL/ELT processing to Hadoop environment. Informatica ILM Archive and Informatica Data Services (IDS) together help to allocate data optimally between the DW and Hadoop, and then make it available to users in a rapid and seamless manner, wherever it is stored. Appfluent Visibility software applies statistical analysis to usage and activity metrics in order to identify dormant data that is a candidate for archiving by Informatica ILM Archive. Hadoop data is held in a highly compressed form, avoiding the need for a regular compression process. IDS build virtualized data objects that combine data from Hadoop archives and from DWs residing on RDBMSs or appliances. You can get an instant response to your immediate data needs, regardless where the data is stored. 2 ETL: Extract, Transform & Load; ELT: Extract, Load & Transform 3 Please see our brochure Capgemini’s Data Optimization for the Enterprise with Cloudera for more details. http://www.capgemini.com/resources/capgeminis-data-optimization-for-the-enterprise-with- cloudera DWO using Hadoop incorporates elements from Informatica, Appfluent and Cloudera; integrated into a single solution delivered by Capgemini.
  • 3. the way we do itData Warehouse Capgemini expertly delivers DWO using Hadoop – you decide how As a market leader in big data and business information management, we have the experts to guide you plus the techniques to ensure a successful transformation into a high-performing, information-centric business. We have already helped many clients evolve their legacy information architectures to exploit massive data volumes and new data types. Below are a few examples of the ways we have helped clients get started. Strategic Value Assessment (SVA) In a few weeks, Capgemini can help clients build a phased plan for implementing ILM on Hadoop while identifying key opportunities to optimize the information ecosystem across OLTP, DW and other information assets within the organization. The SVA can also help build a business value proposition and budget to help obtain approval and funding. The phased plan often includes an initial proof of concept (POC) or prototyping phase to help the client understand the approach’s effectiveness and test the business case with minimal risk. Figure 1: Seamlessly combine the Data Warehouse with big data technologies Data Sources Data Warehouse Layer Semantic Layer Data Services Life Cycle Management ETL/ELT ET/ELTL DataArchive Business Intelligence Layer Data WarehouseDB Files Enterprise Data Hub EDH Big Data Edition Man aged Op en Se cure Gove rned  Informatica ILM Archive to archive data on Hadoop with compression  Informatica Data Services to build virtualized data objects combining data from Teradata & Hadoop  Informatica BigData edition to execute data integration transformations, data quality rules, profiling, parsing, and matching all natively on Hadoop  Appfluent Visibility to identify dormant data to be archived and ETL/ELT processes to be offloaded by monitoring data usage and analyzing activities  Cloudera’s complete, tested and widely deployed open source distribution of Apache Hadoop, makes it available for mainstream adoption Profile Parse Cleanse Match ETL
  • 4. The information contained in this document is proprietary. ©2014 Capgemini. All rights reserved. Rightshore® is a trademark belonging to Capgemini. About Capgemini With more than 130,000 people in over 40 countries, Capgemini is one of the world’s foremost providers of consulting, technology and outsourcing services. The Group reported 2013 global revenues of EUR 10.1 billion. Together with its clients, Capgemini creates and delivers business and technology solutions that fit their needs and drive the results they want. A deeply multicultural organization, Capgemini has developed its own way of working, the Collaborative Business ExperienceTM , and draws on Rightshore® , its worldwide delivery model. For further information visit www.capgemini.com/bim or contact us at bim@capgemini.com MCOS_GI_AH_20140506 POC or prototype We recommend choosing a low-complexity, high-value POC that can be completed quickly. The objective is to create a working model of one representative case. We leverage our relationship with Informatica and Cloudera to conduct POCs at an optimized cost, and can bundle the total cost into one transaction to avoid the complexity of dealing with multiple vendors. Managed implementation We can deliver your DWO solution out of our Big Data Service Center: a comprehensive framework leveraging our Rightshore® approach, and bringing together big data strategy governance and organization with an industrialized, high- performance delivery and support capability. Benefits of DWO using Hadoop DWO using Hadoop has several key advantages, starting with improved OLTP and DW performance. Other benefits include: TCO reduction: DW upgrade costs are avoided and license costs for existing DWs are reduced because less data needs to be stored there. Commodity hardware and software is used for archived data, lowering infrastructure costs – and archives can be compressed by up to 90%. Better decision support: All the information you need to make decisions is readily available, together with a full range of analytics. Intelligent archiving combined with virtualization gives optimum performance. Unstructured and structured data can be combined for inclusion in any report or dashboard. Historical data comes to the forefront of analysis: Respond fast to government and regulatory requests for data. Increased return on existing investment: You build on the technology you already have, rather than replacing it - your DW becomes future-proof, with infrastructure that scales to support the required volumes. Existing ETL skills can be applied to new technologies. BI flexibility and stability: A single abstract layer supports any future BI visualization tools and makes it easy to add information in future. There is no change to the business definitions and programming logic of the existing BI structure. Better data security and governance: Access to data is tightly controlled via rules applied at the semantic layer, and is auditable at a detailed level. Archive files are available only to authorized users. Security and retention policies can be defined across the data. Find out more Contact us to learn more about how we can provide Data Warehouse Optimization using Hadoop for your organization to improve OLTP and DW performance. About Informatica Informatica Corporation (Nasdaq:INFA) is the world’s number one independent provider of data integration software. Organizations around the world rely on Informatica to realize their information potential and drive top business imperatives. Informatica Vibe, the industry’s first and only embeddable virtual data machine (VDM), powers the unique “Map Once. Deploy Anywhere.” capabilities of the Informatica Platform. Worldwide, over 5,000 enterprises depend on Informatica to fully leverage their information assets from devices to mobile to social to big data residing on premise, in the Cloud and across social networks. For more information, visit www.informatica.com