White Paper

Guide to Modernize Your Enterprise Data Warehouse
How to Migrate to a Hadoop-based Big Data Lake

Does This Sound Like You?
It is now a well-documented realization among Fortune 500 companies and high-tech start-ups that Big Data analytics can transform the enterprise, and that the organizations leading the way will drive the most value.

But where does that value come from, and how is it sustained? Is it just from the data itself? No.

The real value of Big Data comes not from the data in its raw form, but from its analysis: the insights derived, the products created, and the services that emerge.
Big Data allows for dramatic shifts in enterprise-level decision making and product/service innovation, but to reap its real rewards, organizations must keep pace at every level, from management approaches to technology and infrastructure.

As your business demands more and more from your data, chances are strong that your existing data warehouse is near capacity. In fact, according to Gartner, 70% of all data warehouses are straining the limits of their capacity and performance levels. If this is true for you, it is time to modernize your data warehouse environment.
This paper addresses the need to modernize today’s data warehouse
environment and outlines best practices and approaches.
Motivation for Modernization
Enterprise data warehouses were originally created for exploration and analysis, but with the arrival of Big Data, they have frequently become archival data repositories. Worse, for many organizations, getting data into them requires expensive, time-consuming extract, transform, and load (ETL) work.
Make a Move to a Modern Data Architecture
The standard analytics environment at the majority of enterprise-level
companies includes the operational systems that serve as the sources for
data; a data warehouse or group of associated data marts which house and
sometimes integrate the data for a range of analysis functions; and a set of
business intelligence and analytics tools that enable insight discovery and
decision making from the use of queries, visualization, dashboards, and data
mining.
Most big companies have invested millions of dollars in their analytics ecosystems. This includes hardware platforms, database systems, ETL software, analytics tools, BI dashboards, and middleware, as well as storage systems, all with their attendant maintenance contracts and software upgrades.

Ideally, these environments have given enterprises the power to understand their customers and, as a result, also helped them streamline their business, optimize their products, and enhance their brands.
However, in the worst-case scenario, current data warehouse infrastructure cannot affordably scale to deliver on the full promise and value of Big Data.
Enterprises today have data warehouse modernization programs in place to
find a way to combine the best of their legacy data warehouse with the new
power of Big Data technology to create a best-of-both-worlds environment.
Impetus can help
Our experienced team of experts delivers a repeatable methodology and a customizable range of services, including assessment and planning, implementation, and data quality validation, to support your data warehouse modernization program.
If you need to modernize your data architecture, your foundation will no doubt
begin with Hadoop. It is as much a must-have as it is a game-changer from an
IT and a business perspective.
Hadoop is a cost-effective, scale-out storage system with parallel computing
and analytical capability. It simplifies the procurement and storage of diverse
data sources, whether structured, semi-structured (e.g., sensor feeds,
machine data), or unstructured (e.g., web logs, social media, image, video,
audio).
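Because Hadoop accepts data in its native form, a few lines of Spark code are enough to land structured and semi-structured sources side by side. The following is a minimal sketch, assuming a cluster with Spark and HDFS available; the paths and file layouts are hypothetical:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("diverse-ingest").getOrCreate()

# Structured source: CSV exports with a header row (hypothetical path).
orders = spark.read.option("header", True).csv("hdfs:///landing/orders/")

# Semi-structured source: JSON web logs; the schema is inferred, not predefined.
weblogs = spark.read.json("hdfs:///landing/weblogs/")

# Store both as columnar Parquet in the lake for later analysis.
orders.write.mode("append").parquet("hdfs:///lake/raw/orders")
weblogs.write.mode("append").parquet("hdfs:///lake/raw/weblogs")
```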
It has become the framework of choice for accelerating time-to-insight and reducing the overall costs of managing data. Hadoop will play a positive and profound role in your long-term data storage, management, and analysis capabilities, and in realizing the critical value of your data to sustain competitiveness.
While the Hadoop ecosystem offers powerful capabilities and virtually unlimited horizontal scalability, it does not provide the complete set of functionality you need for enterprise-level Big Data analysis.
Filling these gaps requires complex manual coding by large teams of engineers and analysts, plus sizable support staff. This slows Hadoop adoption and can frustrate management teams who are eager to derive and deliver results.
Impetus can help
Impetus offers a comprehensive, end-to-end Data Warehouse Workload Migration (WM) solution that allows you to identify and safely migrate data, ETL processing, and large-scale analytics from the enterprise data warehouse (EDW) to a Hadoop-based Big Data warehouse.

Furthermore, WM not only moves schema, data, views, and so on seamlessly; it also transforms procedural language scripts and migrates complete Role-Based Access Control (RBAC) settings and reports. This ensures that you reap the benefits of modern Big Data warehousing while protecting and reusing your investments in existing traditional RDBMS and other information infrastructure.
Implementing the Data Lake

Adopting Hadoop involves introducing a Data Lake into your analytics ecosystem.
The Data Lake can serve as your organization’s central data repository. What makes the Data Lake a unique and differentiated repository framework is its ability to unify and connect your data. It lets you access your entire body of data simultaneously, unleashing the true power of Big Data: correlated, collaborative insights and analysis. And because the repository imposes no fixed structure, you can run whatever need-based analyses the business requires.
While there are many purposes it can serve, such as feeding both your
production and sandbox environments, the first step and most immediate
opportunity is often the off-loading of the ETL (extract, transform, and load)
routines from the traditional data warehouse.
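To make that first step concrete, here is a minimal sketch of an ETL routine off-loaded from the warehouse onto the cluster. It assumes Spark with a suitable JDBC driver on the classpath; the connection URL, credentials, and table names are hypothetical:

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("etl-offload").getOrCreate()

# Extract: pull a staging table from the legacy EDW over JDBC (hypothetical URL).
raw = (spark.read.format("jdbc")
       .option("url", "jdbc:oracle:thin:@edw-host:1521/EDW")
       .option("dbtable", "STG_SALES")
       .option("user", "etl_user")
       .option("password", "secret")  # placeholder credential
       .load())

# Transform: the cleansing and aggregation that used to burn warehouse cycles.
daily = (raw.withColumn("sale_date", F.to_date("sale_ts"))
            .groupBy("sale_date", "region")
            .agg(F.sum("amount").alias("total_amount")))

# Load: write the result to the lake, freeing the EDW for queries and reporting.
daily.write.mode("overwrite").parquet("hdfs:///lake/curated/daily_sales")
```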
Building a robust Data Lake is a gradual process. With the right tools, a clearly planned platform, and a strong, uniform vision that includes innovation around advanced analytics, your organization can architect an integrated, rationalized, and rigorous Data Lake repository.
Impetus can help
We specialize in modernizing the data warehouse and implementing data
lakes. We have experience with every stage of the Big Data transformation
curve. We enable you to:
• Work with unstructured data.
• Facilitate democratized data access.
• Apply Machine Learning algorithms to enrich data quality.
• Contain costs while continuing to do more with the data.
• Ensure that you do not end up in a data swamp.
Four Steps to Building a Data Lake
Step 1: Acquire & Transform Data at Scale
This first stage involves putting the architecture together and learning to
handle and ingest data at scale. At this stage, the analytics consist of simple
transformations; however, it’s an important step in discovering how to make
Hadoop work for your organization.
Step 2: Focus on Analysis
Now you’re ready to focus on enhancing data analysis and interpretation. To
fully leverage the Data Lake, you will need to use various tools and
frameworks to begin combining and integrating the EDW and the Data Lake.
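As a hedged sketch of what that combination can look like in practice, the snippet below joins a fact table already off-loaded to the lake with a dimension still resident in the EDW; the JDBC details and table names are illustrative, not prescriptive:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("edw-lake-join").getOrCreate()

# Fact data that already lives in the Data Lake.
sales = spark.read.parquet("hdfs:///lake/curated/daily_sales")

# A dimension still served by the EDW, fetched over JDBC (hypothetical URL).
regions = (spark.read.format("jdbc")
           .option("url", "jdbc:teradata://edw-host/DATABASE=DW")
           .option("dbtable", "DIM_REGION")
           .load())

# Combine both worlds in a single analysis.
enriched = sales.join(regions, on="region", how="left")
enriched.groupBy("region_name").sum("total_amount").show()
```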
Step 3: Collaborate
This is where you will start to witness a seamless synergy between the EDW
and the Hadoop-based Data Lake. The strengths of each architecture will
begin to make themselves visible in your organization as this porous, all-encompassing data pool allows analytics and intelligence to flow freely
across your enterprise.
Step 4: Unify
In this last stage, you reach maturity, tying together enterprise capabilities
and large-scale unification from information governance, compliance,
security, and auditing to the management of metadata and information
lifecycle capabilities.
Impetus can help
Workload Migration includes an auto-recommendation engine that enables intelligent migration by suggesting offload-able parameters and metrics. This helps you optimize the schema and form the data lake effectively, with recommendations that range from clustering, partitioning, and splitting the schema and data, to offload-able tables and queries, optimization parameters, and the choice of query engine.
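Acting on such a recommendation usually means creating the target table with the suggested layout before offloading into it. Here is a minimal sketch using Spark SQL with Hive support, where the table and column names are illustrative:

```python
from pyspark.sql import SparkSession

spark = (SparkSession.builder.appName("schema-opt")
         .enableHiveSupport().getOrCreate())

# Partition by date to prune scans; bucket (cluster) by customer to speed joins.
spark.sql("""
    CREATE TABLE IF NOT EXISTS lake.sales_optimized (
        customer_id BIGINT,
        amount      DECIMAL(12,2)
    )
    PARTITIONED BY (sale_date DATE)
    CLUSTERED BY (customer_id) INTO 32 BUCKETS
    STORED AS PARQUET
""")
```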
Challenges in Migrating to the Data Lake

Setting up a Hadoop-based Data Lake can be challenging for organizations that do not have experience migrating Big Data. Organizations often encounter some of the following challenges:
• Identifying which data sources to offload
• Data validation and quality checks
• Issues with SQL compatibility
• Lack of available user defined functions in Hadoop libraries
• Lack of procedural support
• Workflows locked in proprietary data integration tools
• The high costs and effort of migration
• Exception handling
• Lack of unified view and dashboard to offload data
• Governance controls on migration system and data
The Impetus Data Warehouse Workload Migration Tool

What it does

The Impetus Data Warehouse Workload Migration tool does the following:

• Ingests data rapidly via our fast, fault-tolerant, parallel data ingestion component.
• Transforms SQL and procedural SQL from RDBMS, MPP, and other databases into compatible HQL and Spark SQL queries, using our foundational, intelligent transformation engine (see the sketch after this list).
• Provides a smart user interface that allows you to effortlessly orchestrate migration pipelines in just a few clicks.
• Integrates with your firm’s LDAP to allow single sign-on for your users.
• Delivers rapid response times and performance you can count on through our integrated cache.
• Tracks all metadata in source and target data stores.
• Provides strict governance controls, including access, roles, and security, that can be built into the migration process to keep your data safe.
• Caters to a multitude of data sources to bring data in seamlessly and safely after data validation and quality checks.
• Runs checks and balances on data migration using our library of data quality and data validation algorithms, available as operators.
• Offloads Teradata, SQL Server, and DB2 views easily.
• Executes migration pipelines, monitors them for various metrics and health checks, and helps the admin stop or resume any pipeline at any point using our job processing engine.
• Deploys and monitors components in real time using our automated cluster management and monitoring utility.
• Shows comprehensive stage-wise reports for migration, transformation, registration, and execution.
• Assesses workloads automatically, with recommendations on a number of parameters for offloading (Intelligent Migration).
• Provides seamless connectivity from BI tools like Tableau, QlikView, etc., allowing you to easily run Teradata or Oracle reports while migrating your data.
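To illustrate the kind of rewrite such a transformation engine performs, consider a Teradata-style query that uses QUALIFY, a clause HiveQL and Spark SQL lack. The sketch below shows one way to express it as a window function plus a filter; it is an illustrative example, not the tool’s actual output, and assumes a sales table is registered in the metastore:

```python
from pyspark.sql import SparkSession

spark = (SparkSession.builder.appName("sql-translation")
         .enableHiveSupport().getOrCreate())

# Teradata source, for reference:
#   SELECT region, amount
#   FROM sales
#   QUALIFY ROW_NUMBER() OVER (PARTITION BY region ORDER BY amount DESC) = 1;

# Spark SQL rewrite: push the window function into a subquery, then filter.
top_sale_per_region = spark.sql("""
    SELECT region, amount
    FROM (
        SELECT region, amount,
               ROW_NUMBER() OVER (PARTITION BY region ORDER BY amount DESC) AS rn
        FROM sales
    ) ranked
    WHERE rn = 1
""")
top_sale_per_region.show()
```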
Impetus can help
Impetus Workload Migration provides an automated migration toolset
consisting of utilities that our team of experts or your in-house staff can use
to automate the migration and conversion of data for execution in the
Hadoop environment.
It also allows you to run data quality functions to standardize, cleanse and
de-dupe data. You can re-upload the processed data back to the source
EDW for reporting purposes if required.
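Here is a minimal sketch of what standardize, cleanse, and de-dupe steps can look like on the cluster; the column names and paths are hypothetical, and in the toolset these functions ship as pre-built operators:

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("dq-cleanse").getOrCreate()
customers = spark.read.parquet("hdfs:///lake/raw/customers")

cleansed = (customers
    # Standardize: trim stray whitespace and normalize case.
    .withColumn("email", F.lower(F.trim("email")))
    .withColumn("name", F.initcap(F.trim("name")))
    # Cleanse: drop records that lack a usable key.
    .dropna(subset=["email"])
    # De-dupe: keep one record per normalized email address.
    .dropDuplicates(["email"]))

cleansed.write.mode("overwrite").parquet("hdfs:///lake/curated/customers")
```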
We provide pre-built conversion logic for Teradata, Netezza, Oracle, Microsoft SQL Server, and IBM DB2 source data stores. Additionally, Workload Migration includes a library of advanced machine learning algorithms for solving difficult data quality challenges.
How it helps you

The Impetus Data Warehouse Workload Migration tool makes migrating to a modern warehouse architecture a goal within easy reach. Our proven tools and methodologies and our experienced team of Big Data experts can help you do the following:

• Accelerate offloading time
• Save 50%-80% of labor costs compared to manual offloading
• Automate assessment and receive expert recommendations for offloading business-critical data
• Minimize data quality risk using our full library of data validation and quality checks, as well as our advanced monitoring and metrics mechanisms
• Optimize performance with advanced partitioning and clustering features
• Accelerate parallel and SQL processing using Hadoop, along with streaming ETL options
• Maximize existing SQL and stored procedure investments and reuse of tools
• Reduce Hadoop migration project risks through the use of proven best practices and automated quality assurance checks for data and logic
Ready to Modernize?

To learn more about our workload migration solution or how we can help you on your data warehouse modernization journey, visit www.impetus.com or write to us at bigdata@impetus.com.

Impetus is focused on creating big business impact through Big Data solutions for Fortune 1000 enterprises across multiple verticals. The company brings together a unique mix of software products, consulting services, Data Science capabilities, and technology expertise. It offers full life-cycle services for Big Data implementations and real-time streaming analytics, including technology strategy, solution architecture, proof of concept, production implementation, and ongoing support to its clients. To learn more, visit www.impetus.com or write to us at inquiry@impetus.com.

© 2016 Impetus Technologies, Inc. All rights reserved. Product and company names mentioned herein may be trademarks of their respective companies. Feb 2016