TDWI CHECKLIST REPORT

Seven Best Practices for Adopting Data Warehouse Automation

By David Loshin

Sponsored by TimeXtender
TABLE OF CONTENTS

FOREWORD
NUMBER ONE: Analyze the cycle time for satisfying business requests
NUMBER TWO: Devise a value model for comparing data warehousing alternatives
NUMBER THREE: Architect a hybrid environment
NUMBER FOUR: Be agile in ingesting new data sources
NUMBER FIVE: Train data engineers to collaborate with business users
NUMBER SIX: Support business self-service
NUMBER SEVEN: Rethink the role of the BI competency center
CONSIDERATIONS: Developing a feasible integration plan for data warehouse automation
ABOUT OUR SPONSOR
ABOUT THE AUTHOR
ABOUT TDWI RESEARCH
ABOUT TDWI CHECKLIST REPORTS
© 2016 by TDWI (The Data Warehousing Institute™), a division of 1105 Media, Inc. All rights reserved. Reproductions in whole or in part are prohibited except by written permission. E-mail requests or feedback to info@tdwi.org. Product and company names mentioned herein may be trademarks and/or registered trademarks of their respective companies.
JANUARY 2016
555 S Renton Village Place, Ste. 700
Renton, WA 98057-3295
T	 425.277.9126
F	 425.687.2842
E	 info@tdwi.org
tdwi.org
FOREWORD
The word agile conveys a number of meanings. Its simple definition
as an adjective is “quick and well-coordinated,” but the term has
taken on additional meaning in the context of system design and
development, largely in terms of rapid development, a closer
partnership between information technology (IT) and the business,
and teamwork on short-term objectives that build toward solving
more complex business challenges.
Applying agile development methodologies to data warehousing
promises a number of benefits: quicker development and movement
into production, faster time to value, reduced start-up and overhead
costs, and simplified access for business users. Yet there are
prerequisites to transitioning to an agile approach for data
warehousing:

• Evaluate the existing environment to identify the best opportunities for achieving the benefits of adoption

• Establish good practices for leveraging more agile technologies where possible

• Identify the right technologies to facilitate that transition
There are a number of technologies that accelerate design and
development, improve cycle time in producing reports and analyses,
and enhance IT-business collaboration. Some are platform
oriented, such as columnar databases, in-memory computing, and
Hadoop, all of which apply faster performance to improve
analytical results. Alternatively, data warehouse automation
(DWA) tools blend user requirements and repeatable processes
to automatically generate the components of a data warehouse
environment. These tools simplify the end-to-end production of a
data warehouse, encompassing the entire development life cycle,
including source system analysis, design, development, generation
of data integration scripts, building, deployment, generation of
documentation, testing, support for ongoing operations, impact
analysis, and change management.
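To make that generation pattern concrete, here is a minimal sketch, assuming a hypothetical metadata format: a single source-to-target table description drives both the target DDL and a loading statement. All names here are invented for illustration; commercial DWA tools generate far richer artifacts, including documentation, tests, and impact analysis.

```python
# Minimal sketch of the metadata-driven generation at the heart of DWA tools
# (hypothetical format): one table description drives both DDL and load logic.

SOURCE_TABLE = {
    "name": "crm_customers",                # source system table
    "target": "dw_customer",                # warehouse target table
    "columns": [                            # (source column, type, target column)
        ("cust_id",    "INTEGER",     "customer_key"),
        ("full_name",  "VARCHAR(80)", "customer_name"),
        ("created_on", "DATE",        "onboard_date"),
    ],
}

def generate_ddl(table: dict) -> str:
    """Emit a CREATE TABLE statement for the warehouse target."""
    cols = ",\n  ".join(f"{tgt} {typ}" for _, typ, tgt in table["columns"])
    return f'CREATE TABLE {table["target"]} (\n  {cols}\n);'

def generate_load(table: dict) -> str:
    """Emit an INSERT ... SELECT mapping source columns to target columns."""
    tgts = ", ".join(tgt for _, _, tgt in table["columns"])
    srcs = ", ".join(src for src, _, _ in table["columns"])
    return (f'INSERT INTO {table["target"]} ({tgts})\n'
            f'SELECT {srcs} FROM {table["name"]};')

print(generate_ddl(SOURCE_TABLE))
print(generate_load(SOURCE_TABLE))
```

Because every component is derived from the same description, a change to the metadata regenerates all downstream artifacts consistently, which is what makes impact analysis and change management tractable.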
We are rapidly moving away from the monolithic, single-system
enterprise data warehouse and toward a hybrid environment that
uses the most appropriate technologies to address specific data
challenges. That environment will encompass many components and
will benefit from reduced complexity through the use of tools like DWA.
This checklist discusses seven practices for determining the value
proposition of adopting DWA and establishing the foundation that will
ease its adoption.
NUMBER ONE: ANALYZE THE CYCLE TIME FOR SATISFYING BUSINESS REQUESTS
The success of the conventional IT management approach to data
warehousing and business intelligence (BI) has opened the door for
a growing population of data warehouse consumers, both within and
outside of the organization. Although many of these consumers’ needs
are met by existing reports or by providing access for straightforward
queries, a combination of factors has created a bottleneck in rapidly
addressing business-user demands. Many business analysts are
becoming more sophisticated in their investigations, requiring
additional consulting from their IT counterparts to develop data
extracts and reports.
At the same time, though, IT budgets and staffing remain constrained.
The result is that scheduling limited IT consulting resources elongates
the time from when a business user requests a data product to the
time when that data product is delivered (“cycle time”).
There are two main risks of long cycle times. First, the data product
may be delivered after the window of opportunity for acting on its
results has closed. Second, frustrated users may abandon the
enterprise data warehouse and adopt their own “shadow” reporting
and analytics tools and methods, bypassing any governance
procedures intended to ensure enterprisewide consistency.
Long cycle times and development bottlenecks lead to missed
business opportunities. Therefore, increasing agility by eliminating
those bottlenecks will increase the benefits of your reporting and
analytics investment.
To find where those bottlenecks hinder productivity, analyze the
end-to-end process for satisfying BI and analysis requests. The
source of the logjam is likely the design-develop-test loop for new
reports and extracts; automation tools can reduce or even eliminate
this development blockage and speed time to value.
Analyzing the report development cycle time provides a benchmark for
the time and resources required (in general) to satisfy business needs.
This benchmark can serve as the starting point for optimizing
existing processes and as a metric for evaluating how well DWA
tools speed time to value; a minimal benchmarking sketch follows.
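As a small illustration of such a benchmark, assuming hypothetical request-tracking data, the following sketch computes the median and 90th-percentile cycle times from request and delivery dates. These are the numbers to re-measure after introducing a DWA tool.

```python
# Hypothetical benchmark: cycle time from business request to delivered data
# product, computed from a request-tracking extract.
from datetime import date
from statistics import median, quantiles

REQUESTS = [  # (request_date, delivery_date)
    (date(2015, 9, 1),   date(2015, 9, 29)),
    (date(2015, 9, 14),  date(2015, 11, 2)),
    (date(2015, 10, 5),  date(2015, 10, 23)),
    (date(2015, 10, 19), date(2015, 12, 18)),
]

cycle_days = [(delivered - requested).days for requested, delivered in REQUESTS]

print(f"median cycle time: {median(cycle_days)} days")
# quantiles(..., n=10) yields nine cut points; index 8 is the 90th percentile.
print(f"90th percentile:   {quantiles(cycle_days, n=10)[8]:.0f} days")
```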
NUMBER TWO: DEVISE A VALUE MODEL FOR COMPARING DATA WAREHOUSING ALTERNATIVES

Adjustments to the data warehousing tools and platform
infrastructure can potentially reduce development bottlenecks and
speed time to value. This decision triggers evaluating vendors
whose products purport to deliver on that promise, selecting
candidate alternatives, and choosing one (or more) to integrate
within the enterprise.

Balance the benefits and costs of using existing data warehouse
systems against introducing newer technology. Recognize that
introducing new technology into the organization requires more than
purchasing the license and installing the tool. It also requires
design and development time, an integration effort, training to
empower product users, and a communications plan to transition
legacy users to modernized platforms. Any plan for data warehouse
modernization must incorporate the cost and resources needed to
support those tasks in order to assess the potential return on
investment and to compare candidate technologies.

Develop a value model for comparing data warehousing alternatives,
both against the existing platform and against each other. The key
is to select the right variables for comparison: those that lead to
greater agility, lower costs, and better outcomes. Some variables
for comparison include (a scoring sketch follows the list):

• Application development complexity. How easy is it to design, configure, and deploy a new data warehouse or add new subject areas within an existing data warehouse environment?

• Application development time. How long does it take for a data warehouse to become operational, focusing on the end-to-end process of design, development, and implementation?

• Skills requirements. What types of skills are required of the development team members, and how long does it take to acquire those skills?

• Report development turnaround time. How long is the cycle time for new reports?

• End-user ease of use. How easy is it to empower data consumers to use the developed data warehouse without IT support?

• Resource requirements. What resources are necessary for implementation?

• Cost. What are the accumulated start-up and operational costs?

This value model provides quantitative comparisons to guide technology selection and is likely to highlight the value of DWA tools.
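One straightforward way to operationalize the value model is a weighted scoring matrix. The sketch below uses invented weights and 1-to-5 ratings for an incumbent platform and a DWA-based candidate; in practice the variables, weights, and ratings would come from your own evaluation.

```python
# Hypothetical weighted scoring model for comparing warehousing alternatives.
# Weights (summing to 1.0) reflect relative importance; ratings run 1 to 5.
WEIGHTS = {
    "dev_complexity": 0.20, "dev_time": 0.20, "skills": 0.10,
    "report_turnaround": 0.20, "ease_of_use": 0.15,
    "resources": 0.05, "cost": 0.10,
}

CANDIDATES = {
    "incumbent platform": {"dev_complexity": 2, "dev_time": 2, "skills": 4,
                           "report_turnaround": 2, "ease_of_use": 3,
                           "resources": 3, "cost": 3},
    "DWA-based candidate": {"dev_complexity": 4, "dev_time": 5, "skills": 3,
                            "report_turnaround": 5, "ease_of_use": 4,
                            "resources": 4, "cost": 4},
}

def score(ratings: dict) -> float:
    """Weighted sum of per-variable ratings."""
    return sum(WEIGHTS[var] * rating for var, rating in ratings.items())

for name, ratings in sorted(CANDIDATES.items(), key=lambda kv: -score(kv[1])):
    print(f"{name}: {score(ratings):.2f}")
```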
NUMBER THREE: ARCHITECT A HYBRID ENVIRONMENT

The excitement over new technologies can sometimes overwhelm
common sense. Organizations have invested significant money and
staff effort in architecting, developing, and productionizing their
existing data warehouse and BI platforms. It would be surprising,
risky, and generally unwise for any organization to completely rip
out a trusted legacy data warehousing platform and replace it with
systems generated using DWA tools, or to move directly to
cloud-hosted data warehousing providers.

A more conservative approach embraces a transition strategy that
incrementally migrates data warehousing and BI applications from
legacy platforms to more agile environments. Start by establishing
an innovation lab within the existing environment in which new
techniques can be piloted and evaluated for interoperability with
existing systems. This allows different approaches to be considered
while constraining the alternatives to those compatible with the
production environment.

Design a hybrid data warehouse architecture that accommodates the
introduction of new technologies and emerging agile development
paradigms while maintaining the production operations of established
conventional systems. Employ virtualization methods to provide
layers on top of existing platforms that abstract the functionality
from its implementation.

Designing a hybrid architecture provides the flexibility to
integrate different data warehousing approaches that have already
been vetted. In turn, adopting the right technology mix allows
application teams to develop facades abstracting the underlying
capabilities (a minimal facade sketch follows). This allows
reengineering without disrupting production systems, enables dual
operations during a testing period to verify that the new
approaches are trustworthy, and facilitates seamless transitions
for the user community when the time comes to migrate applications.
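The facade idea can be sketched briefly: applications call a stable interface while the platform behind it is swapped, or dual-run for verification, during migration. The class and method names here are hypothetical, and the backends are trivial stand-ins.

```python
# Hypothetical facade abstracting warehouse access during a hybrid transition.
from typing import List, Optional, Protocol

class WarehouseBackend(Protocol):
    def run_query(self, sql: str) -> List[str]: ...

class LegacyWarehouse:
    def run_query(self, sql: str) -> List[str]:
        return ["result"]                   # stand-in for the trusted legacy platform

class AutomatedWarehouse:
    def run_query(self, sql: str) -> List[str]:
        return ["result"]                   # stand-in for the DWA-generated platform

class WarehouseFacade:
    """Applications depend on this interface, not on either platform."""
    def __init__(self, primary: WarehouseBackend,
                 shadow: Optional[WarehouseBackend] = None):
        self.primary, self.shadow = primary, shadow

    def run_query(self, sql: str) -> List[str]:
        result = self.primary.run_query(sql)
        if self.shadow is not None:         # dual operations during the test period
            assert self.shadow.run_query(sql) == result, "platforms disagree"
        return result

facade = WarehouseFacade(LegacyWarehouse(), shadow=AutomatedWarehouse())
print(facade.run_query("SELECT COUNT(*) FROM dw_customer"))
```

When the new platform proves trustworthy, promoting it to primary is a one-line change behind the facade, invisible to the user community.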
NUMBER FOUR: BE AGILE IN INGESTING NEW DATA SOURCES
Our world has evolved into one where new, diverse sources of
massive amounts of data are emerging every day. From an analytics
perspective, the impact of the explosion of data sets originating
outside the typical enterprise is twofold. On the one hand, there is a
growing availability of information that can inform internal subject
area profiles, such as enhancing customer behavior information
by analyzing streaming social media posts. On the other hand, the
broad variety of structured and unstructured data formats creates
significant complexity for data warehouse professionals who are
unaccustomed to programming with Web services and APIs.
The conventional approach to data warehousing focuses on ingesting
one or two data sets at a time, originating from sources inside
the organization. However, as the speed, volume, and diversity of
externally sourced data grow, the organization must move faster
in absorbing data sources and putting those data streams to
productive use.
Integrating new data sources requires that:

• The incoming data set is profiled as part of a discovery process

• Representative models are designed for data ingestion

• The incoming data model elements are mapped to existing data warehouse data elements

• Definitions are captured (or inferred) to ensure semantic consistency between new data sources and existing data models

• The data warehouse model is augmented to absorb any newly defined data elements of interest from the new data source

• Transformations are programmed to align incoming data with the corresponding data elements in the existing warehouse model

• The incoming data sets are continuously monitored to verify that the data exchange interface has not changed
Taking these steps prior to ingestion requires effective project
management, and ensuring the scheduling, operations, and fidelity
of production intake processes may overwhelm teams that choose to
oversee the tasks manually. Data warehouse automation simplifies
these processes by automating the discovery and alignment of data
source metadata with existing warehouse metadata and by
orchestrating data ingestion, transformation, and loading. The
generated utilities are effectively self-documenting, enabling
developers to understand what the generated code does. A small
discovery-and-mapping sketch follows.
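As a minimal illustration of the discovery-and-alignment steps listed above, the sketch below profiles an incoming data set and proposes mappings to existing warehouse columns by name similarity. The data, names, and matching threshold are all hypothetical; real DWA tools also weigh data types, value distributions, and captured definitions.

```python
# Hypothetical sketch of discovery and alignment: profile an incoming data set
# and propose column mappings against existing warehouse metadata by name.
from difflib import SequenceMatcher

INCOMING = {  # column -> sample values gathered during profiling
    "cust_name": ["Ada", "Grace", "Alan"],
    "signup_dt": ["2015-01-02", "2015-02-14", "2015-03-09"],
}
WAREHOUSE_COLUMNS = ["customer_name", "onboard_date", "customer_key"]

def profile(column: str, values: list) -> dict:
    """A very small profile: distinct count and observed nullability."""
    return {"column": column,
            "distinct": len(set(values)),
            "nullable": any(v in (None, "") for v in values)}

def propose_mapping(column: str, targets: list, threshold: float = 0.5):
    """Return the best name-similarity match above the threshold, else None."""
    ratio = lambda t: SequenceMatcher(None, column, t).ratio()
    best = max(targets, key=ratio)
    return best if ratio(best) >= threshold else None

for column, values in INCOMING.items():
    print(profile(column, values), "->", propose_mapping(column, WAREHOUSE_COLUMNS))
```

Proposed mappings still deserve human review, but automating the first pass is what lets a team absorb many sources instead of one or two at a time.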
NUMBER FIVE: TRAIN DATA ENGINEERS TO COLLABORATE WITH BUSINESS USERS

Adopting the agile development methodology for data warehouse
development suggests that the days of the IT data practitioner
as the data warehouse gatekeeper are over. Business users are
significantly more knowledgeable than they were during the early
days of data warehousing, and they have a much lower dependence
on IT staff to meet many of their more mundane needs when it
comes to developing reports or performing simple queries.
Yet as business analysts devise more sophisticated analyses, the
burden on IT staff grows beyond the typical development,
operations, and maintenance of the data warehouse
platforms. Adopting DWA tools to support building and managing
data warehouses reduces the IT staff’s burden for design and
development. This provides more time for IT staff members to focus
on helping business users evaluate their specific business problems
and on how reporting and analysis can solve those problems (rather
than delivering reports in a virtual vacuum).
According to the Agile Alliance, the agile software development
methodology emphasizes “close collaboration between the
programmer team and business experts; face-to-face communication
(as more efficient than written documentation); frequent delivery
of new deployable business value; tight, self-organizing teams.”1
Fostering increased developer-user communication can reduce the
cycle time for developing reports and more complex analyses. Such
communication can also lead to increased user satisfaction.
Train your data professionals to actively engage business users,
effectively solicit business requirements, and (together with the
business users) translate those requirements into directives to
formulate reports and analyses that actually meet business needs.
By relying on tools that automate data discovery, warehouse
modeling, and report creation, IT staff can use this collaborative
knowledge transfer to educate business analysts and help them
become more self-sufficient. Business-user self-sufficiency
triggers a virtuous cycle: it further reduces the IT burden,
freeing those resources to work with other business users,
engendering broader self-sufficiency and ever more sophisticated
analytics.
1. “What Is Agile Software Development?” downloaded November 6, 2015, from http://www.agilealliance.org/the-alliance/what-is-agile.
NUMBER SIX: SUPPORT BUSINESS SELF-SERVICE

If becoming more agile means encouraging closer collaboration
between developers and business experts, we can take that a step
further in the data warehousing and BI world: enable the transfer
of skills from IT professionals to business experts so that the
business experts become self-sufficient. In effect, a more
collaborative environment between the IT/data staff and the
business users increases business-user independence and facilitates
increased information utilization.

At the same time, empowering self-sufficient business users reduces
business dependence on IT staff, reduces the IT cost of supporting
enhanced data use, and frees IT resources to focus on BI
functionality, improved analytical precision, and expanded
analytical services.

Self-service BI leverages intuitive user interfaces and data
accessibility features that both guide the user in designing
reports and analyses and exercise controls over what can and cannot
be accessed. In addition, self-service BI relies on the
availability of a semantic metadata repository that lists business
terms and table column names and provides a shared glossary to
ensure consistent use when formulating new reports (a minimal
glossary sketch follows this section).

One of the key benefits of self-service BI is that, as business
users become more adept at developing their own analyses, they can
speed the review cycle for discovering actionable knowledge.
However, supporting this decreased cycle time requires
complementary speed in data warehouse development. Agile tools such
as DWA can be used to quickly implement the warehouses and marts
that give end users the flexibility to configure their own reports
while limiting access to data needing additional protection.
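A semantic repository can begin as something as simple as a shared glossary that maps business terms to governed physical columns and approved definitions. The sketch below is a deliberately tiny, hypothetical illustration; production metadata repositories also carry lineage, stewardship, and access rules.

```python
# Hypothetical semantic glossary: business terms mapped to governed physical
# columns, giving self-service users one consistent vocabulary for new reports.
GLOSSARY = {
    "customer": {
        "table": "dw_customer", "column": "customer_key",
        "definition": "A party that has purchased at least one product.",
    },
    "net revenue": {
        "table": "dw_sales", "column": "net_revenue_amt",
        "definition": "Gross sales minus returns and discounts.",
    },
}

def resolve(term: str) -> str:
    """Translate a business term into its governed physical column."""
    entry = GLOSSARY.get(term.lower())
    if entry is None:
        raise KeyError(f"'{term}' is not a governed business term")
    return f"{entry['table']}.{entry['column']}"

print(resolve("Net Revenue"))              # -> dw_sales.net_revenue_amt
print(GLOSSARY["customer"]["definition"])
```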
NUMBER SEVEN: RETHINK THE ROLE OF THE BI COMPETENCY CENTER

The notion of a business intelligence competency center (BICC) was
motivated by the early challenges of data warehouse development and
deployment, specifically around data integration, population of the
data warehouse, and coordination among business users to ensure
data quality and consistency and to leverage the investment in data
warehouse technologies. The original goal of the BICC was to
standardize policies for data warehouse use, centralize the support
of BI, and develop good practices and repeatable processes across
the organization.

Practically, though, in many environments two conflicting facets
have hampered the success of the BICC. The first is the difficulty
of organizing an IT team intended to enforce policies across
business function boundaries. The second is the constraint on
flexibility imposed as a way to enforce data usage policies and
standards.

As a result, the BICC often becomes the bottleneck as decisions
about technology and architecture are subsumed within demands for
report development and configuration. Essentially, strong
centralized oversight reduces agility in the organization, limiting
business-user gains from unfettered data exploration and discovery.

As organizations look to more agile methods that give business
users faster control over their data use, it may be time to
reevaluate the goals of the BICC, determine where artificial
barriers to knowledge have been institutionalized, and assess how
the BICC can be adapted to meet the needs of an increasingly
enlightened business community.

Examine how to revise the BICC’s charter to differentiate between
operational decision making associated with day-to-day tactical
management of the data warehouse environment and strategic decision
making related to evolving the environment over time. Adopt guiding
principles that expand data warehouse utilization, such as
simplifying the organizational data warehousing architecture,
acquiring tools that reduce overall time to value, simplifying the
repeatable processes for data warehouse instantiation, and
increasing hands-on training and knowledge transfer with business
partners. As business use increases, encourage business experts to
take on a role within the structure of the BICC so that its
management and oversight can be diffused among all stakeholders.
CONSIDERATIONS: DEVELOPING A FEASIBLE INTEGRATION PLAN FOR DATA WAREHOUSE AUTOMATION
Business analysts have acquired a more cultivated awareness of
data discovery, data preparation, report formulation, and predictive
analytics and are increasingly taking control of their own reports
and analyses. These same data consumers are benefitting from the
explosion of data sources that can be captured within the analytics
environment. However, as the number of data sources increases
and the data sets become more diverse, maintaining a competitive
advantage hinges on the speed at which new data sources can be
acquired, ingested, and prepared for discovery and analysis.
It is unlikely that any organization will completely rip out its
existing data warehouses and replace them with any specific new
technology. Instead, the future data warehouse environment will
be an evolving hybrid environment composed of conventional data
warehouse architectures and high-performance components layered
on the Hadoop ecosystem, and it will perhaps include specialty
appliances or software accelerators such as columnar databases
and in-memory computing systems.
As this evolution proceeds, data warehouse automation tools help to
maintain agility while supporting the demands of legacy customers.
The items on this checklist help in preparing your organization for
defining assessment criteria, evaluating the existing environment,
determining where there are opportunities for agility, and adopting
DWA tools as a way to accelerate the transformation of the
enterprise analytics environment.
When evaluating alternatives, the tools guide the developers in
source analysis, in capturing requirements, and in automatically
generating the data integration, loading, and presentation
components of a data warehouse. The next step is to develop an
integration plan. Pilot the tools by focusing on developing a data
warehouse to support a specific business function or to analyze
a specific subject area (like customers or vendors). Align your
development methodology with the way that developers utilize the
tools and the ways users expect to use the resulting warehouse.
Reflect on lessons learned in terms of rapid design cycles,
empowering users with self-service capabilities, and ingesting new
data sources. These lessons will inform repeatable processes that
will guide the evolution of the future data warehouse environment.
ABOUT TDWI RESEARCH
TDWI Research provides research and advice for data professionals
worldwide. TDWI Research focuses exclusively on business
intelligence, data warehousing, and analytics issues and teams
up with industry thought leaders and practitioners to deliver both
broad and deep understanding of the business and technical
challenges surrounding the deployment and use of business
intelligence, data warehousing, and analytics solutions. TDWI
Research offers in-depth research reports, commentary, inquiry
services, and topical conferences as well as strategic planning
services to user and vendor organizations.
ABOUT THE AUTHOR
David Loshin, president of Knowledge Integrity, Inc.
(www.knowledge-integrity.com), is a recognized thought leader, TDWI
instructor, and expert consultant in the areas of data management
and business intelligence. David is a prolific author on business
intelligence best practices, with numerous books and papers on data
management, including Big Data Analytics: From Strategic Planning
to Enterprise Integration with Tools, Techniques, NoSQL, and Graph
and The Practitioner’s Guide to Data Quality Improvement, with
additional content provided at
www.dataqualitybook.com. David is a frequent invited speaker at
conferences, Web seminars, and sponsored websites and channels
including TechTarget and The Bloor Group. His best-selling book
Master Data Management has been endorsed by many data
management industry leaders.
David can be reached at loshin@knowledge-integrity.com.
ABOUT TDWI CHECKLIST REPORTS

TDWI Checklist Reports provide an overview of success factors for
a specific project in business intelligence, data warehousing, or
a related data management discipline. Companies may use this
overview to get organized before beginning a project or to identify
goals and areas of improvement for current projects.
ABOUT OUR SPONSOR
timextender.com
TimeXtender is a world-leading data warehouse automation vendor
dedicated to Microsoft SQL Server.
The TimeXtender software, TX DWA, revolutionizes the way a data
warehouse is developed and maintained by automating the manual data
warehouse processes, from design and development through operation,
maintenance, and change management. TimeXtender ensures an improved
and inexpensive solution that is fully documented.
The TX DWA software enables medium and large data-driven
enterprises to get business intelligence done faster, more
efficiently, and with less stress by providing “one truth” that
improves decision-making processes and overall business
performance, reduces costs, and saves valuable time. TX DWA turns a
business intelligence project into a business intelligence process
with a flexible solution that expands as the business evolves and
grows.
TimeXtender collaborates with VAR and OEM partners across six
continents, providing its advanced data warehouse automation
software to more than 2,600 customers in 61 countries.
Why wait days before taking action? With TimeXtender, data is
available at your fingertips in mere hours!

More Related Content

What's hot

Enterprise Data Management - Data Lake - A Perspective
Enterprise Data Management - Data Lake - A PerspectiveEnterprise Data Management - Data Lake - A Perspective
Enterprise Data Management - Data Lake - A Perspective
Saurav Mukherjee
 
Hadoop 2015: what we larned -Think Big, A Teradata Company
Hadoop 2015: what we larned -Think Big, A Teradata CompanyHadoop 2015: what we larned -Think Big, A Teradata Company
Hadoop 2015: what we larned -Think Big, A Teradata Company
DataWorks Summit
 
Making the Case for Hadoop in a Large Enterprise-British Airways
Making the Case for Hadoop in a Large Enterprise-British AirwaysMaking the Case for Hadoop in a Large Enterprise-British Airways
Making the Case for Hadoop in a Large Enterprise-British Airways
DataWorks Summit
 
Seagate: Sensor Overload! Taming The Raging Manufacturing Big Data Torrent
Seagate: Sensor Overload! Taming The Raging Manufacturing Big Data TorrentSeagate: Sensor Overload! Taming The Raging Manufacturing Big Data Torrent
Seagate: Sensor Overload! Taming The Raging Manufacturing Big Data Torrent
Seeling Cheung
 
Real Time Analytics
Real Time AnalyticsReal Time Analytics
Real Time Analytics
Mohsin Hakim
 
Big Data and Analytics
Big Data and AnalyticsBig Data and Analytics
Big Data and Analytics
Cameron. A. Bradbury
 
Smarter Management for Your Data Growth
Smarter Management for Your Data GrowthSmarter Management for Your Data Growth
Smarter Management for Your Data Growth
RainStor
 
Scott-Lunceford-Tech-Samples
Scott-Lunceford-Tech-SamplesScott-Lunceford-Tech-Samples
Scott-Lunceford-Tech-SamplesScott Lunceford
 
Real Time Analytics
Real Time AnalyticsReal Time Analytics
Real Time Analytics
Mohsin Hakim
 
Hitachi data systems and tsys success story
Hitachi data systems and tsys success storyHitachi data systems and tsys success story
Hitachi data systems and tsys success story
Hitachi Vantara
 
Big Data Practice_Planning_steps_RK
Big Data Practice_Planning_steps_RKBig Data Practice_Planning_steps_RK
Big Data Practice_Planning_steps_RKRajesh Jayarman
 
Starting Your Modern DataOps Journey
Starting Your Modern DataOps JourneyStarting Your Modern DataOps Journey
Starting Your Modern DataOps Journey
CloverDX
 
The principles of the business data lake
The principles of the business data lakeThe principles of the business data lake
The principles of the business data lake
Capgemini
 
THE FUTURE OF DATA: PROVISIONING ANALYTICS-READY DATA AT SPEED
THE FUTURE OF DATA: PROVISIONING ANALYTICS-READY DATA AT SPEEDTHE FUTURE OF DATA: PROVISIONING ANALYTICS-READY DATA AT SPEED
THE FUTURE OF DATA: PROVISIONING ANALYTICS-READY DATA AT SPEED
webwinkelvakdag
 
Extended Data Warehouse - A New Data Architecture for Modern BI with Claudia ...
Extended Data Warehouse - A New Data Architecture for Modern BI with Claudia ...Extended Data Warehouse - A New Data Architecture for Modern BI with Claudia ...
Extended Data Warehouse - A New Data Architecture for Modern BI with Claudia ...
Denodo
 
Hadoop 2.0 - Solving the Data Quality Challenge
Hadoop 2.0 - Solving the Data Quality ChallengeHadoop 2.0 - Solving the Data Quality Challenge
Hadoop 2.0 - Solving the Data Quality Challenge
Inside Analysis
 
How to Quickly and Easily Draw Value from Big Data Sources_Q3 symposia(Moa)
How to Quickly and Easily Draw Value  from Big Data Sources_Q3 symposia(Moa)How to Quickly and Easily Draw Value  from Big Data Sources_Q3 symposia(Moa)
How to Quickly and Easily Draw Value from Big Data Sources_Q3 symposia(Moa)Moacyr Passador
 
Hadoop: Making it work for the Business Unit
Hadoop: Making it work for the Business UnitHadoop: Making it work for the Business Unit
Hadoop: Making it work for the Business UnitDataWorks Summit
 
Testing the Data Warehouse
Testing the Data WarehouseTesting the Data Warehouse
Testing the Data Warehouse
TechWell
 
Data summit connect fall 2020 - rise of data ops
Data summit connect fall 2020 - rise of data opsData summit connect fall 2020 - rise of data ops
Data summit connect fall 2020 - rise of data ops
Ryan Gross
 

What's hot (20)

Enterprise Data Management - Data Lake - A Perspective
Enterprise Data Management - Data Lake - A PerspectiveEnterprise Data Management - Data Lake - A Perspective
Enterprise Data Management - Data Lake - A Perspective
 
Hadoop 2015: what we larned -Think Big, A Teradata Company
Hadoop 2015: what we larned -Think Big, A Teradata CompanyHadoop 2015: what we larned -Think Big, A Teradata Company
Hadoop 2015: what we larned -Think Big, A Teradata Company
 
Making the Case for Hadoop in a Large Enterprise-British Airways
Making the Case for Hadoop in a Large Enterprise-British AirwaysMaking the Case for Hadoop in a Large Enterprise-British Airways
Making the Case for Hadoop in a Large Enterprise-British Airways
 
Seagate: Sensor Overload! Taming The Raging Manufacturing Big Data Torrent
Seagate: Sensor Overload! Taming The Raging Manufacturing Big Data TorrentSeagate: Sensor Overload! Taming The Raging Manufacturing Big Data Torrent
Seagate: Sensor Overload! Taming The Raging Manufacturing Big Data Torrent
 
Real Time Analytics
Real Time AnalyticsReal Time Analytics
Real Time Analytics
 
Big Data and Analytics
Big Data and AnalyticsBig Data and Analytics
Big Data and Analytics
 
Smarter Management for Your Data Growth
Smarter Management for Your Data GrowthSmarter Management for Your Data Growth
Smarter Management for Your Data Growth
 
Scott-Lunceford-Tech-Samples
Scott-Lunceford-Tech-SamplesScott-Lunceford-Tech-Samples
Scott-Lunceford-Tech-Samples
 
Real Time Analytics
Real Time AnalyticsReal Time Analytics
Real Time Analytics
 
Hitachi data systems and tsys success story
Hitachi data systems and tsys success storyHitachi data systems and tsys success story
Hitachi data systems and tsys success story
 
Big Data Practice_Planning_steps_RK
Big Data Practice_Planning_steps_RKBig Data Practice_Planning_steps_RK
Big Data Practice_Planning_steps_RK
 
Starting Your Modern DataOps Journey
Starting Your Modern DataOps JourneyStarting Your Modern DataOps Journey
Starting Your Modern DataOps Journey
 
The principles of the business data lake
The principles of the business data lakeThe principles of the business data lake
The principles of the business data lake
 
THE FUTURE OF DATA: PROVISIONING ANALYTICS-READY DATA AT SPEED
THE FUTURE OF DATA: PROVISIONING ANALYTICS-READY DATA AT SPEEDTHE FUTURE OF DATA: PROVISIONING ANALYTICS-READY DATA AT SPEED
THE FUTURE OF DATA: PROVISIONING ANALYTICS-READY DATA AT SPEED
 
Extended Data Warehouse - A New Data Architecture for Modern BI with Claudia ...
Extended Data Warehouse - A New Data Architecture for Modern BI with Claudia ...Extended Data Warehouse - A New Data Architecture for Modern BI with Claudia ...
Extended Data Warehouse - A New Data Architecture for Modern BI with Claudia ...
 
Hadoop 2.0 - Solving the Data Quality Challenge
Hadoop 2.0 - Solving the Data Quality ChallengeHadoop 2.0 - Solving the Data Quality Challenge
Hadoop 2.0 - Solving the Data Quality Challenge
 
How to Quickly and Easily Draw Value from Big Data Sources_Q3 symposia(Moa)
How to Quickly and Easily Draw Value  from Big Data Sources_Q3 symposia(Moa)How to Quickly and Easily Draw Value  from Big Data Sources_Q3 symposia(Moa)
How to Quickly and Easily Draw Value from Big Data Sources_Q3 symposia(Moa)
 
Hadoop: Making it work for the Business Unit
Hadoop: Making it work for the Business UnitHadoop: Making it work for the Business Unit
Hadoop: Making it work for the Business Unit
 
Testing the Data Warehouse
Testing the Data WarehouseTesting the Data Warehouse
Testing the Data Warehouse
 
Data summit connect fall 2020 - rise of data ops
Data summit connect fall 2020 - rise of data opsData summit connect fall 2020 - rise of data ops
Data summit connect fall 2020 - rise of data ops
 

Similar to December 2015 - TDWI Checklist Report - Seven Best Practices for Adapting DWA

The Four Pillars of Analytics Technology Whitepaper
The Four Pillars of Analytics Technology WhitepaperThe Four Pillars of Analytics Technology Whitepaper
The Four Pillars of Analytics Technology Whitepaper
Edgar Alejandro Villegas
 
Creating a Successful DataOps Framework for Your Business.pdf
Creating a Successful DataOps Framework for Your Business.pdfCreating a Successful DataOps Framework for Your Business.pdf
Creating a Successful DataOps Framework for Your Business.pdf
Enov8
 
CSC - Presentation at Hortonworks Booth - Strata 2014
CSC - Presentation at Hortonworks Booth - Strata 2014CSC - Presentation at Hortonworks Booth - Strata 2014
CSC - Presentation at Hortonworks Booth - Strata 2014
Hortonworks
 
data_blending
data_blendingdata_blending
data_blendingsubit1615
 
TDWI checklist 2018 - Data Warehouse Infrastructure
TDWI checklist 2018 - Data Warehouse InfrastructureTDWI checklist 2018 - Data Warehouse Infrastructure
TDWI checklist 2018 - Data Warehouse Infrastructure
Jeannette Browning
 
Leveraging Cloud for Non-Production Environments
Leveraging Cloud for Non-Production EnvironmentsLeveraging Cloud for Non-Production Environments
Leveraging Cloud for Non-Production Environments
Cognizant
 
Everything you wanted to know about data ops
Everything you wanted to know about data opsEverything you wanted to know about data ops
Everything you wanted to know about data ops
Enov8
 
DataOps , cbuswaw April '23
DataOps , cbuswaw April '23DataOps , cbuswaw April '23
DataOps , cbuswaw April '23
Jason Packer
 
Jet Reports es la herramienta para construir el mejor BI y de forma mas rapida
Jet Reports es la herramienta para construir el mejor BI y de forma mas rapida  Jet Reports es la herramienta para construir el mejor BI y de forma mas rapida
Jet Reports es la herramienta para construir el mejor BI y de forma mas rapida
CLARA CAMPROVIN
 
Data Lake-based Approaches to Regulatory-Driven Technology Challenges
Data Lake-based Approaches to Regulatory-Driven Technology ChallengesData Lake-based Approaches to Regulatory-Driven Technology Challenges
Data Lake-based Approaches to Regulatory-Driven Technology Challenges
Booz Allen Hamilton
 
Appliance Warehouse Service Plan.The discussion focuses on the.docx
Appliance Warehouse Service Plan.The discussion focuses on the.docxAppliance Warehouse Service Plan.The discussion focuses on the.docx
Appliance Warehouse Service Plan.The discussion focuses on the.docx
festockton
 
Appliance Warehouse Service Plan.The discussion focuses on the.docx
Appliance Warehouse Service Plan.The discussion focuses on the.docxAppliance Warehouse Service Plan.The discussion focuses on the.docx
Appliance Warehouse Service Plan.The discussion focuses on the.docx
RAHUL126667
 
Whitepaper: Datacenter Migration - Happiest Minds
Whitepaper: Datacenter Migration - Happiest MindsWhitepaper: Datacenter Migration - Happiest Minds
Whitepaper: Datacenter Migration - Happiest Minds
Happiest Minds Technologies
 
Agile Corporation for MIT
Agile Corporation for MITAgile Corporation for MIT
Agile Corporation for MIT
Caio Candido
 
Data Mesh in Azure using Cloud Scale Analytics (WAF)
Data Mesh in Azure using Cloud Scale Analytics (WAF)Data Mesh in Azure using Cloud Scale Analytics (WAF)
Data Mesh in Azure using Cloud Scale Analytics (WAF)
Nathan Bijnens
 
Accelerating Time to Success for Your Big Data Initiatives
Accelerating Time to Success for Your Big Data InitiativesAccelerating Time to Success for Your Big Data Initiatives
Accelerating Time to Success for Your Big Data Initiatives☁Jake Weaver ☁
 
Calto Commercial RIS Systems
Calto Commercial RIS SystemsCalto Commercial RIS Systems
Accelerate Innovation & Productivity With Rapid Prototyping & Development - ...
Accelerate Innovation & Productivity With Rapid Prototyping & Development -  ...Accelerate Innovation & Productivity With Rapid Prototyping & Development -  ...
Accelerate Innovation & Productivity With Rapid Prototyping & Development - ...
Attivio
 
What to focus on when choosing a Business Intelligence tool?
What to focus on when choosing a Business Intelligence tool? What to focus on when choosing a Business Intelligence tool?
What to focus on when choosing a Business Intelligence tool?
Marketplanet
 

Similar to December 2015 - TDWI Checklist Report - Seven Best Practices for Adapting DWA (20)

The Four Pillars of Analytics Technology Whitepaper
The Four Pillars of Analytics Technology WhitepaperThe Four Pillars of Analytics Technology Whitepaper
The Four Pillars of Analytics Technology Whitepaper
 
Creating a Successful DataOps Framework for Your Business.pdf
Creating a Successful DataOps Framework for Your Business.pdfCreating a Successful DataOps Framework for Your Business.pdf
Creating a Successful DataOps Framework for Your Business.pdf
 
CSC - Presentation at Hortonworks Booth - Strata 2014
CSC - Presentation at Hortonworks Booth - Strata 2014CSC - Presentation at Hortonworks Booth - Strata 2014
CSC - Presentation at Hortonworks Booth - Strata 2014
 
data_blending
data_blendingdata_blending
data_blending
 
TDWI checklist 2018 - Data Warehouse Infrastructure
TDWI checklist 2018 - Data Warehouse InfrastructureTDWI checklist 2018 - Data Warehouse Infrastructure
TDWI checklist 2018 - Data Warehouse Infrastructure
 
Leveraging Cloud for Non-Production Environments
Leveraging Cloud for Non-Production EnvironmentsLeveraging Cloud for Non-Production Environments
Leveraging Cloud for Non-Production Environments
 
Everything you wanted to know about data ops
Everything you wanted to know about data opsEverything you wanted to know about data ops
Everything you wanted to know about data ops
 
DataOps , cbuswaw April '23
DataOps , cbuswaw April '23DataOps , cbuswaw April '23
DataOps , cbuswaw April '23
 
Jet Reports es la herramienta para construir el mejor BI y de forma mas rapida
Jet Reports es la herramienta para construir el mejor BI y de forma mas rapida  Jet Reports es la herramienta para construir el mejor BI y de forma mas rapida
Jet Reports es la herramienta para construir el mejor BI y de forma mas rapida
 
Data Lake-based Approaches to Regulatory-Driven Technology Challenges
Data Lake-based Approaches to Regulatory-Driven Technology ChallengesData Lake-based Approaches to Regulatory-Driven Technology Challenges
Data Lake-based Approaches to Regulatory-Driven Technology Challenges
 
Appliance Warehouse Service Plan.The discussion focuses on the.docx
Appliance Warehouse Service Plan.The discussion focuses on the.docxAppliance Warehouse Service Plan.The discussion focuses on the.docx
Appliance Warehouse Service Plan.The discussion focuses on the.docx
 
Appliance Warehouse Service Plan.The discussion focuses on the.docx
Appliance Warehouse Service Plan.The discussion focuses on the.docxAppliance Warehouse Service Plan.The discussion focuses on the.docx
Appliance Warehouse Service Plan.The discussion focuses on the.docx
 
Whitepaper: Datacenter Migration - Happiest Minds
Whitepaper: Datacenter Migration - Happiest MindsWhitepaper: Datacenter Migration - Happiest Minds
Whitepaper: Datacenter Migration - Happiest Minds
 
Agile Corporation for MIT
Agile Corporation for MITAgile Corporation for MIT
Agile Corporation for MIT
 
Data Mesh in Azure using Cloud Scale Analytics (WAF)
Data Mesh in Azure using Cloud Scale Analytics (WAF)Data Mesh in Azure using Cloud Scale Analytics (WAF)
Data Mesh in Azure using Cloud Scale Analytics (WAF)
 
Accelerating Time to Success for Your Big Data Initiatives
Accelerating Time to Success for Your Big Data InitiativesAccelerating Time to Success for Your Big Data Initiatives
Accelerating Time to Success for Your Big Data Initiatives
 
Calto Commercial RIS Systems
Calto Commercial RIS SystemsCalto Commercial RIS Systems
Calto Commercial RIS Systems
 
Accelerate Innovation & Productivity With Rapid Prototyping & Development - ...
Accelerate Innovation & Productivity With Rapid Prototyping & Development -  ...Accelerate Innovation & Productivity With Rapid Prototyping & Development -  ...
Accelerate Innovation & Productivity With Rapid Prototyping & Development - ...
 
What to focus on when choosing a Business Intelligence tool?
What to focus on when choosing a Business Intelligence tool? What to focus on when choosing a Business Intelligence tool?
What to focus on when choosing a Business Intelligence tool?
 
Greetings david cutler inform and connect
Greetings   david cutler inform and connectGreetings   david cutler inform and connect
Greetings david cutler inform and connect
 

December 2015 - TDWI Checklist Report - Seven Best Practices for Adapting DWA

  • 1. c1  TDWI RESEARCH tdwi.org TDWI CHECKLIST REPORT: SEVEN BEST PRACTICES FOR ADOPTING DATA WAREHOUSE AUTOMATION TDWI CHECKLIST REPORT TDWI RESEARCH tdwi.org Seven Best Practices for Adopting Data Warehouse Automation By David Loshin Sponsored by:
  • 2. 1  TDWI RESEARCH tdwi.org 2 FOREWORD 2 NUMBER ONE Analyze the cycle time for satisfying business requests 3 NUMBER TWO Devise a value model for comparing data warehousing alternatives 3 NUMBER THREE Architect a hybrid environment 4 NUMBER FOUR Be agile in ingesting new data sources 4 NUMBER FIVE Train data engineers to collaborate with business users 5 NUMBER SIX Support business self-service 5 NUMBER SEVEN Rethink the role of the BI competency center 6 CONSIDERATIONS Developing a feasible integration plan for data warehouse automation 7 ABOUT OUR SPONSOR 7 ABOUT THE AUTHOR 7 ABOUT TDWI RESEARCH 7 ABOUT TDWI CHECKLIST REPORTS © 2016 by TDWI (The Data Warehousing InstituteTM ), a division of 1105 Media, Inc. All rights reserved. Reproductions in whole or in part are prohibited except by written permission. E-mail requests or feedback to info@tdwi.org. Product and company names mentioned herein may be trademarks and/or registered trademarks of their respective companies. JANUARY 2016 SEVEN BEST PRACTICES FOR ADOPTING DATA WAREHOUSE AUTOMATION By David Loshin TABLE OF CONTENTS 555 S Renton Village Place, Ste. 700 Renton, WA 98057-3295 T 425.277.9126 F 425.687.2842 E info@tdwi.org tdwi.org TDWI CHECKLIST REPORT
  • 3. 2  TDWI RESEARCH tdwi.org TDWI CHECKLIST REPORT: SEVEN BEST PRACTICES FOR ADOPTING DATA WAREHOUSE AUTOMATION The word agile conveys a number of meanings. Its simple definition as an adjective is “quick and well-coordinated,” but the term has taken on additional meaning in the context of system design and development, largely in terms of rapid development, increased partnership among information technology (IT) and business partners, and leveraging teamwork to achieve short-term objectives that build toward solving more complex business challenges. Applying agile development methodologies to data warehousing promises a number of benefits: quicker development and movement into production, faster time to value, reduced start-up and overhead costs, and simplified access for business users. Yet there are some prerequisites to transitioning to using the agile approach for data warehousing: • Evaluate the existing environment to identify the best opportunities for achieving the benefits of adoption • Establish good practices for leveraging more agile technologies where possible • Identify the right technologies to facilitate that transition There are a number of technologies that accelerate design and development, improve cycle time in producing reports and analyses, and enhance the IT-business collaboration. Some are platform oriented, such as columnar databases, in-memory computing, and Hadoop, which all seek to leverage faster performance to improve analytical results. Alternatively, data warehouse automation (DWA) tools blend user requirements and repeatable processes to automatically generate the components of a data warehouse environment. These tools simplify the end-to-end production of a data warehouse, encompassing the entire development life cycle, including source system analysis, design, development, generation of data integration scripts, building, deployment, generation of documentation, testing, support for ongoing operations, impact analysis, and change management. We are rapidly moving away from the monolithic, single-system enterprise data warehouse and toward a hybrid environment that uses the most appropriate technologies to address specific data challenges. That environment will encompass many components and will benefit from reduced complexity through the use of tools like DWA. This checklist discusses seven practices for determining the value proposition of adopting DWA and establishing the foundation that will ease its adoption. FOREWORD The success of the conventional IT management approach to data warehousing and business intelligence (BI) has opened the door for a growing population of data warehouse consumers, both within and outside of the organization. Although many of these consumers’ needs are met by existing reports or by providing access for straightforward queries, a combination of factors has created a bottleneck in rapidly addressing business-user demands. Many business analysts are becoming more sophisticated in their investigations, requiring additional consulting from their IT counterparts to develop data extracts and reports. At the same time, though, IT budgets and staffing remain constrained. The result is that scheduling limited IT consulting resources elongates the time from when a business user requests a data product to the time when that data product is delivered (“cycle time”). There are two main risks of long cycle times. First, the data product is delivered after the window of opportunity for taking advantage of its results. 
Second, frustrated users may abandon the use of the enterprise data warehouse and adopt their own “shadow” reporting and analytics tools and methods, bypassing any governance procedures intended to ensure enterprisewide consistency. Long cycle times and development bottlenecks lead to missed business opportunities. Therefore, increasing agility by eliminating those bottlenecks will increase the benefits of your reporting and analytics investment. To find where those bottlenecks hinder productivity, analyze the end- to-end process for satisfying BI and analysis requests. It is likely that the source of the logjam is in the design-develop-test loop for new reports and extracts; of course, automation tools can reduce or even eliminate the development blockage and speed time to value. Analyzing the report development cycle time provides a benchmark for the time and resources required (in general) to satisfy business needs. This benchmark can be used as the starting point for optimizing existing processes as well as providing a metric for evaluation of how DWA tools can speed time to value. ANALYZE THE CYCLE TIME FOR SATISFYING BUSINESS REQUESTS NUMBER ONE
  • 4. 3  TDWI RESEARCH tdwi.org TDWI CHECKLIST REPORT: SEVEN BEST PRACTICES FOR ADOPTING DATA WAREHOUSE AUTOMATION The excitement over new technologies can sometimes overwhelm common sense. Organizations have invested significant amounts of money and staff effort in architecting, developing, and productionalizing their existing data warehouse and BI platforms. It would be surprising, risky, and generally unwise for any organization to completely rip out a trusted legacy data warehousing platform and replace it with systems generated using DWA tools or move directly to cloud-hosted data warehousing providers. A more conservative approach would embrace a transition strategy that incrementally migrates data warehousing and BI applications from legacy platforms to more agile environments. Start off by establishing an innovation lab within the existing environment in which new techniques can be piloted and evaluated for interoperability with existing systems. This allows different approaches to be considered while constraining the alternatives to ones that are compatible with the production environment. Design a hybrid data warehousing environment architecture that accommodates the introduction of new technologies and emerging agile development paradigms while maintaining the production operations of established conventional systems. Employ virtualization methods to provide layers on top of existing platforms, and abstract and differentiate the functionality from its implementation. Designing a hybrid architecture provides the flexibility to integrate different data warehousing approaches that have already been vetted. In turn, adopting the right technology mix allows the application teams to develop facades abstracting the underlying capabilities. This allows for reengineering without disrupting production systems, enables dual operations during a testing period to verify that the new approaches are trustworthy, and facilitates seamless transitions for the user community when the time comes to migrate applications. Adjustments to the data warehousing tools and platform infrastructure can potentially reduce development bottlenecks and speed time to value. This decision triggers the evaluation of vendors with products that purport to deliver on that promise, selecting candidate alternatives, and choosing one (or more) to integrate within the enterprise. Balance the benefits and costs of using existing data warehouse systems versus introducing newer technology. Recognize that introducing new technology into the organization requires more than just purchasing the license and installing the tool. It also requires design and development time, an integration effort, training to empower product users, and a communications plan to transition legacy users to modernized platforms. Any plan for data warehouse environment modernization must incorporate the cost and resources needed to support those tasks to assess the potential return on investment and to compare and contrast candidate technologies. Develop a value model for comparing data warehousing alternatives both against the existing platform as well as against each other. The key is to select the right variables for comparison that will lead to greater agility, lower costs, and better outcomes. Some variables for comparison include: • Application development complexity. How easy is it to design, configure, and deploy a new data warehouse or add new subject areas within an existing data warehouse environment? • Application development time. 
How long does it take for a data warehouse to become operational, focusing on the end-to-end process of design, development, and implementation? • Skills requirements. What types of skills are required by the development team members and how long does it take to acquire those skills? • Report development turnaround time. How long is the cycle time for new reports? • End-user ease of use. How easy is it to empower data consumers to use the developed data warehouse without IT support? • Resource requirements. What are the necessary resources for implementation? • Cost. What are the accumulated start-up and operational costs? This value model will provide quantitative comparisons to guide technology selection and is likely to highlight the value of DWA tools. DEVISE A VALUE MODEL FOR COMPARING DATA WAREHOUSING ALTERNATIVES ARCHITECT A HYBRID ENVIRONMENT NUMBER TWO NUMBER THREE
  • 5. 4  TDWI RESEARCH tdwi.org TDWI CHECKLIST REPORT: SEVEN BEST PRACTICES FOR ADOPTING DATA WAREHOUSE AUTOMATION Our world has evolved into one where new, diverse sources of massive amounts of data are emerging every day. From an analytics perspective, the impact of the explosion of data sets originating outside the typical enterprise is twofold. On the one hand, there is a growing availability of information that can inform internal subject area profiles, such as enhancing customer behavior information by analyzing streaming social media posts. On the other hand, the broad variety of both structured and unstructured data formats creates significant complexity for the data warehouse professionals who are unaccustomed to programming with Web services and APIs. The conventional approach to data warehousing focuses on ingesting one or two data sets at a time, originating from sources inside the organization. However, as the speed, volume, and diversity of externally sourced data grow, the organization must move faster in absorbing data sources and putting those data streams to productive use. Integrating new data sources requires that: • The incoming data set is profiled as part of a discovery process • Representative models are designed for data ingestion • The incoming data model elements are mapped to existing data warehouse data elements • Definitions are captured (or inferred) to ensure semantic consistency between new data sources and existing data models • The data warehouse model must be augmented to absorb any newly defined data elements of interest from the new data source • Transformations are programmed to align incoming data with the corresponding data elements in the existing warehouse model • The incoming data sets are continuously monitored to verify that the data exchange interface has not changed Taking these steps prior to ingestion requires effective project management, but ensuring the scheduling, operations, and fidelity of production intake processes may overwhelm those choosing to manually oversee the tasks. Data warehouse automation simplifies these processes by automating discovery and alignment of data source metadata with existing warehouse metadata and orchestrating data ingestion, transformation, and loading. The generated utilities are effectively self-documenting, enabling the developers to understand what the generated code does. Adopting the agile development methodology for data warehouse development suggests that the days of the IT data practitioner as the data warehouse gatekeeper are over. Business users are significantly more knowledgeable than they were during the early days of data warehousing, and they have a much lower dependence on IT staff to meet many of their more mundane needs when it comes to developing reports or performing simple queries. Yet, as more enlightened business analysts devise sophisticated analyses, the burden on IT staff only increases beyond the typical development, operations, and maintenance of the data warehouse platforms. Adopting DWA tools to support building and managing data warehouses reduces the IT staff’s burden for design and development. This provides more time for IT staff members to focus on helping business users evaluate their specific business problems and on how reporting and analysis can solve those problems (rather than delivering reports in a virtual vacuum). 
According to the Agile Alliance, the agile software development methodology emphasizes “close collaboration between the programmer team and business experts; face-to-face communication (as more efficient than written documentation); frequent delivery of new deployable business value; tight, self-organizing teams.”1 Fostering increased developer-user communication can reduce the cycle time for developing reports and more complex analyses. Such communication can also lead to increased user satisfaction. Train your data professionals to actively engage business users, effectively solicit business requirements, and (together with the business users) translate those requirements into directives to formulate reports and analyses that actually meet business needs. By relying on tools that can automate data discovery, warehouse modeling, and report creation, the collaborative knowledge transfer between IT staff and their business partners will educate the business analysts to become more self-sufficient. Business-user self-sufficiency triggers a virtuous cycle by yet again reducing the IT burden, freeing those resources to work with other business users to engender more self-sufficiency to provide ever more sophisticated analytics. BE AGILE IN INGESTING NEW DATA SOURCES TRAIN DATA ENGINEERS TO COLLABORATE WITH BUSINESS USERS NUMBER FOUR NUMBER FIVE 1 “What Is Agile Software Development?” downloaded November 6, 2015 from http://www.agilealliance.org/the-alliance/what-is-agile.
NUMBER SIX
SUPPORT BUSINESS SELF-SERVICE

If becoming more agile means encouraging closer collaboration between developers and business experts, we can take that a step further in the data warehousing and BI world: transferring skills from IT professionals to business experts makes those experts self-sufficient. In effect, creating a more collaborative environment between the IT/data staff and the business users increases business-user independence and facilitates increased information utilization. At the same time, empowering self-sufficient business users reduces business dependence on IT staff, reduces the IT costs of supporting enhanced data use, and frees IT resources to focus on BI functionality, improved analytical precision, and expanded analytical services.

Self-service BI leverages intuitive user interfaces and data accessibility functionality that both guide the user in designing reports and analyses and exercise controls over what can and cannot be accessed. In addition, self-service BI relies on the availability of a semantic metadata repository that lists business terms and table column names, providing a shared glossary to ensure consistency of use when formulating new reports (a minimal sketch follows this section).

One of the key benefits of self-service BI is that, as business users become more adept at developing their own analyses, they can speed the review cycle for discovering actionable knowledge. However, supporting this decreased cycle time requires comparable speed in data warehouse development. Agile tools such as DWA can be used to quickly implement the warehouses and marts that give end users the flexibility to configure their own reports while limiting access to data needing additional protection.
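As one illustration only (the report names no specific repository technology), a semantic metadata layer can be reduced to a small mapping from governed business terms to physical columns, consulted when a self-service report is formulated. The glossary entries, table names, and privileged-user flag below are hypothetical.

from dataclasses import dataclass


@dataclass
class GlossaryEntry:
    """One governed business term and the physical column it resolves to."""
    definition: str
    table: str
    column: str
    restricted: bool  # marks data needing additional protection


# Hypothetical entries; a real repository would be curated by data stewards.
GLOSSARY = {
    "customer lifetime value": GlossaryEntry(
        "Projected net revenue per customer", "dw.customer_metrics", "clv", False
    ),
    "account balance": GlossaryEntry(
        "Current ledger balance", "dw.accounts", "balance", True
    ),
}


def resolve_term(term: str, user_is_privileged: bool) -> str:
    """Translate a business term into a qualified column reference,
    enforcing the access controls self-service BI depends on."""
    entry = GLOSSARY[term.lower()]
    if entry.restricted and not user_is_privileged:
        raise PermissionError(f"'{term}' requires elevated access")
    return f"{entry.table}.{entry.column}"


print(resolve_term("Customer Lifetime Value", user_is_privileged=False))

The point of the sketch is the shared vocabulary: every self-service report formulated against the glossary resolves a business term to the same column, which is what keeps independently built analyses consistent.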
NUMBER SEVEN
RETHINK THE ROLE OF THE BI COMPETENCY CENTER

The notion of a business intelligence competency center (BICC) was motivated by the early challenges associated with data warehouse development and deployment, specifically around data integration, population of the data warehouse, and coordination among business users to ensure data quality and consistency and to leverage the investment in data warehouse technologies. The original goal of the BICC was to standardize policies for data warehouse use, centralize the support of BI, and develop good practices and repeatable processes across the organization.

In practice, though, two conflicting facets have hampered the success of the BICC in many environments. The first is the difficulty of organizing an IT team to enforce policies across business-function boundaries. The second is the constraint on flexibility imposed as a way to enforce data usage policies and standards. As a result, the BICC often becomes a bottleneck as decisions about technology and architecture are subsumed within demands for report development and configuration. Essentially, strong centralized oversight reduces organizational agility, limiting the gains business users realize from unfettered data exploration and discovery.

As organizations look to more agile methods to give business users control over their data use more quickly, it may be time to reevaluate the goals of the BICC, determine where artificial barriers to knowledge have been institutionalized, and assess how the BICC can be adapted to meet the needs of an increasingly enlightened business community. Examine how to revise the BICC's charter to differentiate between operational decision making (the day-to-day tactical management of the data warehouse environment) and strategic decision making (evolving the environment over time). Adopt guiding principles that expand data warehouse utilization, such as simplifying the organizational data warehousing architecture, acquiring tools that reduce overall time to value, simplifying the repeatable processes for data warehouse instantiation, and increasing hands-on training and knowledge transfer with business partners. As business use increases, encourage business experts to take on roles within the structure of the BICC so that its management and oversight can be diffused among all stakeholders.
CONSIDERATIONS
DEVELOPING A FEASIBLE INTEGRATION PLAN FOR DATA WAREHOUSE AUTOMATION

Business analysts have acquired a more cultivated awareness of data discovery, data preparation, report formulation, and predictive analytics and are increasingly taking control of their own reports and analyses. These same data consumers are benefiting from the explosion of data sources that can be captured within the analytics environment. However, as the number of data sources increases and the data sets become more diverse, maintaining a competitive advantage hinges on the speed at which new data sources can be acquired, ingested, and prepared for discovery and analysis.

It is unlikely that any organization will completely rip out its existing data warehouses and replace them with any specific new technology. Instead, the future data warehouse environment will be an evolving hybrid composed of conventional data warehouse architectures and high-performance components layered on the Hadoop ecosystem, perhaps including specialty appliances or software accelerators such as columnar databases and in-memory computing systems. As this evolution proceeds, data warehouse automation tools help maintain agility while supporting the demands of legacy customers.

The items on this checklist help prepare your organization to define assessment criteria, evaluate the existing environment, determine where there are opportunities for agility, and adopt DWA tools as a way to accelerate the transformation of the enterprise analytics environment. When evaluating alternatives, note how the tools guide developers in source analysis, in capturing requirements, and in automatically generating the data integration, loading, and presentation components of a data warehouse (a sketch of this generation pattern appears after this section).

The next step is to develop an integration plan. Pilot the tools by focusing on developing a data warehouse to support a specific business function or to analyze a specific subject area (such as customers or vendors). Align your development methodology with the way developers utilize the tools and the ways users expect to use the resulting warehouse. Reflect on lessons learned in terms of rapid design cycles, empowering users with self-service capabilities, and ingesting new data sources. These lessons will inform repeatable processes that will guide the evolution of the future data warehouse environment.
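As a sketch of the generation pattern only (not any vendor's implementation), a DWA-style tool can be thought of as rendering repeatable load code from a declarative source-to-target mapping. The mapping dictionary and MERGE template here are assumptions for illustration.

# Minimal sketch of metadata-driven code generation: a declarative
# source-to-target mapping is rendered into a repeatable load statement.
mapping = {  # hypothetical mapping captured during source analysis
    "target": "dw.dim_customer",
    "source": "staging.crm_customers",
    "key": "customer_id",
    "columns": {"customer_id": "id", "full_name": "name", "region": "region_code"},
}


def generate_merge_sql(m: dict) -> str:
    """Render an ANSI-style MERGE from the mapping, so the same metadata
    that drives the load also documents what the generated code does."""
    select_list = ",\n       ".join(
        f"{src} AS {tgt}" for tgt, src in m["columns"].items()
    )
    updates = ",\n       ".join(
        f"t.{c} = s.{c}" for c in m["columns"] if c != m["key"]
    )
    cols = ", ".join(m["columns"])
    vals = ", ".join(f"s.{c}" for c in m["columns"])
    return (
        f"MERGE INTO {m['target']} t\n"
        f"USING (SELECT {select_list} FROM {m['source']}) s\n"
        f"ON t.{m['key']} = s.{m['key']}\n"
        f"WHEN MATCHED THEN UPDATE SET\n       {updates}\n"
        f"WHEN NOT MATCHED THEN INSERT ({cols}) VALUES ({vals});"
    )


print(generate_merge_sql(mapping))

Because the mapping, not the hand-written SQL, is the artifact that developers maintain, regenerating the load code after a model change is a metadata edit rather than a rewrite, which is the repeatability the pilot should be evaluating.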
ABOUT OUR SPONSOR

timextender.com

TimeXtender is a world-leading data warehouse automation vendor dedicated to Microsoft SQL Server. The TimeXtender software, TX DWA, revolutionizes the way a data warehouse is developed and maintained by automating all manual data warehouse processes, from design to development, operation, maintenance, and change management. TimeXtender ensures an improved and inexpensive solution that is fully documented. The TX DWA software enables medium and large data-driven enterprises to get business intelligence done faster, more efficiently, and with less stress by providing "one truth" to improve decision-making processes and overall business performance, reducing costs and saving valuable time. TX DWA turns a business intelligence project into a business intelligence process with a flexible solution that expands as the business evolves and grows. TimeXtender collaborates with VAR and OEM partners across six continents, providing more than 2,600 customers in 61 countries with its advanced data warehouse automation software. Why wait days before taking action? With TimeXtender, data is available at your fingertips in mere hours!

ABOUT THE AUTHOR

David Loshin, president of Knowledge Integrity, Inc. (www.knowledge-integrity.com), is a recognized thought leader, TDWI instructor, and expert consultant in the areas of data management and business intelligence. David is a prolific author on business intelligence best practices, having written numerous books and papers on data management, including Big Data Analytics: From Strategic Planning to Enterprise Integration with Tools, Techniques, NoSQL, and Graph and The Practitioner's Guide to Data Quality Improvement, with additional content provided at www.dataqualitybook.com. David is a frequent invited speaker at conferences, Web seminars, and sponsored websites and channels, including TechTarget and The Bloor Group. His best-selling book Master Data Management has been endorsed by many data management industry leaders. David can be reached at loshin@knowledge-integrity.com.

ABOUT TDWI RESEARCH

TDWI Research provides research and advice for data professionals worldwide. TDWI Research focuses exclusively on business intelligence, data warehousing, and analytics issues and teams up with industry thought leaders and practitioners to deliver both broad and deep understanding of the business and technical challenges surrounding the deployment and use of business intelligence, data warehousing, and analytics solutions. TDWI Research offers in-depth research reports, commentary, inquiry services, and topical conferences as well as strategic planning services to user and vendor organizations.

ABOUT TDWI CHECKLIST REPORTS

TDWI Checklist Reports provide an overview of success factors for a specific project in business intelligence, data warehousing, or a related data management discipline. Companies may use this overview to get organized before beginning a project or to identify goals and areas of improvement for current projects.