White Paper Data Quality Process Design For Ad Hoc Reporting

    White Paper Data Quality Process Design For Ad Hoc Reporting - Presentation Transcript

    1. A McKeel Research LLC White Paper
       455 Newport Way, Suite 103
       Issaquah, Washington 98027
       (425) 996-0427

       Data Quality Process Design for Ad-Hoc Reporting
       By Jim Atwater, Principal Consultant
       Management Analytics Practice
       September 2008
    2. Data Quality Process Design for Ad-Hoc Reporting
       McKeel Research, LLC. All rights reserved.

       Contents
       • Introduction
       • Problem Statement
       • Previous Options
       • Our Solution
       • Implementation
       • Summary

       Introduction
       This white paper provides an overview of some of the key objects contained within a baseline data-cleansing subsystem for use by ad-hoc reporting solutions, be they relational, dimensional or somewhere in between. The key scenario is based on experience in enterprise sales and marketing work groups responsible for metrics and analytics.

       Problem Statement
       Business organizations have come to realize the value of dimensional data modeling. This is particularly the case when it comes to the “one version of the truth” level of rigor such systems bring to issues of data quality. Unfortunately, complexity inherent in a proper data warehouse implementation puts such tactics outside the reach of many sales and marketing workgroups, even in large enterprise organizations. Barriers include lack of skilled resources, time and commitment required in the analysis phase, and expense compared to relationally-based legacy ad-hoc reporting solutions.

       Previous Options
       Legacy relational solutions typically build reporting solutions directly on source-system data. Data cleansing and auditing is typically compiled after the fact by analysts as footnotes to the reports. This practice wastes time, causes errors, and leaves a rich source of analytical information untapped. As such workgroups evolve, the most common errors tend to surface by virtue of their repetition and lead to “fixes” in the reports themselves, usually along the lines of computations within the reports that only serve to obfuscate the source data.
    3. Our Solution
       Our solution is to leverage key data quality aspects of the transform procedures detailed by the Kimball Group for enterprise data warehousing solutions. This approach provides three key benefits:
       • More robust data quality
       • Integrity of the source system data
       • A “glide path” toward the data warehouse

       Data Quality Benefits
       The basis of our solution lies in a metadata store of specific screens, each of which serves to quantify specific aspects of each data record. Screens can enforce column properties within each record, the structural relation of columns to each other, or logical business rules that check individual or aggregate data values. The upshot is a data quality score that is applied to each record. The added value is that data quality metrics are an authentic data source. They guide both report owners and producers to concentrate data cleansing efforts on the source systems where they belong.

       Source System Data Integrity
       Data integrity is preserved in a pristine state by virtue of the separation of data between the source systems and the QA screen metrics. Chiefly, the QA metrics take the form of an audit dimension whose columns can be either integrated into existing report queries or delivered separately in the resulting workbook or deck. This simple benefit guarantees one version of the truth while maintaining an informed level of trust that is otherwise mixed into the reporting data stream.

       Data Warehouse “Glide Path”
       By implementing the accepted best practice for data quality in the data warehousing field, workgroups have armed themselves with metadata that is easily understood by data warehouse implementers. More importantly, they have purchased for themselves a “seat at the table” in future cost containment and report centralization efforts.

       Implementation
       Implementation of the solution is designed to fit into the existing workflow of a typical sales or marketing analytics team. Automation of the existing reports and the standard “what decisions do you make using this data” kinds of analysis form the normal weekly workflow. These efforts lead to the screen definitions. This effort is actualized by the baseline data quality code within the Microsoft SQL Server Integration Services (SSIS) toolset. Once the codebase is in place, the screens are brought to bear and the key error and audit deliverables mature naturally over time.

       Summary
       Data quality is something all ad-hoc reporting systems do at some point. Ideally, before your V.P. pitches a fit in the middle of a big meeting. By building in a metadata-driven data screening facility, this solution adds auditing and error handling to the existing reporting and pays tangible dividends going forward.
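To make the screen and audit-dimension ideas from the Data Quality Benefits and Source System Data Integrity sections more concrete, here is a minimal Python sketch. It is not the SSIS codebase the paper refers to. The Screen class, the three example screens, the penalty weights, the 0 to 100 scoring rule, and the sample sales records are all assumptions made up for illustration. The sketch shows one screen of each kind (column property, structural relation, business rule), a per-record quality score, and audit rows held apart from the source records.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Record = Dict[str, object]


@dataclass(frozen=True)
class Screen:
    """One entry in a hypothetical screen metadata store."""
    name: str
    kind: str                         # "column", "structure", or "business_rule"
    penalty: int                      # assumed points deducted when the screen fails
    passes: Callable[[Record], bool]  # returns True when the record is clean


# Illustrative screens, one per kind named in the paper.
SCREENS: List[Screen] = [
    Screen("amount_not_null", "column", 50,
           lambda r: r.get("amount") is not None),
    Screen("close_after_open", "structure", 30,
           lambda r: r["close_date"] >= r["open_date"]),  # ISO dates compare as text
    Screen("discount_at_most_50pct", "business_rule", 10,
           lambda r: r["discount"] <= 0.5),
]


def audit_row(record_id: int, record: Record, screens: List[Screen]) -> Dict[str, object]:
    """Build one audit row for a record without modifying the record."""
    failed = [s for s in screens if not s.passes(record)]
    # Assumed scoring rule: start at 100 and subtract each failed screen's penalty.
    score = max(0, 100 - sum(s.penalty for s in failed))
    return {
        "record_id": record_id,
        "quality_score": score,
        "failed_screens": [s.name for s in failed],
    }


# Made-up source records standing in for source-system data.
records: Dict[int, Record] = {
    1: {"amount": 1200.0, "open_date": "2008-07-01",
        "close_date": "2008-08-15", "discount": 0.10},
    2: {"amount": None, "open_date": "2008-09-01",
        "close_date": "2008-08-15", "discount": 0.75},
}

# Audit rows are stored separately and joined back to reports by record_id.
audit_rows = [audit_row(rid, rec, SCREENS) for rid, rec in records.items()]

for row in audit_rows:
    print(row)
```

Because the audit rows are keyed by record_id and the source records are never changed, a report can either join the score into an existing query or present the audit rows on their own, which mirrors the separation the paper describes.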
