Information Architecture and DWH with PowerDesigner
Presentation on the importance of PowerDesigner in DWH and IA projects, given at the PowerDesigner event held by Sybase Türkiye on 14 June 2012.

  1. INFORMATION ARCHITECTURE FOR DWH PROJECTS
     RUAIRI PRENDIVILLE, SENIOR CONSULTANT, SYBASE (UK)
     JUNE 14TH, 2012, ISTANBUL
  2. AGENDA
     • Introduction to DWH
        – Are DWH projects complex? Challenges, requirements
     • Information Architecture (EA) for DWH
        – Models, Workflows, Artifacts
     • Mapping over Sybase PowerDesigner
        – Mapping to IA models and artifacts
        – Features of interest for DWH
     • Demonstration
        – One example of IA architecture on the project
     • DWH Referent IA
        – Layers and recommendations
     Company Confidential – June 18, 2012
  3. COMPLEXITY OF DWH PROJECTS: ARE DWH PROJECTS COMPLEX?
  5. BI/DWH COMPLEXITY: Causes, sources
     – Data Sources
        • Different sources, technologies, business functions, legacy, overlapping, concepts, elements
     – Scope and Performance
        • Never enough, never on time, content variations (!)
     – Participants
        • Different backgrounds, knowledge, skills, motivation, visions
     – Requirements
        • Continuous changes and extensions
     – Growth and Development
        • Volume of data, people, reports and analysis
     – Quality
        • Clean, right, correct on time, cleansing (division of resp.)
  6. BI/DWH COMPLEXITY: Change, Heterogeneity
     – Never enough
        • Users/BI analysts cannot give a definite, detailed, complete and precise specification of all reports/views in advance (!)
        • Changes arrive at the end; they are inevitable and continual
     – Never on time
        • Every change should be implemented and usable within a useful time frame, before the user forgets about it
     – Never one and exactly one data source
        • Different sources result in:
           – Different DMS technology, different refresh rate, different volume, different performance (management)
           – Data overlapping (consolidation)
  7. BI/DWH COMPLEXITY: Volume and growth, data quality
     – Large volume and intensive growth are inevitable
        • Operational Data Sources
           – Keep only the data set needed for operational work (a year?)
           – Keep it in a shape suitable for operational work (Relational)
        • Analytical Extension
           – Keep the data needed for sound analysis (many years?)
           – Keep it in a shape suitable for analysis (MDM)
        • Growth is inevitable over time and in volume
        • Compromises:
           – Time: keep the last (x) months or a representative sample
           – Nobody is actually happy, neither IT nor Business
     – Data Quality
        • A clean, consolidated data source does not exist
        • Every data source needs constant "housekeeping" (Data Entry/ETL)
  8. BI/DWH COMPLEXITY: Performance
     – Never fast enough
        • Use of technology not designed for or suited to analytics (RDBMS)
        • Intensive use of "ad hoc" requests creates an indexing problem (RDBMS)
           – Free exploration over an arbitrary data set is heavily limited
        • Very intensive and heavy administration: a never-ending story
     – One and only one complete "version of the truth"
        • Similar or overlapping analyses present different data (!!)
           – Which one is correct? What about the rest?
        • You are the publisher and hold the responsibilities
           – Hold the readers' trust
           – Publish on a regular basis
           – Use a variety of sources and edit them with quality and consistency
        • Data consistency must be established and protected
  9. INFORMATION ARCHITECTURE FOR DWH: WORKFLOWS AND MODELS
  10. INFORMATION ARCHITECTURE FOR DWH: Position of EA
  11. INFORMATION ARCHITECTURE FOR DWH: Position of IA
      – Motivation, goals, business principles, organizational structure, Business Functions, Services and Processes
      – IT support for Business Architecture: System Services, Applications, Databases, Components, Forms, Reports, Data Flows...
      – Technology for IS Architecture: network, servers, installed instances, access points
      – From current to planned
  12. INFORMATION ARCHITECTURE FOR DWH: Method (ADM)
      – To define the Architecture Development Method
         • Define at any point of the project who is doing what, how and when
         • Define Phases, Workflows, Artifacts, Models and Deliverables
      – Essential for DWH/BI with a backward requirement process
      – The presented ADM and IA are:
         • Agile – simplification of RUP, TOGAF
         • Iterative and Incremental – cyclic repetition of workflows
         • Data Driven – based on Data Assets
         • Comprehensive – includes all activities, including maintenance and RFC
         • Model Driven – all artifacts are represented with modeling artifacts
         • Requirement Driven – requirements are placed in the center of the methodology
         • Sustainable – at any point knowledge is collected, formally specified and properly presented
  13. INFORMATION ARCHITECTURE FOR DWH: Main ADM Cycle
      – Requirements & Constraints in the center
      – Not all are mandatory
      – Presented main cycle, others possible
      – Many cycles are expected
      – All that is needed to obtain a sustainable system
         • Development
         • Deployment
         • Maintenance
      – Active, in IME
  14. INFORMATION ARCHITECTURE FOR DWH: B. Data Analysis
      – Objective:
         • Discover, identify, collect, elaborate, specify, define and present Data Assets
         • Different abstraction levels: from conceptual to implementation
      – Viewpoints:
         • Architectural viewpoint: Data Providers and Consumers, data flow process, engaged systems, applications, components, usage, access rights
         • Structural viewpoint: structure, attributes, relationships, dependencies, rules applied on conceptual and physical level
      – Inputs:
         • IA (others not in the scope)
      – Outputs:
         • DA, Sources Conceptual and Physical Data
  15. INFORMATION ARCHITECTURE FOR DWH: C. DWH Design and Implementation
      – Objective:
         • Design and implementation (D&I) of an integrated and unified data collection, organized along the dimension of time and subject oriented, used for analysis, planning and evaluation of business performance
         • Establish a common view (unified/integrated/complete) over the enterprise data, provide a stable source of historical information, accommodate data growth
      – Activities:
         • Full and detailed schema specification for DWH, Staging and ODS
      – Inputs:
         • IA & DA, Sources Conceptual/Physical Data
      – Outputs:
         • IA & DA, Conceptual and Physical DWH
  16. INFORMATION ARCHITECTURE FOR DWH: D. ETL Design and Implementation
      – Objective:
         • To analyze, elaborate, define, specify, present and implement full and incremental ETL flows between source2staging, staging2DWH, DWH2MDM
         • To discover, identify, collect, elaborate, specify, define and present all characteristics of ETL flows
      – Activities:
         • Extraction Method, Schema, Condition and Frequency for increments
            – Source 2 Target Mapping
         • Transformation processes on the appropriate level of detail
         • Data Flow Architecture, Trash management
      – Inputs:
         • IA & DA, Sources Conceptual/Physical Data and DWH
      – Outputs:
         • DA, Data Flow, Sources Conceptual/Physical Data
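The incremental extraction and Source 2 Target mapping named on this slide can be sketched roughly as follows. This is a minimal illustration, not the presenter's method: the watermark approach, the column names and the `SOURCE_TO_TARGET` table are all assumptions introduced for the example.

```python
from datetime import datetime

# Hypothetical source-to-target mapping: source column -> (target column, transformation).
SOURCE_TO_TARGET = {
    "cust_id": ("customer_key", int),
    "cust_name": ("customer_name", str.strip),
    "updated": ("load_timestamp", lambda value: value),
}


def extract_increment(rows, last_watermark):
    """Return the rows changed after the watermark and the new watermark."""
    changed = [row for row in rows if row["updated"] > last_watermark]
    new_watermark = max((row["updated"] for row in changed), default=last_watermark)
    return changed, new_watermark


def transform(row):
    """Apply the source-to-target mapping to one source row."""
    return {
        target: convert(row[source])
        for source, (target, convert) in SOURCE_TO_TARGET.items()
    }


# Assumed sample data standing in for an operational source table.
source_rows = [
    {"cust_id": "1", "cust_name": " Ada ", "updated": datetime(2012, 6, 1)},
    {"cust_id": "2", "cust_name": "Bob", "updated": datetime(2012, 6, 10)},
]

staging = []
changed, watermark = extract_increment(source_rows, datetime(2012, 6, 5))
staging.extend(transform(row) for row in changed)
```

Only the row modified after the stored watermark reaches staging; the returned watermark would be persisted for the next increment.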
  17. INFORMATION ARCHITECTURE FOR DWH: E. BI Design and Implementation
      – Objective:
         • Establish Multidimensional space (Business Universe) with Facts, Measures, Dimensions and Hierarchies
         • Build visualization including Reports, Dashboards, OLAP views
         • Check if the requested KPI set is supported and presented
      – Activities:
         • MDM Space (above)
         • Detailed specification of requested KPI with mapping
         • Detailed specification of Reports, Dashboards and OLAP Views
         • Access rights and delivery mechanisms
      – Inputs:
         • DWH, IA and DA
      – Outputs:
         • IA, DA, DWH (MDM) models
  18. INFORMATION ARCHITECTURE FOR DWH: Paths – Simplified Analysis, Design and Implementation
      – Generally many paths are possible
      – Gap Analysis may discover missing info
      – Analysis, Design and Implementation of DWH, ETL and BI are tightly interconnected and dependent on each other
  19. INFORMATION ARCHITECTURE FOR DWH: Models – Business & Information Architecture
      – Specifies
         • Application systems and applications
         • Data Assets, Databases, Data Source/Destination, Data Providers/Consumers
         • Usage of Data Assets and Applications, cooperation and collaboration of Applications and/or services
         • Ownership of the Data Assets, Applications and Services, elements of SLA
         • ETL procedures on a high abstraction level
      – Viewpoints
         • Architecture of the system
      – Represents a "hat" for the rest of the system
  20. INFORMATION ARCHITECTURE FOR DWH: Models – Source Model
      – Specifies
         • Details to understand structural relationships and meaning (conceptual)
         • Internal structure of the data source with implementation details (physical)
            – Tables, columns, views, keys, procedures, indexes, rights, constraints, triggers
         • Consolidated by bidirectional synchronization and associated transformation
         • Extraction scheme for every data source
         • Source for Source2Target mapping
      – Viewpoint
         • One or more diagrams per source to represent a subject area
      – Used to synchronize changes from source into DWH IA
         • Starting point for change management
  21. INFORMATION ARCHITECTURE FOR DWH: Models – DWH model
      – Specifies
         • Details to understand structural relationships and meaning (conceptual)
         • Internal structure of DWH with implementation details (physical)
         • Internal MDM structure of Data Marts (physical)
         • For Staging, Trash, ODS, DWH and MDM
         • Target for Source2Target mapping
         • Relational2MDM mapping
      – Viewpoints
         • One or more diagrams to represent a subject area
         • One or more MDM diagrams to represent Data Marts
      – Used to synchronize changes from DWH IA to actual RDBMS
         • Ending point for change management
  22. INFORMATION ARCHITECTURE FOR DWH: Models – Data Flow model
      – Specifies
         • Connections from all important data sources to their destinations
         • Data flows from source to destination with all attributes and constraints
            – Characteristics of flow processes, source and destination tables, kind of flow (ETL, replication or federation), integration preconditions, integration service used, possible outcomes, etc.
         • Source2Target mapping within every step of the flow
      – Viewpoints
         • One or more diagrams to represent the actual flow task
            – Aggregation, sort, filter, projection, split, join, merge, lookup
         • One or more diagrams to represent the transformation control flow
      – Used to present integration, consolidation and migration
  23. MAPPING ON SYBASE POWERDESIGNER: DWH RELATED MODELS AND FEATURES
  24. MAPPING ON SYBASE POWERDESIGNER: Data Models
  25. MAPPING ON SYBASE POWERDESIGNER: Architecture and Requirements
  26. MAPPING ON SYBASE POWERDESIGNER: DWH related features – Dependency Matrix
      – What
         • Two-dimensional hierarchical matrix
         • Present, review and create/delete links of a particular kind between two artifacts
            – Any model, any diagram, any two artifacts
            – Indirect (two or more links) dependency, drilling
         • Hierarchy of objects on row/column, Copy to CSV
      – Reasoning
         • Full, rich, useful dependency analysis (network)
         • EAM to understand and present dependencies between Data Assets and Data Providers and Consumers
         • PDM to create a mapping overview
         • DMM to present actual source-to-target dependencies
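To make the idea of a dependency matrix concrete, here is a small sketch that builds a two-dimensional matrix from a set of links and exports it as CSV. It is independent of PowerDesigner and does not use its API; the provider and data asset names are invented for illustration.

```python
import csv
import io

# Hypothetical links between data providers (rows) and data assets (columns).
PROVIDER_LINKS = {("CRM", "Customer"), ("ERP", "Order"), ("ERP", "Customer")}


def dependency_matrix(links):
    """Return sorted row labels, sorted column labels and a grid of booleans."""
    rows = sorted({row for row, _ in links})
    cols = sorted({col for _, col in links})
    cells = [[(row, col) in links for col in cols] for row in rows]
    return rows, cols, cells


def to_csv(rows, cols, cells):
    """Render the matrix as CSV text, marking each existing link with an X."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow([""] + cols)
    for label, line in zip(rows, cells):
        writer.writerow([label] + ["X" if linked else "" for linked in line])
    return buffer.getvalue()


rows, cols, cells = dependency_matrix(PROVIDER_LINKS)
matrix_csv = to_csv(rows, cols, cells)
```

Each `X` marks a direct link; indirect dependencies (two or more links) would need a traversal over the links rather than a single lookup.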
  27. MAPPING ON SYBASE POWERDESIGNER: DWH related features – Mappings
      – What
         • Modeling connections between objects
            – Mapping with transformation (O/R, R/R, O/O)
            – Generation (Generate Mappings)
         • Wizard to convert mappings into Transformation Task (ILM)
         • Mapping Editor
      – Reasoning
         • Data Flow specification
         • Relational to Multidimensional
            – DWH2MDM
         • Relational to Relational
            – Source2Target
         • Any descriptive dependency
            – Federation concept (not Replication or ETL)
  28. MAPPING ON SYBASE POWERDESIGNER: DWH related features – Impact/Lineage Analysis
      – What
         • Impact Analysis – consequences of a change
         • Lineage Analysis – objects forming the basis for an object
         • Temporary View for Review
         • IAM for permanent view, snapshot (Drilling, Exploring)
         • Analysis Rules changeable (Impact/Lineage)
      – Reasoning
         • Change Management evaluation, estimation and planning
            – To assess change impact before it happens (costs, time, resources)
         • Snapshots – development points, different versions of the system
         • Meta Data BI
            – To explore the meta-data set, discover implicit dependencies
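Impact and lineage analysis can both be read as reachability over a dependency graph, followed in opposite directions. The sketch below illustrates that reading with a breadth-first traversal; the object names and the `DEPENDENCIES` list are hypothetical, and this is not how PowerDesigner implements the feature.

```python
from collections import defaultdict, deque

# Hypothetical dependency links: (source object, object derived from it).
DEPENDENCIES = [
    ("crm.customer", "staging.customer"),
    ("staging.customer", "dwh.dim_customer"),
    ("dwh.dim_customer", "mart.sales_cube"),
    ("erp.orders", "dwh.fact_sales"),
    ("dwh.fact_sales", "mart.sales_cube"),
]


def reachable(start, edges):
    """Breadth-first search: every node reachable from start along the edges."""
    graph = defaultdict(list)
    for origin, derived in edges:
        graph[origin].append(derived)
    seen, queue = set(), deque([start])
    while queue:
        node = queue.popleft()
        for successor in graph[node]:
            if successor not in seen:
                seen.add(successor)
                queue.append(successor)
    return seen


def impact(obj):
    """Objects affected, directly or indirectly, by a change to obj."""
    return reachable(obj, DEPENDENCIES)


def lineage(obj):
    """Objects that obj is built from, found by following the links backwards."""
    return reachable(obj, [(derived, origin) for origin, derived in DEPENDENCIES])
```

Calling `impact("crm.customer")` lists everything downstream of the source table, while `lineage("mart.sales_cube")` lists everything the cube is derived from.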
  29. DEMONSTRATION: EXAMPLE
  30. REFERENT INFORMATION ARCHITECTURE: MAIN LAYERS OF INFORMATION ARCHITECTURE
  31. REFERENT INFORMATION ARCHITECTURE: Recommended architecture – example
  32. REFERENT INFORMATION ARCHITECTURE: Data Access Layer – recommendations
      – Set of processes, tools, activities and models:
         • Data extraction (E) from operational system to DWH
         • Transformation to a suitable shape (T)
            – Data cleansing, consistency check, integrity
            – Translation from operational to enterprise format
            – Enterprise DWH data structure is inevitably different from the operational one
         • Data loading into DWH (L)
      – Recommendations:
         • Understand OS structure, rules and dynamics
         • Dynamics of data refresh rate should be realistic
         • Changed Data Capture algorithm: intrusive/non-intrusive
         • Apply effective ETL/ELT, use a staging area
         • Document everything: changes are very intensive and complex
            – Mappings between Data Sources and DWH Destination
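The slide does not say which changed data capture algorithm is meant. As one possible illustration of an approach that leaves the operational system untouched, the sketch below compares two extracted snapshots keyed by primary key; the snapshot-comparison technique and the sample rows are assumptions made for this example.

```python
def diff_snapshots(previous, current):
    """Compare two snapshots keyed by primary key and classify the changes."""
    inserts = {key: row for key, row in current.items() if key not in previous}
    deletes = {key: row for key, row in previous.items() if key not in current}
    updates = {
        key: row
        for key, row in current.items()
        if key in previous and previous[key] != row
    }
    return inserts, updates, deletes


# Assumed sample snapshots of a customer table: primary key -> row contents.
yesterday = {1: {"name": "Ada", "city": "London"}, 2: {"name": "Bob", "city": "Izmir"}}
today = {1: {"name": "Ada", "city": "Istanbul"}, 3: {"name": "Cem", "city": "Ankara"}}

inserts, updates, deletes = diff_snapshots(yesterday, today)
```

An intrusive alternative would instead add triggers or change flags to the operational tables, trading load on the source system for cheaper change detection.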
  33. REFERENT INFORMATION ARCHITECTURE: DWH Layer and Recommendations
      – DWH, Staging, Trash and ODS
         • Common view on enterprise data, regardless of how/who will use it
         • Unification offers flexibility in how the data is later interpreted
         • A stable source of historical information
         • Efficient accommodation of a data explosion (growth)
         • Supply data for the Analytical layer at the required granularity
         • Trash and Alerts & Matching to jump over initial cleansing (blocker)
      – Recommendations:
         • Use a pre-packaged solution and existing experience
         • Use relational and multidimensional modeling (document all)
         • Relational to address performance issues, follow the OS paradigm
         • Multidimensional to present the Business Universe for Analytics later
         • Use views to transform and map Relational to Multidimensional and back
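The last recommendation, using views to map relational structures to a multidimensional shape, might look roughly like the following. The sketch uses Python's built-in `sqlite3` module only so that it is runnable; the table, view and column names are hypothetical and the slide does not prescribe any particular database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Relational tables (hypothetical names), following the operational paradigm.
    CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, region TEXT);
    CREATE TABLE sale (
        sale_id INTEGER PRIMARY KEY,
        customer_id INTEGER,
        sold_on TEXT,
        amount REAL
    );
    INSERT INTO customer VALUES (1, 'North'), (2, 'South');
    INSERT INTO sale VALUES
        (1, 1, '2012-06-01', 100.0),
        (2, 1, '2012-06-02', 50.0),
        (3, 2, '2012-06-01', 70.0);

    -- Views that expose the same data as a dimension and a fact.
    CREATE VIEW dim_customer AS
        SELECT customer_id AS customer_key, region FROM customer;
    CREATE VIEW fact_sales AS
        SELECT customer_id AS customer_key, sold_on AS date_key, amount AS sales_amount
        FROM sale;
    """
)

# An analytical query written purely against the multidimensional views.
totals = conn.execute(
    """
    SELECT d.region, SUM(f.sales_amount)
    FROM fact_sales AS f
    JOIN dim_customer AS d ON f.customer_key = d.customer_key
    GROUP BY d.region
    ORDER BY d.region
    """
).fetchall()
```

Because the views hold no data of their own, the relational tables stay the single stored copy while analytics sees facts and dimensions.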
  34. REFERENT INFORMATION ARCHITECTURE: Delivery Layer and Recommendations
      – Set of processes, tools, activities and models:
         • Selection of the data subset to be delivered
         • Reorganization (format) of the data to be delivered
            – Aggregation, summing, counting, additional classification
            – Data transformation (Date, Time), slowly changing dimensions
         • Transform DWH structures to "Business Universe"
            – Facts, Measurements, Dimensions, Hierarchies, Business lang. abstraction
         • Granularity according to the End User needs
      – Recommendations:
         • Use multidimensional modeling
         • Model transformation/mappings between DWH and Data Mart(s)
         • Extend the multidimensional model with the required BI meta-data
         • Define the refresh rate on the basis of user needs (constraints)
         • Aggregations are difficult to update incrementally
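One way to see why aggregations complicate incremental updates: an average cannot be refreshed from its previous value alone, so the running sum and count have to be stored alongside it. The sketch below shows that idea under assumed data; the region names and amounts are invented, and the slide itself does not describe a specific technique.

```python
from collections import defaultdict


def apply_increment(aggregates, new_rows):
    """Fold a batch of (region, amount) rows into the stored sums and counts."""
    for region, amount in new_rows:
        entry = aggregates[region]
        entry["sum"] += amount
        entry["count"] += 1
    return aggregates


def average(aggregates, region):
    """Derive the average from the stored sum and count."""
    entry = aggregates[region]
    return entry["sum"] / entry["count"]


aggregates = defaultdict(lambda: {"sum": 0.0, "count": 0})

# First load, then a later increment (assumed sample data).
apply_increment(aggregates, [("North", 100.0), ("North", 50.0)])
apply_increment(aggregates, [("North", 30.0), ("South", 70.0)])

north_average = average(aggregates, "North")
```

Sums and counts fold in new rows cheaply, but measures such as minimum, maximum or distinct counts may need a rescan of the detail data when rows are corrected or deleted.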
  35. REFERENT INFORMATION ARCHITECTURE: MDM Layer and Recommendations
      – Data marts
         • DWH derivatives that provide the business community with answers to its questions and with strategic analysis
         • Tailored for a particular capability or function of the enterprise
         • Vertically organized and bound to one business function
         • Organized in a multidimensional structure
         • OLAP – On-line Analytical Processing – ROLAP/MOLAP
      – Recommendations:
         • Choose the proper storage technology (ROLAP/MOLAP)
            – Special storage may not be standard and may narrow your choices
            – Common storage may not be performant enough
         • Choose Virtual Marts as the foundation of Analytics
         • Use a separate V. Mart for each separate business concern
         • Adjust to BI meta-data requirements, use automated access
  36. INFORMATION ARCHITECTURE FOR DWH PROJECTS: QUESTIONS?
      RUAIRI PRENDIVILLE, SENIOR CONSULTANT, SYBASE (UK)
      JUNE 14TH, 2012, ISTANBUL
