Describing the Organisation Data Landscape

Outlines an Approach to Describing the Organisation Data Landscape to Assist with Data Transformation Analysis and Planning

The Data Landscape is a representation of the organisation’s data entities and their relationships, interfaces and data flows. Data entities are data asset components that perform data-related functions, from data storage to data transfer and data processing within the Data Landscape.

The objective of developing a Data Landscape model is to define an approach for formally and exactly defining the operation and use of data at a high level within the organisation and to plan for future changes. It allows the enterprise data fabric to be defined and modelled.

Creating a data landscape view is important as data underpins the operation of information technology solutions and business processes. Data breathes life into solutions as it flows through the organisation. The optimum and most cost-effective design of the data landscape is therefore important. Similarly, solutions that are developed or acquired and deployed on the data landscape

The nature of the organisation data landscape is changing as organisations are undergoing a data transformation.

  1. Describing the Organisation Data Landscape

     Outlines an Approach to Describing the Organisation Data Landscape to Assist with Data Transformation Analysis and Planning

     Alan McSweeney
     http://ie.linkedin.com/in/alanmcsweeney
  2. Contents

     Introduction, Purpose and Scope (page 5)
     What is the Organisation Data Landscape? (page 6)
     Data Landscape Definition Principles (page 9)
     Benefits and Uses of Data Landscape Approach (page 10)
     Data Landscape Concepts (page 10)
       Data Zones (page 11)
       Data Entity Types (page 14)
       Data Entity Relationships (page 22)
       Data Interfaces and Data Flows (page 24)
       Levels of Descriptive Detail (page 34)
     Data Landscape Data Model (page 35)
       Core Data Landscape Data Model (page 35)
       Extended Data Landscape Data Model (page 36)
       Data Entity Components and Functions (page 38)
       Data Entity Attributes (page 39)
       Data Security and Data Transformation (page 45)
       Data Entity Contents (page 47)
       Application Group or Service (page 49)
     Data Processes and Capabilities (page 53)
       Data Process Framework (page 53)
       Using a Data Process Framework to Assess the Health of the Data Landscape (page 59)
       Data Maturity Models (page 60)
     Business Functions and Business Processes (page 66)
     Data Landscape Model and Enterprise Data Model (page 67)
     Using the Data Landscape Model Approach – Some Sample Scenarios (page 72)
       Move Data Entities to the Cloud (page 75)
       Implement Cloud-based Data Analytics Capability (page 77)
       Move Test Environments to the Cloud (page 79)
       Outsource IT Infrastructure (page 81)
       Implement Backup and Recovery as a Service (page 83)
       Implement Disaster Recovery as a Service (page 85)
  3. List of Figures

     Figure 1 – Changing Organisation Data Landscape (page 7)
     Figure 2 – Information and Data Architecture in a Wider Information Technology Architecture Context (page 8)
     Figure 3 – Layered View of Data Landscape Data Zones (page 12)
     Figure 4 – Island View of Data Landscape Data Zones (page 12)
     Figure 5 – Additional Data Zones (page 13)
     Figure 6 – Data Entity Types in Data Zones (page 15)
     Figure 7 – Data Zone and Data Entity Type Example with Cloud and Outsourced Service Providers (page 21)
     Figure 8 – Sample Data Entity Relationships (page 23)
     Figure 9 – Direct Source Push Data Flow (page 25)
     Figure 10 – Direct Target Pull Data Flow (page 26)
     Figure 11 – Indirect Source Push Target Pull Data Flow (page 26)
     Figure 12 – Indirect Source Push Target Pull Data Flow (page 26)
     Figure 13 – Indirect Source Pull Target Push Data Flow (page 26)
     Figure 14 – Indirect Source Push Target Push Data Flow (page 27)
     Figure 15 – Transformation Source Pull Target Pull Data Flow (page 27)
     Figure 16 – Transformation Source Push Target Pull Data Flow (page 27)
     Figure 17 – Transformation Source Pull Target Push Data Flow (page 27)
     Figure 18 – Transformation Source Push Target Push Data Flow (page 28)
     Figure 19 – Transformation With Multiple Target Pushes (page 28)
     Figure 20 – Sample Data Interfaces and Data Flows for Data Entity Types (page 29)
     Figure 21 – Example of Single Extended Data Flow across a Number of Data Entities (page 31)
     Figure 22 – Splitting Sample Extended Data Flow into Two Separate Data Flows (page 32)
     Figure 23 – Simplified Representation of Data Flows (page 33)
     Figure 24 – Relationships Between Levels of Data Landscape Model Details (page 34)
     Figure 25 – Data Landscape Core Data Model (page 35)
     Figure 26 – Core and Extended Data Landscape Models (page 37)
     Figure 27 – Data Component Level of Detail Extension to Core Data Model (page 38)
     Figure 28 – Data Attribute Extensions to Core Data Model (page 41)
     Figure 29 – Data Entity Future Options (page 43)
     Figure 30 – Financial Impact View of Data Landscape (page 44)
     Figure 31 – Security Implications of Data Transformation (page 46)
     Figure 32 – Data Content Extensions to Core Data Model (page 48)
     Figure 33 – Application/Service Group Extensions to Core Data Model (page 50)
     Figure 34 – Application View of Data Entities (page 51)
     Figure 35 – Data Entities Shared Between Applications (page 51)
     Figure 36 – Application Environments (page 53)
     Figure 37 – Data Management Processes Views (page 55)
     Figure 38 – Connections Between Data Capability Areas (page 58)
     Figure 39 – Data Process Implementation, Operation and Use Aspects (page 59)
     Figure 40 – Data Capability Process Assessment Framework (page 60)
     Figure 41 – Generic Maturity Model Structure (page 61)
     Figure 42 – Improving Process Maturity (page 62)
     Figure 43 – Data Maturity Models (page 63)
     Figure 44 – Data Entity Data Capability Process Health View (page 65)
     Figure 45 – Business Processes and Interactions with Data Entities (page 66)
     Figure 46 – Structure of the Enterprise Data Model (EDM) (page 67)
     Figure 47 – Sample Subject Area Model (page 69)
     Figure 48 – Sample Data Landscape with Overlaid Subject Area Model Data (page 71)
     Figure 49 – Sample Organisation Data Landscape (page 74)
     Figure 50 – Data Entity Sample Representation (page 75)
     Figure 51 – Data Landscape: Move Data Entities to the Cloud (page 76)
  4. Figure 52 – Data Landscape: Implement Cloud-based Data Analytics Capability (page 78)
     Figure 53 – Data Landscape: Move Test Environments to the Cloud (page 80)
     Figure 54 – Data Landscape: Outsource IT Infrastructure (page 82)
     Figure 55 – Data Landscape: Implement Backup and Recovery as a Service (page 84)
     Figure 56 – Data Landscape: Implement Disaster Recovery as a Service (page 86)
  5. Introduction, Purpose and Scope

     The Data Landscape is a representation of the organisation’s data entities and their relationships, interfaces and data flows. Data entities are data asset components that perform data-related functions, from data storage to data transfer and data processing within the Data Landscape.

     Supporting the Data Landscape is a database structure that allows information to be stored, managed, extracted and reported on. This is described on page 49 onwards. This means that the Data Landscape is not a static representation of a current or desired future state. It is a dynamic model that can be updated and maintained. It can be used to assess change scenarios.

     The objective of developing a Data Landscape model is to define an approach for formally and exactly defining the operation and use of data at a high level within the organisation and to plan for future changes.

     The outputs from the data landscape model creation process are aimed at both technical and non-technical audiences. The model needs to be sufficiently flexible to include different levels of detail, from a high-level view of data entities and their relationships to detailed descriptive attributes on the contents and processing performed by data entities and their underlying technology components.

     This approach does not use a formal modelling language other than a relational model (as a form of database) for the constructs underlying the data landscape.

     The material contained here represents a set of conceptual and dynamic models designed to allow insights to be obtained on the design, construction, operation and use of the data landscape. As the data landscape model is itself data driven, it can be changed easily and quickly.

     This note contains the following sections:

     • What is the Organisation Data Landscape? on page 6 – this outlines the concept and objectives of the data landscape, lists the data-related drivers and the linkages to other information technology architecture practices.
     • Data Landscape Definition Principles on page 9 – this lists some principles to apply to the creation of the data landscape model.
     • Benefits and Uses of Data Landscape Approach on page 10 – this lists some of the benefits of using the data landscape approach.
     • Data Landscape Concepts on page 10 – this introduces the concepts that underpin the data landscape approach.
     • Data Landscape Data Model on page 35 – this describes the core and extended elements of the data landscape model.
     • Data Processes and Capabilities on page 53 – this describes data processes, capabilities and data life stages and their possible use to assess the health and status of the data landscape.
     • Business Functions and Business Processes on page 66 – this discusses an extension of the data landscape model to incorporate details on business processes associated with data processing.
  6. • Data Landscape Model and Enterprise Data Model on page 67 – this outlines an extension to the data landscape model to include elements of an Enterprise Data Model such as the Subject Area Model.
     • Using the Data Landscape Model Approach – Some Sample Scenarios on page 72 – this contains some examples of using the data landscape model for planning data-related changes.

     What is the Organisation Data Landscape?

     The organisation data landscape is a representation of the static data entities and the dynamic relationships and data flows across a wide view of the organisation, including external interacting data components and parties, both current and future.

     Creating a data landscape view is important as data underpins the operation of information technology solutions and business processes. Data breathes life into solutions as it flows through the organisation. The optimum and most cost-effective design of the data landscape is therefore important. Similarly, solutions that are developed or acquired and deployed on the data landscape

     The nature of the organisation data landscape is changing as organisations are undergoing a data transformation:

     • The data landscape has been broadened and there are more data entities that form part of the extended organisation data landscape as more applications are moved to cloud service providers and as cloud platforms are used to provide additional facilities not currently present in organisations, such as data analytics and machine learning
     • There is a wider range of data entities as the data landscape increases in complexity
     • There are more data entity types and data-related capabilities, especially in the areas of advanced data analytics
     • There are more data demands within the organisation, especially in the areas of analytics

     These developments co-exist with other more general data-related trends that include:

     • Greater volumes of operational data from increasing numbers of different sources and providers
     • Greater volumes of derived data
     • More data sources both internal and external to the organisation
     • Data in larger numbers of different formats
     • Data with a wider range of contents
     • Data being generated at different rates
     • Data being generated at different times
     • Data being generated with varying degrees of accuracy and reliability and with greater fuzziness
     • Data that changes constantly
     • Data that is of different utility and value
  7. Figure 1 – Changing Organisation Data Landscape

     The data landscape approach aims to understand and handle these complexities in order to enable the organisation to move from its current state to a target future state. It allows options to be explored and understood. It facilitates the planning and organisation required to achieve this change.

     The creation of an organisation data landscape is not an end in itself. It is constructed to add value to data architecture-related activities, provide insight, assist with resolving issues and help in planning data-related changes.

     The organisation’s data landscape and the work of the data architect in developing it evolve in line with other information technology architectural practices within the organisation. These can involve some or all of the following logical roles:

     • Enterprise Architecture that defines, develops, extends and manages the implementation and operation of the overall IT delivery and operation framework including standards and solution development and acquisition.
     • Solution Architecture that designs and oversees the implementation of a portfolio of IT solutions that translate business needs into operable and usable systems that comply with standards.
     • Application Architecture that defines application architectures including development, sourcing, deployment and integration.
  8. • Business Architecture that defines and manages the implementation of IT solutions and related organisation changes needed to implement business strategy and objectives.
     • Service Architecture that designs and oversees the implementation of service processes and supporting technologies and systems to ensure the successful operation of IT solutions, including the outsourced supplier management framework.
     • Security Architecture that designs data and system security processes and systems to ensure the security of information and systems across the entire IT landscape.
     • Technical Architecture that translates solution designs into technical delivery, acting as a bridge between solution architecture and the delivery function and designing new delivery approaches.
     • Infrastructure Architecture that designs application, communication and data infrastructures to operate the portfolio of IT solutions.

     The data architect does not work in isolation from these other architectural disciplines. While the data architect needs to focus on the core work of data architecture, that work should be part of the organisation’s wider overall IT architecture. The data architect needs to balance a narrow (and selfish) focus on pure data work with the broader needs of other IT architecture disciplines. The results of the data landscape model should be shared with other members of the wider IT architecture team.

     Figure 2 – Information and Data Architecture in a Wider Information Technology Architecture Context

     The data landscape is an integrated view of all data entities within and outside the organisation. It captures a larger and deeper view of data and the data technologies, processes and capabilities within and outside the organisation. This approach is designed as one tool to allow the data architect to perform the role of ensuring the success of current data operations while planning for the adoption of changes and new technologies.
  9. This data operations view captures key data entities, their relationships, data flows and the associated data capabilities and their supporting processes.

     The objective is not to represent all information technology components or applications, but just those that are explicitly related to the processing of data in its widest sense. Server infrastructure used to host data processing applications is not explicitly represented. Similarly, security infrastructure such as web proxies, firewalls, security appliances and user directories need not be shown unless doing so adds value to describing, understanding and planning the data landscape.

     Data Landscape Definition Principles

     The data landscape model creation process must be governed by a number of principles:

     • Less is More – create a model that is just detailed and complex enough to allow results to be generated. The more detail that is added to the model, the greater its complexity becomes. Usability and the ability to interpret the model to generate insight and value are reduced. While the amount of detail that constitutes too much is undefined and subjective, the model should nonetheless be kept simple. Too much detail, especially at the early stage, will kill the data landscape creation process.
     • Self-Descriptive – the data landscape model should be as self-descriptive as possible. The model should be easy to understand and require as little additional knowledge as possible.
     • Simplicity of Representation – the meaning should be immediately obvious. Time spent explaining how to interpret the model is a waste of time that could be spent on using the model for planning and decision-making. Information can be filtered to control complexity.
     • Consistency – the information representation approach should be consistent across all presentation instances and types.
     • Utility – one measure of the usability of the model is that it is useful and is used. One objective of modelling is to aid understanding, insight, planning and problem determination and resolution.
     • Results-Focussed – the model is not an end in itself but a means to an end. Too much analysis is implicitly inward- and backward-looking. The model should be forward-looking, assisting with the resolution of problems and with planning and defining the future data landscape. Too much time and effort spent on gathering detail that is not useful or relevant means that less will be available to devote to value-adding activities. Documenting the existing data entity landscape can be useful to determine the gaps that must be filled.
     • Relevance – the model should only contain what is relevant. Irrelevant detail should not be added.

     These principles are inexact but they should nonetheless be considered when creating any data landscape model.
  10. Benefits and Uses of Data Landscape Approach

      The landscape is only as useful as the information it contains, the accuracy and currency of that information and the ability to present and use it.

      The level of detail that is gathered about the data landscape governs the type of detailed processing and analysis that can be performed. More information means more maintenance of that information. The usefulness and usage will be reduced if the information is not current. Information should be collected at a high level initially. More detail can be added later. Only the information that is needed to add value should be collected and input into the model.

      The objective is to describe the present in order to map, plan and understand the future, identify gaps, consider options and optimise future configurations.

      Creating the data landscape view requires an audit of the existing data entities and their static and dynamic data relationships and flows. It can be a once-off or a continuous engagement: once-off to assist with specific planning activities or continuous to allow the state of the data landscape to be constantly reviewed.

      The data landscape is a representation of the way in which the organisation currently generates, receives, processes, uses and disseminates data, and of how it would like to do so in the future.

      The approach will allow changes to be planned and their requirements and impacts to be understood. It will allow data selection, design and deployment options to be explored, opportunities to be assessed, their scope understood and their impacts identified, and data architecture and technology alternatives to be explored.

      It can be used to assist with understanding, mapping and planning an organisation’s data transformation and to assist in moving to a more data operations-oriented organisation. It can be used to identify opportunities for improvement, simplification and automation. Gaps and missing data capabilities and facilities can be identified.

      Data Landscape Concepts

      The data landscape model incorporates a number of concepts:

      • Data Zones – these are groupings of physically closely located data entities. The data zones represent major clusters of, or containers for, data entities that are physically and/or logically close to one another. Data zones do not represent objects that perform processing. The data entities located within data zones perform the data-related activities.
      • Data Entity Types – these are types of data source, storage, transformation, processing or transfer components. Data entities perform data-related work across the spectrum of actions and events. Essentially, a data entity is a hardware or software technology component involved in any form of data processing. Data entity types are associated with data zones.
      • Data Entities – these are data assets that are involved in the storage, processing and transfer of organisation data, in the widest sense. Data entities have a type and are located in Data Zones.
      • Data Entity Relationships – these are connections or associations between data entities. These relationships can be loosely or exactly defined.
  11. • Data Interfaces and Data Flows – a data interface is a specific way a data entity can provide or accept data. A data flow is a link between two (or more) data entity interfaces where there are at least two endpoints: a source and a target.
      • Levels of Descriptive Detail – these define types and amounts of information to be provided, ranging from initial foundational information to details on individual data elements within a data entity.
      • Data Entity Type Attributes – these are entity-specific attributes that contain descriptive information.
      • Data Capabilities and Processes – these are sets of activities commonly and repeatedly performed to generate specific results within the context of the data landscape.

      Data Zones

      The data landscape model contains a number of data zones, such as:

      • Insecure External Organisation Presentation And Access – this represents a location where publicly accessible data entities reside. These entities are regarded as insecure and/or untrusted.
      • Secure External Organisation Participation and Collaboration – this is a location outside the physical organisation boundary where data entities that are provided by or to trusted external parties reside.
      • Secure External Organisation Access – this zone contains data entities that enable secure access from outside the organisation.
      • Organisation – this data zone represents the entire organisation and it contains all the locations and business units or functions within the organisation.
      • Central Data Infrastructure – this contains the central data applications and their associated data.
      • Business Unit/Location Data Infrastructure – this is an individual organisation business unit or location and the data entities it contains.

      These are shown in the following diagram.
  12. Figure 3 – Layered View of Data Landscape Data Zones

      In this diagram, higher-level data zones are shown as encompassing and surrounding lower-level ones. This is just one possible representation of the logical layering of these data zones, from central data infrastructure to wider zones that ring the organisation.

      The data zones could also be represented as islands in the following view:

      Figure 4 – Island View of Data Landscape Data Zones
  13. The data landscape model can be extended to include further data zones if necessary. The following diagram shows additional data zones explicitly represented.

      Figure 5 – Additional Data Zones

      These additional data zones are overt representations of locations for organisation data entities located outside the core organisation but effectively part of a stretched data landscape. These further data zones are:

      • Co-Located Data Infrastructure – this represents organisation data entities that are logically part of the organisation but physically located within a co-located facility.
      • Outsourced Service Provider Data Infrastructure – this represents data entities that are used by the organisation but are provided by, or use technology infrastructure provided by, an outsourcing service provider.
      • Cloud Service Provider Data Infrastructure – this represents organisation data entities of various types (applications and their associated data stores, data technology infrastructure and data entities implemented on it, or platforms used to create data applications and store their data) provided by cloud service providers.

      There could be several of each of these zones, one for each service provider the organisation uses.

      In the diagram above, the data entities that are represented in the Central Data Infrastructure zone level can exist at the Organisation zone. Central organisation data entities can also be located in an Organisation Location/Unit zone defined as a container for that purpose.
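The concepts introduced so far (data zones that contain data entities, entities that have a type, and data flows with a source and a target) can be illustrated with a minimal Python sketch. This is not the data landscape data model described later in the note; the class names, the `parent` field used for zone layering, the two helper functions and the sample "Customer Database" entity with its "Operational Data Store" type are all assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DataZone:
    """A container for data entities; zones themselves perform no processing."""
    name: str
    parent: Optional["DataZone"] = None  # enclosing zone in a layered view (assumption)

    def is_within(self, other: "DataZone") -> bool:
        """Return True if this zone is `other` or is nested inside it."""
        zone: Optional[DataZone] = self
        while zone is not None:
            if zone is other:
                return True
            zone = zone.parent
        return False


@dataclass
class DataEntity:
    """A data asset with a type, located in a data zone."""
    name: str
    entity_type: str
    zone: DataZone


@dataclass
class DataFlow:
    """A link with at least two endpoints: a source and a target."""
    source: DataEntity
    target: DataEntity


def entities_in_zone(entities: List[DataEntity], zone: DataZone) -> List[DataEntity]:
    """List entities located in `zone` or in any zone nested inside it."""
    return [e for e in entities if e.zone.is_within(zone)]


def flows_crossing_zone(flows: List[DataFlow], zone: DataZone) -> List[DataFlow]:
    """List flows with exactly one endpoint inside `zone`."""
    return [
        f for f in flows
        if f.source.zone.is_within(zone) != f.target.zone.is_within(zone)
    ]


# Sample landscape: zone names come from the text, the entities are hypothetical.
organisation = DataZone("Organisation")
central = DataZone("Central Data Infrastructure", parent=organisation)
public = DataZone("Insecure External Organisation Presentation And Access")

web_site = DataEntity("Public Web Site", "Public Web Site", public)
customer_db = DataEntity("Customer Database", "Operational Data Store", central)

entities = [web_site, customer_db]
flows = [DataFlow(source=web_site, target=customer_db)]

inside = [e.name for e in entities_in_zone(entities, organisation)]
crossing = [(f.source.name, f.target.name) for f in flows_crossing_zone(flows, organisation)]
print(inside)
print(crossing)
```

Because the landscape is held as data, a question such as "which data flows cross the organisation boundary?" becomes a simple filter. The same structure could equally be held in relational tables, which is the form the note says underlies the data landscape.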
  14. 14. Describing the Organisation Data Landscape Page 14
Data Entity Types
These represent types of data entities that can reside in data zones. Data entities are assigned a type. The following diagram shows one view of a possible list of data entity types located in the concentric data zone view shown in Figure 3 on page 12.
  15. 15. Describing the Organisation Data Landscape Page 15 Figure 6 – Data Entity Types in Data Zones
  16. 16. Describing the Organisation Data Landscape Page 16 This diagram shows the data entity types using the concentric data zone view shown in Figure 3 on page 12. This list is neither complete nor definitive. There are many ways of representing data entity types, of which this is one. The objective here is to have a consistent way of representing key data entities. Once this is done, the relationships, interactions and data flows between data entities can be specified. There will be many actual instances of each of these data entity types. The initial set of data entity types in this view is:
Data Zone / Data Entity Type / Data Entity Type Description
Data Zone: Insecure External Organisation Presentation And Access
1 External Public Mail – The organisation may use the services of external mail providers, either formally or informally (through some form of shadow IT).
2 Public Web Site – The organisation will have a public web site that will, at the very least, contain static information about the organisation. Content may be managed and changes and updates published using a content management solution. It may contain extracts of information contained in operational systems. It can contain links to applications that can process data. The web site may also accept interactions, either one-way or two-way, such as commercial transactions. The organisation may collect web site usage information to understand how users are interacting with it.
3 Social Media Platforms – The organisation may utilise some of the many social media platforms to perform functions such as presenting data to the public, interacting and transacting with the public and customers and sharing data with customers, employees and partners. These interactions and transactions may require access to internal operational systems. Content may be managed and changes and updates published using a content management solution provided by the social media platform. These platforms may also be a source of interaction metadata that the organisation may access and process.
4 Public Mobile Apps – The organisation may publish apps that allow the public, customers, suppliers and other parties to view information and to interact and transact with it. These interactions and transactions may require access to internal operational systems. Content may be managed and changes and updates published using a content management solution. The apps will need to be updated and these updates will need to be pushed to the application platform and from there to user devices. These platforms may also be a source of interaction metadata that the organisation may access and process.
5 External Organisation Public Data Sources – These represent public and insecure sources of data, structured or unstructured, either being supplied to the organisation (PUSH) or available to the organisation to be retrieved (PULL).
6 External Organisation Public Data Targets – These represent public and insecure targets to which the organisation supplies or transmits data, structured or unstructured, either being supplied to the target (PUSH) or available to the target to be retrieved (PULL).
7 External Organisation Public Data Sharing – The organisation may share data with third parties or among its own personnel using facilities provided by public data sharing platforms.
8 Public Data Stores – The organisation may store data on public data storage platforms.
  17. 17. Describing the Organisation Data Landscape Page 17
Data Zone / Data Entity Type / Data Entity Type Description
9 Public External Data Devices – The organisation may send data to or receive data from externally located public (not secured) devices.
Data Zone: Secure External Organisation Participation and Collaboration
10 External Private Mail – These represent private mail services hosted securely outside the organisation.
11 External Data Sources – These represent private and secure sources of data, structured or unstructured, either being supplied to the organisation (PUSH) or available to the organisation to be retrieved (PULL).
12 External Data Targets – These represent private and secure targets to which the organisation supplies or transmits data, structured or unstructured, either being supplied to the target (PUSH) or available to the target to be retrieved (PULL).
13 External Secure Data Sharing – These represent private and secure data sharing facilities.
14 External Secure Data Stores – These represent private and secure data storage facilities.
15 Collaboration Solutions – The organisation may collaborate internally and with external parties securely using the services of collaboration solutions, either developed or acquired or hosted within the organisation or externally.
16 Transaction Partners – The organisation may transact with or use the services of providers of transaction services. These can include suppliers such as payment service providers, information service providers, process outsourcing or supply chain service providers. These transactions will involve the exchange of data.
17 Private External Data Devices – The organisation may send data to or receive data from externally located private secure devices.
Data Zone: Secure External Organisation Access
18 Secure External Access to Data – This represents the facility through which the external secure data entities are accessed and along which data passes.
19 External Data Communications – This represents a channel through which the external data entities are accessed and along which data passes.
20 Edge Devices – These receive data from other data entities such as Private External Data Devices and Public External Data Devices and optionally process, concentrate or aggregate it before transmitting that data onwards.
21 Hosted Data Infrastructure – This represents hosted (cloud-based) data infrastructure. This is a high-level representation of what could be a set of data entities that would be expanded using a structure such as shown in Figure 7 on page 21 where the constituent data entity types are explicitly shown in a separate explicitly represented hosted provider data zone.
22 Externally Co-Located Data Infrastructure – This represents data infrastructure located in a co-location facility. This is a high-level representation of what could be a set of data entities that would be expanded using a structure such as shown in Figure 7 on page 21 where the constituent data entity types are explicitly shown in a separate explicitly represented co-location provider data zone.
23 Externally Co-Located Outsourced Data Infrastructure – This is a placeholder for data infrastructure that the organisation has chosen to place in a co-location facility. This can be expanded to include more details on the data entities using a structure such as shown in Figure 7 on page 21 where the constituent data entity types are explicitly shown in a separate explicitly represented outsourcing provider data zone.
  18. 18. Describing the Organisation Data Landscape Page 18
Data Zone / Data Entity Type / Data Entity Type Description
24 Data Input to Externally Hosted Data Processing Applications – This represents data manually entered into applications that are hosted externally. Such data inputs can be explicitly represented or left implicit.
25 Externally Hosted Data Processing Applications – This is a placeholder for applications that the organisation uses that are hosted externally by service providers. This can be expanded to include more details on the data entities using a structure such as shown in Figure 7 on page 21.
26 Externally Hosted Data Processing Applications Data Stores – This is a placeholder for an explicit representation of the data stores used by applications that the organisation uses and that are hosted externally by service providers. Such data stores can be left implicit. Given that many service providers have a charging model that includes a data storage/data usage component, their explicit representation allows this to be expressed. This can be expanded to include more details on the data entities using a structure such as shown in Figure 7 on page 21.
27 Backup and Recovery as a Service – This is a placeholder for an explicit representation of the specific externally-provided service for backup and recovery of data entities that are located on the organisation’s premises and other locations.
28 Disaster Recovery as a Service – This is a placeholder for an explicit representation of the specific externally-provided service for disaster recovery of data entities that are located on the organisation’s premises and other locations.
29 External Data Analytics Services – This is a placeholder for any applications that the organisation uses that perform data reporting and analysis functions and that are hosted externally by service providers. This can be expanded to include more details on the data entities using a structure such as shown in Figure 7 on page 21.
Data Zone: Organisation
30 External Secure Data Communications – This denotes the data communications links required to allow data entities that are co-located or hosted externally to be accessed from within the organisation and for data to be transferred between the data zones.
31 Inter Site/Unit Data Management – This represents the possible set of management procedures and tools to enable data access and movement between organisation locations that are physically separated from each other.
32 Inter Site/Unit Data Communications – This signifies the data communications links between organisation locations that are physically separated from each other and other data communications-related facilities (such as WAN data compression).
33 Inter Site/Unit Data Replication – This represents the capability to replicate some or all data entities to a second site. This could be represented separately or be a separate instance of the data entity type Inter Site/Unit Data Management.
Data Zone: Central Data Infrastructure
34 Document Sharing and Collaboration – This denotes a type of data application that is used to share and allow collaboration on documents.
35 External Content Data Management and Publication – This denotes a type of data entity that provides facilities to create, manage and publish content to externally-facing data entities such as Public Web Site or Public Mobile Apps. Organisations may take a COPE (Create Once Publish Everywhere) approach to the management of content and its publication across multiple channels and platforms.
  19. 19. Describing the Organisation Data Landscape Page 19
Data Zone / Data Entity Type / Data Entity Type Description
36 Data Input to Data Processing Business Applications – This denotes data manually entered into on-premises business applications. Such data inputs can be explicitly represented or left implicit. Explicit representations are useful in that they indicate problems with such data entry such as poor data quality checks in the associated business applications.
37 External Data Receipt And Access Control – This signifies data entities that are the target for data being received from external entities. Data resides here before it is transmitted to or retrieved by its processing entities.
38 External Data Transmission And Access Control – This signifies data entities that are the target for data being sent to external entities. Data resides here before it is transmitted to or retrieved by its target external entities.
39 Document and Record Management Systems – These represent a type of data entity that provides document and records management facilities.
40 Internal Email Solution – This represents an on-premises email application.
41 Data Processing Business Applications – This signifies a type of data entity that performs line of business data processing.
42 Integration/Service Bus – This represents a type of data entity that enables communication between mutually interacting data entities in a service oriented model.
43 External Data Synchronisation – This denotes a type of entity that synchronises data held or shared across data zones.
44 Semantic Data – This signifies a type of data entity that stores information on and enables the processing of the meaning and structure of data held in other data entities.
45 Master Data – This denotes a type of data entity that stores master data, performs master data management functions, or both.
46 Data Processing Application Operational Data Stores – This denotes data entities that represent data stores used by line of business data entities.
47 Audit Log and Usage and Performance Data – This refers to a type of data entity that holds logs, audit and usage data on other data entities.
48 Data Storage Infrastructure – This signifies the range of physical data storage infrastructure used by other data entities.
49 Metadata – This represents a type of data entity that stores metadata, performs metadata management functions, or both.
50 Reference Data – This represents a type of data entity that stores reference data vocabularies, performs reference data management functions, or both.
51 ETL – This denotes a data entity type that extracts data from one or more source systems and other data sources, operates on the data and then loads the transformed data into a target.
52 Data Stores – This represents a generic data storage data entity.
53 Data Service Management Tools – This data entity provides facilities to implement and operate service management processes that relate to data operations.
  20. 20. Describing the Organisation Data Landscape Page 20
Data Zone / Data Entity Type / Data Entity Type Description
54 Unstructured Data Stores/File Shares – These data entities enable unstructured document-oriented data to be stored and shared.
55 Data Warehouse – This refers to an integrated data store that takes data from many operational systems and other sources, cleans, transforms, normalises and standardises it, adds a time dimension and that is used for reporting and analysis.
56 Data Reporting – This denotes a data entity type that provides a range of tools and facilities that enable data to be reported on, visualised and explored.
57 Data Management Tools – This refers to a type of data entity that provides tools and facilities to perform data management and housekeeping functions such as backup and recovery and replication.
58 External Data Analytics Co-ordination and Management – This denotes a data entity type that allows data analytics functions and work requests to be allocated to external analytics tools and platforms and that co-ordinates the distribution of work and data and that collects and assembles the results.
59 Document Creation – This denotes data entities that are sources of new documents and changes to existing documents.
60 Data Marts – This represents a subset of data warehouse data aimed at presenting a specific subject-oriented set of data for reporting and analysis.
61 Data Analytics – This refers to a data entity type that provides a range of tools and facilities that enable the discovery and interpretation of patterns in data and that provides capabilities such as data modelling.
62 Data Mining – This represents a data entity type that provides a range of tools and facilities that apply statistical techniques to transform data into knowledge and extract meaning from data.
63 Data Entity Infrastructure – This signifies the range of physical infrastructure (such as processing, network and others) used by other data entities.
The data entity types that are in the Central Data Infrastructure data zone could also be located in other data zones such as:
• Co-Located Data Infrastructure
• Outsourced Service Provider Data Infrastructure
• Cloud Service Provider Data Infrastructure
These are data entity types and not actual data entities. The scenario analysis section uses a simple data landscape with data entities of some of these types shown on page 72. The following diagram shows some of the placeholder data entity types such as Hosted Data Infrastructure, Externally Co-Located Data Infrastructure and Externally Co-Located Outsourced Data Infrastructure being replaced by explicit references to their constituent data entity types.
  21. 21. Describing the Organisation Data Landscape Page 21 Figure 7 – Data Zone and Data Entity Type Example with Cloud and Outsourced Service Providers
  22. 22. Describing the Organisation Data Landscape Page 22 These data entity types are generic and independent of any specific technology or set of facilities they provide other than that which is implied by their type. For example, a finance and accounting or HR solution will have a type of Data Processing Business Applications. They can be located in a data zone such as Central Data Infrastructure or Secure External Organisation Access.
Data Entity Relationships
Entities can have (many) relationships with other entities. These can be one-way – from a source to a target entity – or two-way – between two entities. Relationships can be expressed in active or passive terms: A acts on B or B is acted upon by A. Relationships are not necessarily definitive. They can be used to denote informal associations. There can be many different types of entity relationships. These relationships can be characterised in different ways.
Relationship Type – Description
Uses – Entity A uses the facilities provided by Entity B
Updates – Entity A updates data stored or managed by Entity B
Creates – Entity A creates data that is stored or managed by Entity B
Processes – Entity A processes data that is supplied by Entity B
Transfers From – Entity A transfers data from Entity B to Entity C
Transfers To – Entity A transfers data to Entity B from Entity C
Manages – Entity A manages Entity B
Administers – Entity A administers Entity B
Transforms – Entity A performs transformation actions on data from Entity B
Stores – Entity A stores data in Entity B
Reads From – Entity A reads data from Entity B
Writes To – Entity A writes data to Entity B
Publishes To – Entity A publishes data to Entity B
Replicates Data To – Entity A replicates data to Entity B
Aggregates Data From – Entity A aggregates data from Entity B
Collects Information On – Entity A collects audit, usage and performance data on Entity B
Backs Up – Entity A backs up data on Entity B
Recovers – Entity A recovers data for Entity B
Loads – Entity A loads data from Entity B
Entity relationships are intended to represent connections between entities. Changes in those entities – movement to a different zone as a result of movement to a cloud service provider or an outsourcing arrangement, new entities added, entities aggregated or split, new functionality added – impact the relationships. Understanding the entity relationships means the impact of entity changes can be understood. The following diagram shows some of the possible relationships between entity types.
  23. 23. Describing the Organisation Data Landscape Page 23 Figure 8 – Sample Data Entity Relationships
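The relationship types listed earlier lend themselves to a simple record structure. The following is a minimal Python sketch, not part of the original approach, of how relationships might be recorded, filtered by type, used to find what a change to one entity touches, and checked for entities with no relationships at all. The class, the function names and the sample landscape are all hypothetical illustrations.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Relationship:
    source: str    # "Entity A" in the relationship type descriptions
    rel_type: str  # one of the relationship types, e.g. "Reads From"
    target: str    # "Entity B" in the relationship type descriptions


# Hypothetical sample landscape; the names reuse data entity types from the text.
ENTITIES = [
    "ETL",
    "Data Processing Application Operational Data Stores",
    "Data Warehouse",
    "Data Marts",
    "Data Reporting",
    "Unstructured Data Stores/File Shares",
]

RELATIONSHIPS = [
    Relationship("ETL", "Reads From", "Data Processing Application Operational Data Stores"),
    Relationship("ETL", "Writes To", "Data Warehouse"),
    Relationship("Data Marts", "Aggregates Data From", "Data Warehouse"),
    Relationship("Data Reporting", "Reads From", "Data Marts"),
    Relationship("Data Reporting", "Reads From", "Data Warehouse"),
]


def filter_by_type(relationships, rel_type):
    """Return only the relationships of one type."""
    return [r for r in relationships if r.rel_type == rel_type]


def impacted_by(relationships, entity):
    """Return every relationship that has the entity at either end."""
    return [r for r in relationships if entity in (r.source, r.target)]


def unrelated_entities(entities, relationships):
    """Return entities that take part in no relationship."""
    related = {r.source for r in relationships} | {r.target for r in relationships}
    return [e for e in entities if e not in related]
```

In this sample, `impacted_by(RELATIONSHIPS, "Data Warehouse")` returns the three relationships that would need review if the warehouse moved to another data zone, and `unrelated_entities(ENTITIES, RELATIONSHIPS)` flags the file share entity as a candidate for further information gathering.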
  24. 24. Describing the Organisation Data Landscape Page 24 Entity relationships can be defined at varying levels of detail and complexity. The amount of definition needs to be directly related to the benefit that will be derived. Entity relationships allow the likely consequences of data landscape changes to be identified. This diagram violates the design principles listed on page 9 because its level of detail confuses rather than adds insight. Such a diagram obscures rather than enlightens. However, once the data entity relationship information has been entered into the data landscape, it can be filtered to show specific relationship types or just a subset of relationships. The absence of defined relationships between entities can be used to identify potential problems such as underused or redundant entities and the absence of information that needs to be collected.
Data Interfaces and Data Flows
A data interface is a point on a data entity through which it can accept or provide data. Interfaces can be PUSH – where the source data provider entity pushes the data to the target – or PULL – where the target data entity pulls the data from the source. Data interfaces can have properties such as:
• Parameters supported that affect the nature of the data being sent or received
• The data transmission or receipt protocols supported or used
• The type of security, authentication and encryption used
• The data formats supported or required
• Restrictions on data volumes
A data flow is a path from a data source and its associated data interface to a data target and its associated interface. So a data flow involves two (or more) interfaces. Data flows can be direct – from the source data entity interface to the target data entity interface – or indirect – by way of an interim data entity (such as an (S)FTP server, service bus or data storage location acting as a mailbox).
The data flows involved in an ETL tool moving data from one data entity to another could be viewed as two data flows or one. Data flows can also involve a transformation, where the source data is modified before it reaches the target. At a very high level, based on the combinations of these options, there can be ten major types of data flow:
Data Flow – Description
Direct Source Push – The source data entity pushes the data to the target entity.
Direct Target Pull – The target data entity pulls data from the source entity.
Indirect Source Pull Target Pull – The source data entity handles a pull request from an interim data entity that then provides the data in response to a pull request from the target.
Indirect Source Push Target Pull – The source data interface pushes the data to an interim location where it remains until the target data interface retrieves it.
  25. 25. Describing the Organisation Data Landscape Page 25
Data Flow – Description
Indirect Source Pull Target Push – Data is pulled from the source data interface by the interim data entity. The data is then pushed to the target.
Indirect Source Push Target Push – The source data interface pushes the data to an interim location where it is then pushed to the target data interface.
Transformation Source Pull Target Pull – The source data entity handles a pull request from a transformation data entity that then provides the transformed data in response to a pull request from the target.
Transformation Source Push Target Pull – The source data interface pushes the data to a transformation data entity location where it remains until the target data interface retrieves it.
Transformation Source Pull Target Push – Data is pulled from the source data interface by the transformation data entity. The transformed data is then pushed to the target.
Transformation Source Push Target Push – The source data interface pushes the data to a transformation data entity where the transformed data is then pushed to the target data interface.
These types of data flows are concerned with the transfer of data. They exclude details on the handshaking required to initiate the data flow such as authentication and generation and use of temporary session keys. Data flows can have other properties such as:
• Data transfer type such as file transfer, message, API or other
• Data format
• Scheduled or unscheduled
• Triggers or events
• Frequency if scheduled
• The type of transformation(s) performed, if any
• Protocol used
The following diagram represents a Direct Source Push data flow.
Figure 9 – Direct Source Push Data Flow
The following diagram represents a Direct Target Pull data flow.
  26. 26. Describing the Organisation Data Landscape Page 26
Figure 10 – Direct Target Pull Data Flow
The following diagram represents an Indirect Source Pull Target Pull data flow.
Figure 11 – Indirect Source Pull Target Pull Data Flow
The following diagram represents an Indirect Source Push Target Pull data flow.
Figure 12 – Indirect Source Push Target Pull Data Flow
The following diagram represents an Indirect Source Pull Target Push data flow.
Figure 13 – Indirect Source Pull Target Push Data Flow
The following diagram represents an Indirect Source Push Target Push data flow.
  27. 27. Describing the Organisation Data Landscape Page 27 Figure 14 – Indirect Source Push Target Push Data Flow The following diagram represents a Transformation Source Pull Target Pull data flow. Figure 15 – Transformation Source Pull Target Pull Data Flow The following diagram represents a Transformation Source Push Target Pull data flow. Figure 16 – Transformation Source Push Target Pull Data Flow The following diagram represents a Transformation Source Pull Target Push data flow. Figure 17 – Transformation Source Pull Target Push Data Flow
  28. 28. Describing the Organisation Data Landscape Page 28 The following diagram represents a Transformation Source Push Target Push data flow. Figure 18 – Transformation Source Push Target Push Data Flow These types of data flows have a single start and single end point in a single data entity. Data flows can be more complex with, for example, data being sent to multiple targets. Figure 19 – Transformation With Multiple Target Pushes Transformations can consist of multiple data processing steps. For the purposes of documenting the data landscape, this additional information increases complexity while not necessarily adding value in terms of understanding the existing landscape and planning for data transformation changes. The following diagram shows a number of possible data flows across a number of interfaces for a subset of the data entity types shown on page 15.
  29. 29. Describing the Organisation Data Landscape Page 29 Figure 20 – Sample Data Interfaces and Data Flows for Data Entity Types
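The count of ten major data flow types described earlier follows from the available options: a direct flow is either a source push or a target pull (two types), while an indirect or transformation flow combines a source-side mode with a target-side mode (two times two, for each of the two path kinds). The short Python sketch below enumerates them; the function name and constants are illustrative assumptions, and only the resulting type names come from the text.

```python
from itertools import product

# Source-side and target-side modes for flows that pass through an
# interim or transformation data entity.
SOURCE_MODES = ("Source Pull", "Source Push")
TARGET_MODES = ("Target Pull", "Target Push")


def data_flow_types():
    """Enumerate the major data flow types.

    Direct flows have a single hop, so only one side initiates the transfer.
    Indirect and Transformation flows have two hops, so each side of the
    interim entity can independently be a push or a pull.
    """
    types = ["Direct Source Push", "Direct Target Pull"]
    for path in ("Indirect", "Transformation"):
        for source_mode, target_mode in product(SOURCE_MODES, TARGET_MODES):
            types.append(f"{path} {source_mode} {target_mode}")
    return types
```

Calling `data_flow_types()` yields 2 + 4 + 4 = 10 names, matching the rows of the data flow table.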
  30. 30. Describing the Organisation Data Landscape Page 30 The next diagram shows an example of a single extended data flow extracted from the previous diagram. The example relates to the flow of data from its generation by external devices to its analysis, across a number of interfaces and spanning a number of data zones. In this example, there are 12 data entity types and their interfaces involved in the extended data flow:
1. Public External Data Devices – these collect or provide measurement or telemetry data. The collected data is pushed to a data concentrator.
2. Edge Device – this acts as a data concentrator, receiving data from multiple external data sources such as meters or telemetry units. The data is then pushed to a data access data entity type.
3. External Data Receipt And Access Control – this is a generic data entity type that represents the entry portal for incoming data. The edge device pushes aggregated edge device data to this.
4. Integration/Service Bus – this represents a data entity type that implements or provides service oriented data integration facilities.
5. Data Processing Business Applications – this denotes data entity types that represent the business applications that receive the data from the Integration/Service Bus data entity and process it.
6. ETL – the ETL data entity type may be involved in the extended data flow in a number of ways:
• It can receive data from the Integration/Service Bus and pass it to the Data Processing Application Operational Data Stores data entity type that represents the data stores of the Data Processing Business Applications data entity type.
• It can extract data from the Data Processing Application Operational Data Stores and move it to the Data Warehouse and Data Marts data entity types, after transformation to convert operational data into the subject-oriented data format with a time dimension that these entity types typically require.
7. Data Processing Application Operational Data Stores – these data entities are the functional (rather than infrastructural) data storage component of the corresponding Data Processing Business Applications entity types. The ETL data entity type pulls data from the stored data, transforms it and loads it into the Data Warehouse and Data Mart entity types.
8. Data Warehouse – this represents the data entity type that holds long-term data from operational systems.
9. Data Marts – this signifies data entity types that contain specific subsets of transformed operational data used for specific reporting and analysis purposes.
10. Data Analytics – this denotes a data analytics data entity.
11. External Data Analytics Co-ordination and Management – this represents a data entity that manages the allocation of data analytics requests to external (cloud-based) data analytics services and the retrieval of results.
12. External Data Analytics Services – these data entities provide external (cloud-based) data analytics facilities.
  31. 31. Describing the Organisation Data Landscape Page 31 Figure 21 – Example of Single Extended Data Flow across a Number of Data Entities
  32. 32. Describing the Organisation Data Landscape Page 32 This single extended data flow can (and really should) be broken down into a number of specific data flows using data entity interfaces that exist and are used for each distinct purpose. The following diagram shows this sample extended data flow divided into three separate data flows:
Figure 22 – Splitting Sample Extended Data Flow into Three Separate Data Flows
The three data flows represent separate elements of work:
• Data Flow 1 – the collection of data from external data sensors
• Data Flow 2 – the population of data stores of different types with sensor data
  33. 33. Describing the Organisation Data Landscape Page 33
• Data Flow 3 – the analysis of sensor data
The level of detail and the amount of process decomposition applied to a data flow depends on factors such as:
• The amount of detail to be represented and the value to be derived from that detail
• The need to include details on the data handoffs between each interface and to describe any data transformation that occurs
• The value and utility that can be obtained from the level of detail
The following diagram shows a simplified representation of these previous sample data flows. In this case, just the main data entities and their interfaces involved in the data flows are shown.
Figure 23 – Simplified Representation of Data Flows
This version contains a reduced amount of detail when compared with the previous more detailed illustration.
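The division of the extended data flow into three separate flows can be sketched as splitting an ordered list of data entity types at chosen hand-off points. The Python below is an illustrative sketch only. It treats the twelve entity types as a simple linear sequence, which the text does not claim (ETL, for example, participates in more than one way), and the split points are assumptions inferred from the descriptions of Data Flows 1 to 3 rather than read from the figure.

```python
# The twelve data entity types of the extended data flow, in the order listed
# in the text. Treating them as a strict linear chain is a simplification.
EXTENDED_FLOW = [
    "Public External Data Devices",
    "Edge Device",
    "External Data Receipt And Access Control",
    "Integration/Service Bus",
    "Data Processing Business Applications",
    "ETL",
    "Data Processing Application Operational Data Stores",
    "Data Warehouse",
    "Data Marts",
    "Data Analytics",
    "External Data Analytics Co-ordination and Management",
    "External Data Analytics Services",
]

# Assumed last entity of each separate flow (hypothetical split points).
FLOW_END_POINTS = {
    "Data Flow 1": "External Data Receipt And Access Control",  # collection
    "Data Flow 2": "Data Marts",                                # population
    "Data Flow 3": "External Data Analytics Services",          # analysis
}


def split_flow(entities, end_points):
    """Split an ordered entity list into named flows.

    The entity that ends one flow also starts the next one, because it is the
    hand-off point between the two pieces of work.
    """
    flows = {}
    start = 0
    for name, last_entity in end_points.items():
        end = entities.index(last_entity)
        flows[name] = entities[start:end + 1]
        start = end
    return flows
```

With these assumed split points, `split_flow(EXTENDED_FLOW, FLOW_END_POINTS)` gives three flows of three, seven and four entities, and the eleven hops of the original chain are each assigned to exactly one flow.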
  34. 34. Describing the Organisation Data Landscape Page 34
Levels of Descriptive Detail
The data landscape model could be used to hold information at different levels of detail:
• Level 1 – Data zone and entities, their types, relationships and interfaces. This is the foundational definition of the data landscape. It identifies the major data entities in each data zone.
• Level 2 – Additional details about the data entities, their constituent components, attributes and characteristics. This includes platform details, technologies used including versions and products used including versions.
• Level 3 – Assessment of the capability and maturity of data management and service management processes across the landscape. This can contain details on the data management capabilities and processes and the related service management processes and an assessment of their application to and how well they have been implemented and operate across data zones and data entities.
• Level 4 – Description of data contents and data processing. This level can contain additional details on the data that is within the scope of the data entity such as datasets, files, tables or other data constructs or data processing steps and activities.
• Level 5 – Individual details of data contents. This can contain further levels of detail down to the individual data field level.
These levels are not necessarily incremental.
Figure 24 – Relationships Between Levels of Data Landscape Model Details
The purpose of the data landscape view is not to become or replace any existing data dictionary or semantic data function within the organisation by adding a parallel set of information. At its core it is a data architecture planning approach.
Page 35
Data Landscape Data Model
Core Data Landscape Data Model
The data model required to describe the core data landscape is quite simple. The following shows it expressed as a simple Entity-Relationship Diagram.
Figure 25 – Data Landscape Core Data Model
The core data model is sufficient to provide a helicopter view of the data landscape. This core data model contains the following data elements:
No. | Data Landscape Data Model Element | Description
1 | Data Entity | This is used to hold details on the data entities in the data landscape
2 | Data Entity Type | This holds the types of the data entities
3 | Data Zone | This holds the data zone where the data entities are located
Page 36
No. | Data Landscape Data Model Element | Description
4 | Data Zone Type | This holds the types of the data zones
5 | Data Entity Relationships | This holds the relationships between entities
6 | Data Relationship Type | This holds the types of the data relationships
7 | Data Interface | This holds details on data interfaces
8 | Data Interface Type | This holds the types of the data interfaces
9 | Data Entity Data Interface | This links data interfaces to data entities
10 | Data Flow | This holds details on data flows
11 | Data Flow Type | This holds the types of data flow
12 | Data Flow Steps | This holds details on steps within data flows
13 | Data Flow Step Type | This holds the types of the data flow step
The core model is sufficient to describe the primary components of the organisation data landscape and to perform the analysis and planning described above.
Extended Data Landscape Data Model
The core data landscape model can be extended to allow for the inclusion of other information such as:
• Components of data entities that could be used to provide more granular information on their constituents – this is described on page 38.
• Attributes of data entities and data zones to describe their characteristics – this is described on page 39.
• Data contents that describe details on the data associated with a data entity – this is described on page 45.
• Application group that links several data entities into a wider application or service – this is described on page 49.
• Events and activities relating to data entities.
• Data management and operations processes as they apply to data entities and their status – this is described in more detail on page 53.
• Subject area model data concepts (part of the Enterprise Data Model) and which data entities are involved in their processing – this is described in more detail on page 67.
These are just examples of the types of extensions that can be performed.
Such extensions must add value, utility and insight to justify their use and the amount of work required to populate the data structures. These extensions are not sequential. They can be applied in any order to the core data model. At a high-level, the core and extended data landscape model can be represented as follows:
Page 37
Figure 26 – Core and Extended Data Landscape Models
No. | Data Landscape Data Model Element | Description
1 | Data Entities | These are instances of data entity types that perform data functions within the data landscape
2 | Data Zones | These are logical groups of data entities
3 | Entities Relationships | Entities can be related to each other
4 | Data Interfaces | Interfaces are data entry or exit points within data entities
5 | Data Flows | Flows represent movement of data between entity interfaces
6 | Data Attributes | Attributes are characteristics of zones, entities, interfaces and flows
7 | Data Components | Entities can be divided into their constituent components
8 | Data Capability and Management Processes | The data landscape and its constituent entities are implemented and operated using data processes
9 | Data Entity Events | Events can be recorded against entities
10 | Data Entity Data Contents | The contents of data entities can be recorded
11 | Application or Service Group | Entities can be grouped into applications
12 | Data Subjects | Data entities can be associated with the processing of data subjects defined in the Subject Area Model
13 | Data Security | Data entities can have security requirements or implications
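The core data model elements listed earlier (zones, entities, relationships, interfaces and flows made up of steps) can be sketched as a handful of record types. This is a minimal illustration only: the class names follow the model element names, but the field names, the sample zone, entities and interfaces, and the `entities_involved` helper are assumptions made for this sketch rather than part of the data landscape model itself.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class DataZone:
    name: str
    zone_type: str  # Data Zone Type


@dataclass(frozen=True)
class DataEntity:
    name: str
    entity_type: str  # Data Entity Type
    zone: DataZone    # Data Zone where the entity is located


@dataclass(frozen=True)
class DataInterface:
    name: str
    interface_type: str  # Data Interface Type
    entity: DataEntity   # Data Entity Data Interface link


@dataclass(frozen=True)
class EntityRelationship:
    source: DataEntity
    target: DataEntity
    relationship_type: str  # Data Relationship Type


@dataclass(frozen=True)
class DataFlowStep:
    sequence: int
    step_type: str  # Data Flow Step Type
    from_interface: DataInterface
    to_interface: DataInterface


@dataclass
class DataFlow:
    name: str
    flow_type: str  # Data Flow Type
    steps: list[DataFlowStep] = field(default_factory=list)

    def entities_involved(self) -> list[str]:
        """Return the names of entities touched by the flow, in step order."""
        seen: list[str] = []
        for step in sorted(self.steps, key=lambda s: s.sequence):
            for interface in (step.from_interface, step.to_interface):
                if interface.entity.name not in seen:
                    seen.append(interface.entity.name)
        return seen


# Hypothetical sample data, invented for illustration.
internal = DataZone("Internal Systems", "Internal")
gateway = DataEntity("Sensor Gateway", "Data Intake", internal)
store = DataEntity("Sensor Data Store", "Operational Data Store", internal)
analytics = DataEntity("Analytics Platform", "Data Analysis", internal)

gateway_out = DataInterface("gateway-out", "Message Queue", gateway)
store_in = DataInterface("store-in", "Bulk Load", store)
store_out = DataInterface("store-out", "Query", store)
analytics_in = DataInterface("analytics-in", "Query", analytics)

flow = DataFlow("Analysis of sensor data", "Batch")
flow.steps.append(DataFlowStep(2, "Transfer", store_out, analytics_in))
flow.steps.append(DataFlowStep(1, "Transfer", gateway_out, store_in))

involved = flow.entities_involved()
print(involved)
```

Deriving the list of entities from the flow steps is one way to obtain a simplified, entity-level view of a detailed data flow from the same underlying records.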
Page 38
The next four sections show possible extensions to the core data model to describe:
• Data entity components and data entity functions – see below
• Data entity attributes – see page 39
• Data entity contents – see page 45
• Application or service groups – see page 49
Data Entity Components and Functions
Data components are intended to hold an additional level of detail on the contents and composition of data entities and the functions those components perform.
Figure 27 – Data Component Level of Detail Extension to Core Data Model
These extended data model elements are:
No. | Data Landscape Data Model Element | Description
14 | Data Component | This holds details on data components
15 | Data Component Type | This holds the types of the data components
16 | Data Component Data Entity | This contains the data components that a data entity contains
Page 39
No. | Data Landscape Data Model Element | Description
17 | Data Function | This contains details on data functions
18 | Data Function Type | This holds the types of the data functions
19 | Data Component Data Function | This contains the data functions that a data component performs
Not all these data elements are required for this data model extension.
Data Entity Attributes
Data entity attributes can be used to store extended details on data zones, entities, interfaces and flows. For example, an attribute called Database Platform could be defined. This can be assigned a list of possible values such as:
• IBM DB2
• Informix
• Microsoft Azure SQL Database
• Microsoft SQL Server
• MySQL
• Oracle
• PostgreSQL
The Database Platform attribute could then be associated with the Data Entity Type of Data Processing Application Operational Data Stores. The Data Entity of Financial System Database can be assigned a Data Entity Type of Data Processing Application Operational Data Stores. The Database Platform attribute of the Data Entity of Financial System Database can then be assigned a value from the list of possible values.
The objective of allowing data entity attributes is not to store and manage detailed configuration information. The DMTF (Distributed Management Task Force) maintain and publish a Common Diagnostic Model https://www.dmtf.org/standards/cdm that contains details on a possible set of IT infrastructure specific attributes. There are other examples of detailed data entity attributes from developers of CMDB (Configuration Management Database) software whose data models contain examples of such attributes.
Some instances of CMDB data models are:
• BMC Atrium Common Data Model – https://docs.bmc.com/docs/ac1902/common-data-model-842265110.html
• IBM Tivoli Common Data Model – http://www.redbooks.ibm.com/redpapers/pdfs/redp4389.pdf
• Microsoft Operation Manager Data Model – https://blogs.technet.microsoft.com/drewfs/2014/08/17/general-purpose-data-model-for-scom-data-warehouse/
Page 40
• ServiceNow CMDB Schema Model – https://docs.servicenow.com/bundle/newyork-servicenow-platform/page/product/configuration-management/concept/c_ConfigurationManagementDatabase.html
These details are included for information only. The data landscape data model does not need to include this level of detail.
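The Database Platform example described earlier can be sketched as a small lookup structure: an attribute with a list of permitted values, linked to a data entity type, and then assigned a value for a specific data entity. The dictionary names and the `assign_attribute` function are hypothetical and exist only for this illustration; only the attribute name, the value list, the entity type and the entity name come from the text.

```python
# Permitted values per attribute (Data Attribute Values).
ALLOWED_VALUES = {
    "Database Platform": {
        "IBM DB2",
        "Informix",
        "Microsoft Azure SQL Database",
        "Microsoft SQL Server",
        "MySQL",
        "Oracle",
        "PostgreSQL",
    },
}

# Attributes linked to each entity type (Data Attribute Data Entity Type).
ATTRIBUTES_BY_ENTITY_TYPE = {
    "Data Processing Application Operational Data Stores": {"Database Platform"},
}

# Entity type of each data entity.
ENTITY_TYPES = {
    "Financial System Database": "Data Processing Application Operational Data Stores",
}

# Values assigned per (entity, attribute) pair (Data Attribute Data Entity Value).
entity_attribute_values: dict[tuple[str, str], str] = {}


def assign_attribute(entity: str, attribute: str, value: str) -> None:
    """Assign a value, checking the attribute applies to the entity's type
    and that the value is in the attribute's permitted list."""
    entity_type = ENTITY_TYPES[entity]
    if attribute not in ATTRIBUTES_BY_ENTITY_TYPE.get(entity_type, set()):
        raise ValueError(f"{attribute!r} is not defined for type {entity_type!r}")
    if value not in ALLOWED_VALUES[attribute]:
        raise ValueError(f"{value!r} is not a permitted value for {attribute!r}")
    entity_attribute_values[(entity, attribute)] = value


assign_attribute("Financial System Database", "Database Platform", "PostgreSQL")
print(entity_attribute_values)
```

Validating against the permitted list keeps attribute values consistent across entities, which is what makes later filtering and comparison across the landscape practical.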
Page 41
Figure 28 – Data Attribute Extensions to Core Data Model
Page 42
These extended data model elements are:
No. | Data Landscape Data Model Element | Description
20 | Data Attribute | This holds details on data attributes that can be assigned to data entities and that will be assigned data entity-specific values
21 | Data Attribute Type | This holds the types of the data attribute in terms of the type of values the attribute can hold
22 | Data Attribute Values | This contains the values associated with the data attribute
23 | Data Attribute Data Entity Type | This holds the data attributes that are linked to specific data entity types and to which values can be assigned
24 | Data Attribute Data Entity Value | This holds the value of the data attribute for each data entity
25 | Data Attribute Data Zone Type | This holds the data attributes that are linked to specific data zone types and to which values can be assigned
26 | Data Attribute Data Zone Value | This holds the value of the data attribute for each data zone
27 | Data Attribute Data Interface Type | This holds the data attributes that are linked to specific data interface types and to which values can be assigned
28 | Data Attribute Data Interface Value | This holds the value of the data attribute for each data interface
29 | Data Attribute Data Flow Type | This holds the data attributes that are linked to specific data flow types and to which values can be assigned
30 | Data Attribute Data Flow Value | This holds the value of the data attribute for each data flow
Not all these data elements are required for this data model extension.
Data entity attributes can be used to hold status and planning information about data entities. These could include:
• Future plans for the data entity
• Status of the underlying technology
• Process status and health
• Support status
• Issues with the data entity
• End-of-life date
For example, the future plans for data entities could include some or all of the following values that indicate the corresponding actions:
1. Reassemble – combine functionality of solution with other solutions to create new combined solution
2. Redevelop – redevelop the custom application and retain its functionality
3. Reduce – stop using functional elements of the current application while retaining it
4. Refactor – change the internal application structure, design and implementation without changing the external appearance and functionality
5. Rehost – move application to new platform without change
6. Relocate – move the data contained in the data entity to another platform
Page 43
7. Repair – resolve problems and issues with the current application while retaining it on the same platform
8. Replace – replace the application with a functionally similar one
9. Replatform – move application to new platform with some limited changes to enable the application to run on the new target platform
10. Research – this represents data technologies that are emerging and are being researched and piloted for possible production application
11. Reserve – retain the application but encapsulate access to its functionality via some form of interface
12. Retain – retain the application entirely in its current form
13. Retire – stop using an obsolete application without explicitly replacing it
The following diagram shows the approximate location of these options along two axes: the future location of the data entity, from its existing location to an external cloud or hosted one, and the level of change involved to the data entity, from none to significant.
Figure 29 – Data Entity Future Options
When this additional status information is available, data entities could then be filtered based on factors such as their future plans.
There may be a temptation to create lots of data entity type-specific attributes that can be used to record information about data entities. However, unless these attributes add value to the data landscape model, they should not be added.
One area that could add value is using data attributes to track the cost or financial impact of data entities. This information can then be used to assess the financial impact of various data transformation options. The following diagram shows a possible view of the financial impact of data entities superimposed on the sample data landscape on page 72.
Page 44
Figure 30 – Financial Impact View of Data Landscape
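Once future plan and cost attributes are recorded against data entities, filtering and a simple financial roll-up are straightforward. The following is a minimal sketch under stated assumptions: the entity records, the `future_plan` and `annual_cost` field names and all cost figures are invented for illustration, and the two helper functions are hypothetical.

```python
from collections import defaultdict

# Illustrative records only: names, plans and costs are invented.
entities = [
    {"name": "Financial System Database", "future_plan": "Replatform", "annual_cost": 120_000},
    {"name": "Legacy Reporting Server", "future_plan": "Retire", "annual_cost": 45_000},
    {"name": "Document Archive", "future_plan": "Relocate", "annual_cost": 30_000},
    {"name": "HR Application", "future_plan": "Retain", "annual_cost": 60_000},
    {"name": "Old File Share", "future_plan": "Retire", "annual_cost": 15_000},
]


def filter_by_plan(records, plan):
    """Return the names of entities whose future plan matches the given value."""
    return [r["name"] for r in records if r["future_plan"] == plan]


def cost_by_plan(records):
    """Sum the recorded annual cost of entities for each future plan value."""
    totals = defaultdict(int)
    for r in records:
        totals[r["future_plan"]] += r["annual_cost"]
    return dict(totals)


retiring = filter_by_plan(entities, "Retire")
totals = cost_by_plan(entities)
print(retiring)
print(totals)
```

A roll-up of this kind gives a first view of the cost associated with each transformation option, for example the running cost that retiring a group of entities would remove.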
Page 45
Data Security and Data Transformation
Data will have security characteristics and requirements in terms of its sensitivity and confidentiality and the impact on the organisation of its loss, from regulatory to financial and reputational. The data attribute extension to the data model can be used to hold security profile information regarding data entities or data subjects processed by those data entities (see details on the subject area model on page 67).
In planning data transformations such as the examples listed on page 72, the security implications can be identified and assessed if the security attributes have been defined. The following diagram illustrates this.
Page 46
Figure 31 – Security Implications of Data Transformation
Page 47
Data Entity Contents
Data entity contents are intended to hold an additional level of detail on the data contents of data entities. This is separate from data entity components that are intended to represent functional elements of a data entity.
Page 48
Figure 32 – Data Content Extensions to Core Data Model
Page 49
These extended data model elements are:
No. | Data Landscape Data Model Element | Description
31 | Data Content Type | This holds details on the types of data content
32 | Data Content Type Data Entity Type | This contains details on data content types that can be assigned to data entity types
33 | Data Entity Data Content Value | This holds the data content values assigned to data content types for specified data entities
Not all these data elements are required for this data model extension.
Application Group or Service
A set of data entities can belong to an application or service. The purpose of this extension to the core data landscape model is to allow data entities to be assigned to applications.
Page 50
Figure 33 – Application/Service Group Extensions to Core Data Model
Page 51
For example, an application that allows external users to interact with it may consist of the following data entities:
Figure 34 – Application View of Data Entities
Common data entities such as those providing infrastructure-related data services can be shared between applications or services.
Figure 35 – Data Entities Shared Between Applications
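The grouping of data entities into applications, including entities shared between applications, can be sketched as a simple mapping that is inverted to answer two questions: which entities are shared, and which applications are affected if a given entity changes. The application and entity names and the three helper functions below are hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical applications and their constituent data entities.
application_entities = {
    "Customer Portal": {"Web Front End", "Customer Database", "Identity Service"},
    "Order Management": {"Order Database", "Identity Service", "Message Broker"},
    "Reporting": {"Data Warehouse", "Order Database"},
}


def applications_by_entity(app_entities):
    """Invert the mapping: for each data entity, the applications that use it."""
    index = defaultdict(set)
    for application, members in app_entities.items():
        for entity in members:
            index[entity].add(application)
    return index


def shared_entities(app_entities):
    """Return the entities that belong to more than one application."""
    index = applications_by_entity(app_entities)
    return {entity: apps for entity, apps in index.items() if len(apps) > 1}


def impacted_applications(app_entities, changed_entities):
    """Return the applications containing any of the changed entities."""
    index = applications_by_entity(app_entities)
    impacted = set()
    for entity in changed_entities:
        impacted |= index.get(entity, set())
    return sorted(impacted)


shared = shared_entities(application_entities)
impacted = impacted_applications(application_entities, ["Order Database"])
print(sorted(shared))
print(impacted)
```

The same inverted index supports both views, so a change to a shared entity such as a common identity or messaging service immediately shows every application it touches.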
Page 52
Being able to group data entities to reflect their involvement and the role they perform in an application or service means that the impact on that application or service can be determined if any of its component data entities change. These extended data model elements are:
No. | Data Landscape Data Model Element | Description
34 | Application/Service | This holds details on applications or services that contain data entities
35 | Application/Service Type | This holds details on an optional set of types of application or service
36 | Application/Service Data Entity | This links applications or services to data entities. A data entity can belong to more than one application or service
37 | Application/Service Role | This holds details on the roles that can be assigned to data entities within applications or services
38 | Application/Service Role Type | This holds details on an optional set of application or service role types
39 | Application/Service Data Entity Roles | This assigns roles to application or service data entities. A data entity can have more than one role for an application or service
40 | Application/Service Environment Type | This holds details on an optional set of environment types that can be assigned to applications or services
41 | Application/Service Data Entity Environment Types | This assigns environment types to application or service data entities
The environment type data element can be used to identify separate environments for an application. Environment values can be defined such as:
• Production
• Pre-Production
• Operations Acceptance Test
• User Acceptance Test
• System/Integration Test
• Development/Unit Test
Page 53
Figure 36 – Application Environments
Not all these data elements are required for this data model extension.
Data Processes and Capabilities
Data Process Framework
The data landscape is neither passive nor static. It must be designed, implemented, managed, administered and operated through the development and application of a range of processes. The extent of their implementation and application should be part of any description and assessment of the state of the data landscape. The state of these processes and the state of their application to specific entities is one measure of the state of the data landscape.
If a means is required to assess the health of the data landscape with respect to its operational state, then a structured operational process framework is required against which that assessment can be performed. This section contains notes on defining such an operational process definition and assessment framework. The objective here is not to define a complete, exact and rigorous process definition assessment framework. The modelling principles listed on page 9 should be applied here. Complexity is the enemy of quick and useful results.
These data-related processes can be grouped and viewed in a number of ways:
• Data Service Management Processes – these are data landscape-specific elements of what should be more general information technology service management processes. They are sets of activities performed to create a result. These are concerned with looking after the pure operational aspects of the data landscape (as part of a wider information technology landscape). This is just one view of the key service management processes that apply to the data landscape.
Page 54
• Data Capability Process Areas – these are data landscape-specific capabilities and the associated processes that actualise their use. These represent skills that will be of varying degrees of importance to each organisation.
• Data Life Stages – this is a view of the stages through which data moves as it is being processed by data entities. Not all of these stages apply to all entities. The entire set of stages may span a number of data entities. The stage view applies to an individual data instance, a set of data processed by a specific solution that may use the facilities of multiple data entities.
Each of these views describes a different aspect of the processes associated with the data landscape. The service management process view describes how well these general service management processes have been implemented and are operated for entities within the data landscape. The data capability process areas are sets of skills and abilities that must exist and be applied to the design and implementation of data entities. The data life stage view takes a cross-functional perspective on data processing and movement through its life stages.
Page 55
Figure 37 – Data Management Processes Views
Page 56
The service management processes and their applicability to the data landscape are:
• Incident Management – handle and manage unplanned interruptions to or reduction in the quality of a data service and restore normal operation as quickly as possible, minimising the negative impact on business operations.
• Problem Management – analyse and determine the root causes of incidents to stop incidents from happening, to eliminate recurring incidents and to minimise the impact of incidents that cannot be stopped.
• Event and Alert Management – detect events and alerts that represent significant occurrences to entities, identify them, determine the appropriate actions to take and collect data for analysis.
• Performance and Capacity Management – analyse resource consumption, determine patterns and trends and ensure there are sufficient resources to support the current and projected future operation of data entities.
• Service Level Management – agree and define service targets and then ensure these targets are met for data entities.
• Asset Management – track data entities through their lifecycle to identify ownership, cost of operation and use and manage upgrade and replacement cycles.
• Resilience, Availability and Continuity Management – ensure that data entities are available, can resist failure and recover quickly from failure and ensure continuity of operations in the event of the loss of data entities.
• Risk Management – identify, assess and control occurrences that could cause loss of or damage to data entities.
• Change Management – manage modifications to, additions to or removal of data entities in a controlled manner to avoid disruption to services.
• People Management – manage people resources required to administer, manage and operate data entities from a service operations view.
• Supplier Management – manage suppliers of services across the duration of the product or service supply contract.
• Knowledge Management – manage information and knowledge systems so that personnel have access to the knowledge needed to effectively perform their work and identify the knowledge needed for service delivery.
Ideally, there will already be a service management framework in operation within the organisation that will have implemented these service processes more generally. These can then be applied specifically to the data landscape.
The data capabilities and their associated processes are:
• Data Governance – planning, supervision and control over data management and use, developing data management and use standards and ensuring compliance with data management processes and standards.
Page 57
• Data Architecture Management – defining data technology standards, defining the approach to managing data assets, use and reuse of and compliance with existing data technology standards, use and reuse of data infrastructural technology solutions.
• Data Model Management – creating standard data models of the data that will be collected, created, stored and processed that formally describe the data contents and structures, including metadata and semantic data, integrating, controlling and providing metadata (descriptive data about the underlying operational data), creation of data description standards and the collection, categorisation, maintenance, integration, application, use and management of data descriptions.
• Data Security Management – ensuring data privacy, confidentiality and appropriate access to data, managing and implementing data classification and preventing data loss.
• Data Solution Design and Implementation Management – ensuring that all the data aspects of the design of information technology solutions are performed to a suitable standard and incorporated into subsequent solution implementation.
• Data Operations Management – providing data storage, data operations and service management support from data acquisition to purging. The service management processes listed above could be subsumed into this capability.
• Data Master, Reference and Quality Management – managing master versions and replicas, management of master versions of shared data resources to reduce redundancy and maintain data quality through standardised data definitions and use of common data values, defining, monitoring and improving data quality.
• Data Audit, Control and Lifecycle Management – managing the definition, collection and analysis of data audit information, using audit information to develop data controls.
• Data Movement, Integration and Transformation Management – data resource integration, data interfaces and flows, extraction, transformation, movement, delivery, replication, transfer, sharing, federation, virtualisation and operational support, business solution data interfaces and integrations.
• Data Location, Synchronisation and Access Management – managing data across storage locations and platforms, both internal and external, synchronising data across platforms and managing and controlling access to data across platforms.
• Data Usability Management – ensuring usability across all elements of the data landscape, ensuring utility, accuracy, consistency and ease of interpretation.
• Data Project Management – supporting and managing the data aspects of projects and solution delivery and handover to production and support.
• Data Insight and Presentation Management – creating data warehouses and data marts, implementing reporting, data visualisation and analytics, defining data metrics and performance and results indicators, implementing and operating processes to ensure action is taken based on data insights.
These data capabilities are not isolated silos. They are interlinked. This does not mean that they cannot and should not be assessed and evaluated separately. The following diagram illustrates some of these data capability linkages.
Page 58
Figure 38 – Connections Between Data Capability Areas
The interconnectedness of the data capabilities and their underpinning implementation and operating processes illustrates the difficulty of assessing one capability independently of others. It is almost certain that if an organisation is good at any one of these capabilities, it will be good at all of them.
There will be three implementation-related aspects to each of these process areas:
1. How well the processes are defined and implemented
2. How well the defined processes are applied, implemented and operated
3. How important and relevant the process is
These aspects apply both to the process in general and to its application to a specific data entity.
Page 59
Figure 39 – Data Process Implementation, Operation and Use Aspects
A process measurement and assessment framework that includes all of these elements would be very complex to use. Gaps in process definition and operation in the data landscape may indicate potential problem areas that may require or would benefit from remediation.
Using a Data Process Framework to Assess the Health of the Data Landscape
The following approach could be used to assess data capabilities across the data landscape. For each data capability, rate the overall importance, implementation status and operation and use status. Then for each data entity, rate each data capability in the same way. This would result in a measurement structure along the following lines:
Page 60
Figure 40 – Data Capability Process Assessment Framework
This is a very complex measurement structure as well as being time-consuming to create, maintain and use. This approach breaches the design principles listed on page 9. While some form of data capability process assessment would be useful in being able to detect potential problems and areas requiring remediation, this approach, without simplification, would be too complex to be usable.
In terms of the extended data model, the data entity attribute approach described on page 39 could be used to hold the process status/health information. For the purposes of identifying issues at a high-level, this should be sufficient.
Data Maturity Models
It is not possible to discuss the topic of (data) processes and their assessment without the subjects of their maturity and the use of maturity models being raised. The purpose of this document is not to discuss data maturity models in detail. This section covers the topic briefly.
Page 61
There has been a growth in the number of informal and ad hoc maturity models across different aspects of data processes. These models lack the rigour and validation and the detailed assessment framework to support their use.
All these maturity models (should) have a common structure:
• There is a set of maturity levels on an ascending scale, typically from 1 to 5:
  5 - Optimising process
  4 - Predictable process
  3 - Established process
  2 - Managed process
  1 - Initial process
• Each maturity level has a number of process areas/categories/groupings. Maturity relates to embedding these processes within the organisation.
• Each process area has a number of processes.
• Each process has generic and specific goals and practices.
• Specific goals describe the unique features that must be present to satisfy the process area.
• Generic goals apply to multiple process areas.
• Generic practices are applicable to multiple processes and represent the activities needed to manage a process and improve its capability to perform.
• Specific practices are activities that contribute to the achievement of the specific goals of a process area.
Figure 41 – Generic Maturity Model Structure
Page 62
Maturity levels are intended to be a way of defining a means of evolving improvements in processes associated with what is being measured.
Figure 42 – Improving Process Maturity
These data maturity models have different areas of focus, as shown in the diagram below.
Page 63
Figure 43 – Data Maturity Models
The data maturity model groups are:
• Data Capability Maturity Model – these define a set of general data capabilities that should encompass all the required data competencies and that can be used to measure the organisation’s overall data process maturity.
• Data Governance Capability Model – these apply maturity to the subset of data capabilities relating to data governance.
• Data Stewardship Capability Model – these apply to the further subset of data governance processes relating to data stewardship: the fitness, quality and usability of data and its metadata.
• Data Analytics Maturity Model – these apply to the subset of processes that apply to data analytics activities.
• Big Data Maturity Model – these apply to the subset of processes that apply to big data and by association data analytics activities.
The following table lists some of the maturity models currently available in these areas. Maturity models come and go with great regularity in the data domain. There are a large number of obsolete and unmaintained data maturity models. Many of the models are developed by vendors who use them to sell their products and services rather than the models being independent assessments of actual and relevant organisation data maturity.
Page 64
Data Maturity Model Type | Example | URL
Data Capability Maturity Model | CMMI Institute Data Management Maturity (DMM) | https://cmmiinstitute.com/data-management-maturity
Data Capability Maturity Model | DAMA International Data Management Body of Knowledge (DMBOK) | https://dama.org/content/body-knowledge
Data Capability Maturity Model | EDM Council DCAM Data Management Capability Assessment Model | https://edmcouncil.org/page/aboutdcamreview
Data Capability Maturity Model | Federal Government Data Maturity Model | https://www.ntis.gov/assets/FDMM.pdf
Data Capability Maturity Model | MIKE2.0 (Method for an Integrated Knowledge Environment) Information Maturity (IM) QuickScan | http://mike2.openmethodology.org/wiki/Information_Maturity_QuickScan
Data Governance Capability Model | NASCIO Data Governance | https://www.nascio.org/EA/ArtMID/572/ArticleID/198/Data-Governance-Managing-Information-As-An-Enterprise-Asset-Part-I-An-Introduction
Data Governance Capability Model | ARMA The Information Governance Maturity Model | https://www.arma.org/page/IGMaturityModel
Data Stewardship Capability Model | NOAA Data Stewardship Capability Model | https://geo-ide.noaa.gov/wiki/index.php?title=Data_Stewardship_Maturity_Questionnaire_(DSMQ)_User%E2%80%99s_Guide
Data Analytics Maturity Model | Gartner Data Analytics Maturity Model | https://www.gartner.com/smarterwithgartner/take-your-analytics-maturity-to-the-next-level/
Data Analytics Maturity Model | Data Science Maturity Model | https://blogs.oracle.com/r/a-data-science-maturity-model-for-enterprise-assessment-part-1
Big Data Maturity Model | CSC | http://csc.bigdatamaturity.com/
Big Data Maturity Model | Hortonworks | http://hortonworks.com/wp-content/uploads/2016/04/Hortonworks-Big-Data-Maturity-Assessment.pdf
Big Data Maturity Model | IBM | https://www.ibmbigdatahub.com/blog/big-data-analytics-maturity-model
Big Data Maturity Model | Info-Tech | https://www.infotech.com/research/ss/leverage-big-data-by-starting-small/it-big-data-maturity-assessment-tool
Big Data Maturity Model | TDWI | https://tdwi.org/pages/maturity-model/big-data-maturity-model-assessment-tool.aspx?m=1
Such maturity models may be useful for specific assessment engagements. But in terms of the overall data landscape a considerably simpler approach is needed. The maturity models listed above are all quite different and have different areas of focus. They are quite detailed, yet they do not cover the full scope of the data landscape and the processes required to support and operate it.
The set of data capabilities listed on page 56 could form the basis of a maturity model. This could then be used to create a data landscape view of data capability process health using a simple traffic light display as shown in the following diagram.
  65. Figure 44 – Data Entity Data Capability Process Health View
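
The traffic light idea can be illustrated with a minimal sketch. It is not taken from the source: the 1 to 5 scoring scale, the thresholds, the `traffic_light` and `process_health_view` functions, and the entity and capability names are all assumptions made for illustration.

```python
from enum import Enum


class Light(Enum):
    RED = "red"
    AMBER = "amber"
    GREEN = "green"


def traffic_light(score, amber_from=2.5, green_from=4.0):
    """Map a maturity score (assumed 1-5 scale) to a traffic light status.

    The thresholds are illustrative assumptions, not values from the source.
    """
    if score >= green_from:
        return Light.GREEN
    if score >= amber_from:
        return Light.AMBER
    return Light.RED


def process_health_view(scores):
    """Return {data entity: {capability: Light}} from nested numeric scores."""
    return {
        entity: {capability: traffic_light(score)
                 for capability, score in capabilities.items()}
        for entity, capabilities in scores.items()
    }


# Hypothetical data entities, capabilities and scores, for illustration only.
example_scores = {
    "Customer Database": {"Data Quality": 4.2, "Data Security": 3.0},
    "Reporting Data Mart": {"Data Quality": 1.8, "Data Security": 4.5},
}

view = process_health_view(example_scores)
for entity, capabilities in view.items():
    for capability, light in capabilities.items():
        print(f"{entity:20} {capability:15} {light.value}")
```

A real assessment would replace the hypothetical scores with results gathered against the organisation's own capability list, and might well use a different scale or thresholds.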
  66. Business Functions and Business Processes

The data entities within the organisation data landscape, grouped into applications, are used to operate business processes. The data that flows into and out of the data entities is used by these business processes. Simplistically, business processes and their interactions with data entities can be represented as follows:

Figure 45 – Business Processes and Interactions with Data Entities

The elements of this are:

1. The business process consists of a series of tasks performed in a sequence.
2. Business applications are used to assist with the performance of these tasks. Data is entered into those applications and modified, new data is generated, data is output, and some or all of this data is stored.
3. The business applications are composed of sets of data entities.

In the same way as individual entities can be grouped to comprise applications or services as described on page 49, the business processes associated with data entities could be defined. This would then allow a business process view of data entities to be created. Within the data landscape model, such business process information could be useful, but it strays from the core purpose of understanding the operation and use of data at a high level within the organisation and planning for future changes.
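
As a hedged illustration of how such a business process view might be derived, the sketch below follows the three elements above: processes consist of tasks, tasks use applications, and applications are composed of data entities. The process, task, application and entity names, and the `business_process_view` function, are hypothetical and do not come from the source.

```python
from collections import defaultdict

# Hypothetical business processes: each is an ordered list of
# (task, application used to perform the task) pairs.
process_tasks = {
    "Order Fulfilment": [
        ("Capture Order", "Order Entry App"),
        ("Ship Order", "Warehouse App"),
    ],
    "Customer Onboarding": [
        ("Register Customer", "CRM App"),
    ],
}

# Hypothetical applications and the data entities that make them up.
application_entities = {
    "Order Entry App": ["Order Database", "Customer Database"],
    "Warehouse App": ["Inventory Database", "Order Database"],
    "CRM App": ["Customer Database"],
}


def business_process_view(process_tasks, application_entities):
    """Return {data entity: sorted list of business processes that use it}."""
    view = defaultdict(set)
    for process, tasks in process_tasks.items():
        for _task, application in tasks:
            # Applications with no recorded entities contribute nothing.
            for entity in application_entities.get(application, []):
                view[entity].add(process)
    return {entity: sorted(processes) for entity, processes in view.items()}


view = business_process_view(process_tasks, application_entities)
for entity in sorted(view):
    print(entity, "->", ", ".join(view[entity]))
```

Inverting the mapping in this way answers the question "which business processes depend on this data entity?", which is the kind of view the paragraph above describes as possible but outside the core purpose of the model.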
