This is Part 4 of the GoldenGate series on Data Mesh - a series of webinars helping customers understand how to move off old-fashioned monolithic data integration architecture and get ready for more agile, cost-effective, event-driven solutions. The Data Mesh is a kind of Data Fabric that emphasizes business-led data products running on event-driven streaming architectures and serverless, microservices-based platforms. These emerging solutions are essential for enterprises that run data-driven services on multi-cloud, multi-vendor ecosystems.
Join this session for a fresh look at Data Mesh; we'll start with core architecture principles (vendor agnostic) and transition into detailed examples of how Oracle's GoldenGate platform provides these capabilities today. We will discuss the essential technical characteristics of a Data Mesh solution and the benefits that business owners can expect by moving IT in this direction. For more background on Data Mesh, Parts 1, 2, and 3 are on the GoldenGate YouTube channel: https://www.youtube.com/playlist?list=PLbqmhpwYrlZJ-583p3KQGDAd6038i1ywe
Webinar Speaker: Jeff Pollock, VP Product (https://www.linkedin.com/in/jtpollock/)
Mr. Pollock is an expert technology leader for data platforms, big data, data integration, and governance. Jeff has been CTO at California startups and a senior exec at Fortune 100 tech vendors. He is currently Oracle VP of Products and Cloud Services for Data Replication, Streaming Data and Database Migrations. While at IBM, he was head of all Information Integration, Replication and Governance products; previously Jeff was an independent architect for the US Defense Department, VP of Technology at Cerebra, and CTO of Modulant – he has been engineering artificial-intelligence-based data platforms since 2001. As a business consultant, Mr. Pollock was a Head Architect at Ernst & Young’s Center for Technology Enablement. Jeff is also the author of “Semantic Web for Dummies” and “Adaptive Information,” a frequent keynote speaker at industry conferences, an author for books and industry journals, formerly a contributing member of W3C and OASIS, and an engineering instructor with UC Berkeley’s Extension for object-oriented systems, software development process, and enterprise architecture.
Presentation on Data Mesh: The paradigm shift is a new type of ecosystem architecture – a shift left towards a modern distributed architecture that treats domain-specific data as a first-class concern, views “data-as-a-product,” and enables each domain to handle its own data pipelines.
Enterprise Architecture vs. Data Architecture – DATAVERSITY
Enterprise Architecture (EA) provides a visual blueprint of the organization, and shows key interrelationships between data, process, applications, and more. By abstracting these assets in a graphical view, it’s possible to see key interrelationships, particularly as they relate to data and its business impact across the organization. Join us for a discussion on how data architecture is a key component of an overall enterprise architecture for enhanced business value and success.
Data Management, Metadata Management, and Data Governance – Working Together – DATAVERSITY
The data disciplines listed in the title must work together. The key to success requires understanding the boundaries and overlaps between the disciplines. Wouldn’t it be great to be able to present the relationships between the disciplines in a simple all-in-one diagram? At the end of this webinar, you will be able to do just that.
This new RWDG webinar with Bob Seiner will outline how Data Management, Metadata Management, and Data Governance can be optimized to work together. Bob will share a diagram that has successfully communicated the relationship between these disciplines to leadership resulting in the disciplines working in harmony and delivering success.
Bob will share the following in this webinar:
- Categories of disciplines focused on managing data as an asset
- A definition of Data Management that embraces numerous data disciplines
- The importance of Metadata Management to all data disciplines
- Why data and metadata require formal governance
- A graphic that effectively exhibits the relationship between the disciplines
Every day, businesses across a wide variety of industries share data to support insights that drive efficiency and new business opportunities. However, existing methods for sharing data demand great effort from the data providers who share it and from the data customers who make use of it.
Existing approaches to data sharing (such as e-mail, FTP, EDI, and APIs) carry significant overhead and friction. For one, legacy approaches such as e-mail and FTP were never intended to support the big data volumes of today. Other data sharing methods also involve enormous effort. All of these methods require not only that the data be extracted, copied, transformed, and loaded, but also that related schemas and metadata be transported as well. This places a burden on data providers to deconstruct and stage data sets, and that burden is mirrored for the data recipient, who must reconstruct the data.
As a result, companies are handicapped in their ability to fully realize the value in their data assets.
Snowflake Data Sharing allows companies to grant instant access to ready-to-use data to any number of partners or data customers without any data movement, copying, or complex pipelines.
Using Snowflake Data Sharing, companies can derive new insights and value from data much more quickly and with significantly less effort than current data sharing methods. As a result, companies now have a new approach and a powerful new tool to get the full value out of their data assets.
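To make the mechanics concrete, here is a minimal sketch of the provider side of a share, issued as Snowflake SQL through the snowflake-connector-python package; the account, database, table, and share names are hypothetical.

```python
# Minimal sketch: publishing a table through Snowflake Data Sharing.
# Assumes the snowflake-connector-python package and a provider account;
# sales_db, public.orders, and partner_acct are hypothetical names.
import snowflake.connector

conn = snowflake.connector.connect(
    account="provider_acct",   # hypothetical account identifier
    user="share_admin",
    password="...",            # use a real secret manager in practice
)
cur = conn.cursor()

# 1. Create the share and grant it access to the objects to be shared.
cur.execute("CREATE SHARE IF NOT EXISTS sales_share")
cur.execute("GRANT USAGE ON DATABASE sales_db TO SHARE sales_share")
cur.execute("GRANT USAGE ON SCHEMA sales_db.public TO SHARE sales_share")
cur.execute("GRANT SELECT ON TABLE sales_db.public.orders TO SHARE sales_share")

# 2. Add the consumer account; no data is copied or moved -- the consumer
#    queries the provider's data in place.
cur.execute("ALTER SHARE sales_share ADD ACCOUNTS = partner_acct")

cur.close()
conn.close()
```

On the consumer side, `CREATE DATABASE ... FROM SHARE provider_acct.sales_share` exposes the shared objects as read-only tables, which is why no extract/load pipeline is needed.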
Data Lakehouse, Data Mesh, and Data Fabric (r2) – James Serra
So many buzzwords of late: Data Lakehouse, Data Mesh, and Data Fabric. What do all these terms mean, and how do they compare to a modern data warehouse? In this session I’ll cover all of them in detail and compare the pros and cons of each. They all may sound great in theory, but I'll dig into the concerns you need to be aware of before taking the plunge. I’ll also include use cases so you can see what approach will work best for your big data needs. And I'll discuss Microsoft's version of the data mesh.
Architect’s Open-Source Guide for a Data Mesh Architecture – Databricks
Data Mesh is an innovative concept addressing many data challenges from an architectural, cultural, and organizational perspective. But is the world ready to implement Data Mesh?
In this session, we will review the importance of core Data Mesh principles, what they can offer, and when it is a good idea to try a Data Mesh architecture. We will discuss common challenges with implementation of Data Mesh systems and focus on the role of open-source projects for it. Projects like Apache Spark can play a key part in standardized infrastructure platform implementation of Data Mesh. We will examine the landscape of useful data engineering open-source projects to utilize in several areas of a Data Mesh system in practice, along with an architectural example. We will touch on what work (culture, tools, mindset) needs to be done to ensure Data Mesh is more accessible for engineers in the industry.
The audience will leave with a good understanding of the benefits of Data Mesh architecture, common challenges, and the role of Apache Spark and other open-source projects for its implementation in real systems.
This session is targeted for architects, decision-makers, data-engineers, and system designers.
Data Architecture Strategies: Data Architecture for Digital Transformation – DATAVERSITY
MDM, data quality, data architecture, and more: these foundational data management approaches remain essential. At the same time, combining them with other innovative techniques can help drive organizational change as well as technological transformation. This webinar will provide practical steps for creating a data foundation for effective digital transformation.
Learn to Use Databricks for the Full ML Lifecycle – Databricks
Machine learning development brings many new complexities beyond the traditional software development lifecycle. Unlike traditional software development, ML developers want to try multiple algorithms, tools and parameters to get the best results, and they need to track this information to reproduce work. In addition, developers need to use many distinct systems to productionize models. In this talk, learn how to operationalize ML across the full lifecycle with Databricks Machine Learning.
This describes a conceptual model approach to designing an enterprise data fabric: the set of hardware and software infrastructure, tools, and facilities used to implement, administer, manage, and operate data operations across the entire span of the enterprise's data. It covers all data activities – acquisition, transformation, storage, distribution, integration, replication, availability, security, protection, disaster recovery, presentation, analytics, preservation, retention, backup, retrieval, archival, recall, deletion, monitoring, and capacity planning – across all data storage platforms, enabling use by applications to meet the data needs of the enterprise.
The conceptual data fabric model represents a rich picture of the enterprise’s data context. It embodies an idealised and target data view.
Designing a data fabric enables the enterprise to respond to and take advantage of key related data trends:
• Internal and External Digital Expectations
• Cloud Offerings and Services
• Data Regulations
• Analytics Capabilities
It enables the IT function to demonstrate positive data leadership. It shows the IT function is able and willing to respond to business data needs. It allows the enterprise to meet data challenges:
• More and more data of many different types
• Increasingly distributed platform landscape
• Compliance and regulation
• Newer data technologies
• Shadow IT where the IT function cannot deliver IT change and new data facilities quickly
It is concerned with the design of an open and flexible data fabric that improves the responsiveness of the IT function and reduces shadow IT.
Every business today wants to leverage data to drive strategic initiatives with machine learning, data science and analytics — but runs into challenges from siloed teams, proprietary technologies and unreliable data.
That’s why enterprises are turning to the lakehouse because it offers a single platform to unify all your data, analytics and AI workloads.
Join our How to Build a Lakehouse technical training, where we’ll explore how to use Apache Spark™, Delta Lake, and other open source technologies to build a better lakehouse. This virtual session will include concepts, architectures and demos.
Here’s what you’ll learn in this 2-hour session:
- How Delta Lake combines the best of data warehouses and data lakes for improved data reliability, performance and security
- How to use Apache Spark and Delta Lake to perform ETL processing, manage late-arriving data, and repair corrupted data directly on your lakehouse (see the sketch below)
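As a taste of that second topic, here is a minimal PySpark sketch of folding a late-arriving batch into a Delta table with MERGE; the paths and column names are invented for illustration.

```python
# Sketch: upserting late-arriving events into a Delta table with MERGE.
# Requires pyspark with the delta-spark package configured; the paths and
# schema here are illustrative only.
from delta.tables import DeltaTable
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder.appName("late-data-merge")
    .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension")
    .config("spark.sql.catalog.spark_catalog",
            "org.apache.spark.sql.delta.catalog.DeltaCatalog")
    .getOrCreate()
)

# Late-arriving batch read from a staging location (hypothetical path).
late_events = spark.read.json("/staging/events/2024-01-15-late")

target = DeltaTable.forPath(spark, "/lakehouse/events")

# MERGE: update rows that already exist, insert the genuinely new ones.
(
    target.alias("t")
    .merge(late_events.alias("s"), "t.event_id = s.event_id")
    .whenMatchedUpdateAll()
    .whenNotMatchedInsertAll()
    .execute()
)
```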
Wonder what this data mesh stuff is all about? What are the principles of data mesh? Can you or should you consider data mesh as the approach for your analytics platform? And most important - how can Snowflake help?
Given in Montreal on 14-Dec-2021
You Need a Data Catalog. Do You Know Why? – Precisely
The data catalog has become a popular discussion topic within data management and data governance circles. A data catalog is a central repository that contains metadata for describing data sets, how they are defined, and where to find them. TDWI research indicates that implementing a data catalog is a top priority among organizations we survey. The data catalog can also play an important part in the governance process. It provides features that help ensure data quality, compliance, and that trusted data is used for analysis. Without an in-depth knowledge of data and associated metadata, organizations cannot truly safeguard and govern their data.
Join this on-demand webinar to learn more about the data catalog and its role in data governance efforts.
Topics include:
· Data management challenges and priorities
· The modern data catalog – what it is and why it is important
· The role of the modern data catalog in your data quality and governance programs
· The kinds of information that should be in your data catalog and why
A work of Zhamak Dehghani, Principal Consultant, ThoughtWorks
https://martinfowler.com/articles/data-monolith-to-mesh.html
How to Move Beyond a Monolithic Data Lake to a Distributed Data Mesh
Many enterprises are investing in their next generation data lake, with the hope of democratizing data at scale to provide business insights and ultimately make automated intelligent decisions. Data platforms based on the data lake architecture have common failure modes that lead to unfulfilled promises at scale. To address these failure modes we need to shift from the centralized paradigm of a lake, or its predecessor data warehouse. We need to shift to a paradigm that draws from modern distributed architecture: considering domains as the first class concern, applying platform thinking to create self-serve data infrastructure, and treating data as a product.
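One way to make “data as a product” concrete is to imagine each domain publishing its data behind an explicit product contract with ownership and quality metadata attached. The sketch below is our own illustration of that idea, not a specification from the article; all field names are invented.

```python
# Illustrative sketch of a domain-owned "data product" contract.
# The fields and names are our invention to make the principle concrete.
from dataclasses import dataclass, field


@dataclass
class DataProduct:
    """A domain team's published, discoverable data interface."""
    name: str                      # e.g. "orders.daily_summary"
    owner_domain: str              # the domain team accountable for it
    output_port: str               # where consumers read it (topic, table, API)
    schema_version: str            # versioned contract, not a shared global model
    sla_freshness_minutes: int     # product-level quality guarantee
    tags: list[str] = field(default_factory=list)


# Each domain registers its own products with the self-serve platform,
# rather than handing raw data to a central pipeline team.
orders_summary = DataProduct(
    name="orders.daily_summary",
    owner_domain="orders",
    output_port="kafka://orders.daily_summary.v1",
    schema_version="1.2.0",
    sla_freshness_minutes=60,
    tags=["gold", "pii-free"],
)
```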
How to Use a Semantic Layer to Deliver Actionable Insights at Scale – DATAVERSITY
Learn about using a semantic layer to enable actionable insights for everyone and streamline data and analytics access throughout your organization. This session will offer practical advice based on a decade of experience making semantic layers work for Enterprise customers.
Attend this session to learn about:
- Delivering critical business data to users faster than ever at scale using a semantic layer
- Enabling data teams to model and deliver a semantic layer on data in the cloud
- Maintaining a single source of governed metrics and business data
- Achieving speed-of-thought query performance and consistent KPIs across any BI/AI tool, like Excel, Power BI, Tableau, Looker, DataRobot, Databricks and more
- Providing dimensional analysis capability that accelerates performance with no need to extract data from the cloud data warehouse
Who should attend this session?
Data & Analytics leaders and practitioners (e.g., Chief Data Officers, data scientists, data literacy, business intelligence, and analytics professionals).
Data Catalog as the Platform for Data Intelligence – Alation
Data catalogs are in wide use today across hundreds of enterprises as a means to help data scientists and business analysts find and collaboratively analyze data. Over the past several years, customers have increasingly used data catalogs in applications beyond their search & discovery roots, addressing new use cases such as data governance, cloud data migration, and digital transformation. In this session, the founder and CEO of Alation will discuss the evolution of the data catalog, the many ways in which data catalogs are being used today, the importance of machine learning in data catalogs, and discuss the future of the data catalog as a platform for a broad range of data intelligence solutions.
Data Profiling, Data Catalogs and Metadata Harmonisation – Alan McSweeney
These notes discuss the related topics of Data Profiling, Data Catalogs and Metadata Harmonisation. They describe a detailed structure for data profiling activities and identify various open source and commercial tools and data profiling algorithms. Data profiling is a necessary prerequisite for constructing a data catalog: a data catalog makes an organisation’s data more discoverable, and the data collected during profiling forms the metadata contained in the catalog. This assists with ensuring data quality and is also a necessary activity for Master Data Management initiatives. The notes describe a metadata structure and provide details on metadata standards and sources.
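As a small illustration of how profiling output becomes catalog metadata, the sketch below computes basic per-column profile statistics with pandas; the file name and the particular metric set are assumptions for the example.

```python
# Sketch: deriving catalog-ready column metadata from a data profile.
# pandas is assumed; "customers.csv" and the metric set are illustrative.
import pandas as pd

df = pd.read_csv("customers.csv")

profile = []
for col in df.columns:
    s = df[col]
    profile.append({
        "column": col,
        "dtype": str(s.dtype),
        "null_pct": round(s.isna().mean() * 100, 2),   # completeness
        "distinct": int(s.nunique()),                  # uniqueness / key candidates
        "sample_values": s.dropna().unique()[:3].tolist(),
    })

# These records become the technical metadata attached to the data set's
# entry in the data catalog.
print(pd.DataFrame(profile))
```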
Data Warehouse or Data Lake, Which Do I Choose? – DATAVERSITY
Today’s data-driven companies have a choice to make – where do we store our data? As the move to the cloud continues to be a driving factor, the choice becomes either the data warehouse (Snowflake et al.) or the data lake (AWS S3 et al.). There are pros and cons to each approach. While data warehouses give you strong data management with analytics, they don’t do well with semi-structured and unstructured data, tightly couple storage and compute, and bring expensive vendor lock-in. On the other hand, data lakes allow you to store all kinds of data and are extremely affordable, but they’re only meant for storage and by themselves provide no direct value to an organization.
Enter the Open Data Lakehouse, the next evolution of the data stack that gives you the openness and flexibility of the data lake with the key aspects of the data warehouse, like management and transaction support.
In this webinar, you’ll hear from Ali LeClerc, who will discuss the data landscape and why many companies are moving to an open data lakehouse. Ali will share perspective on how you should think about what fits best based on your use case and workloads, and how some real-world customers are using Presto, a SQL query engine, to bring analytics to the data lakehouse.
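For a flavour of how Presto brings SQL analytics to the lakehouse, here is a minimal sketch using the presto-python-client package; the coordinator host, catalog, schema, and table names are placeholders.

```python
# Sketch: querying lakehouse data through Presto from Python.
# Assumes the presto-python-client package; connection details are placeholders.
import prestodb

conn = prestodb.dbapi.connect(
    host="presto-coordinator.example.com",
    port=8080,
    user="analyst",
    catalog="hive",      # lake tables exposed via the Hive connector
    schema="sales",
)
cur = conn.cursor()

# Standard SQL over files in the data lake -- no load into a warehouse first.
cur.execute("""
    SELECT region, count(*) AS orders, sum(amount) AS revenue
    FROM orders
    GROUP BY region
    ORDER BY revenue DESC
""")
for row in cur.fetchall():
    print(row)
```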
Data protection and privacy regulations such as the EU’s General Data Protection Regulation (GDPR), the California Consumer Privacy Act (CCPA), and Singapore’s Personal Data Protection Act (PDPA) have been major drivers for data governance initiatives and the emergence of data catalog solutions. Organizations have an ever-increasing appetite to leverage their data for business advantage, either through internal collaboration, data sharing across ecosystems, direct commercialization, or as the basis for AI-driven business decision-making. This requires data governance and especially data asset catalog solutions to step up once again and enable data-driven businesses to leverage their data responsibly, ethically, compliantly, and accountably.
This presentation explores how the data catalog has become a key technology enabler in overcoming these challenges.
Modern Data Architecture for a Data Lake with Informatica and Hortonworks Dat... – Hortonworks
How do you turn data from many different sources into actionable insights and manufacture those insights into innovative information-based products and services?
Industry leaders are accomplishing this by adding Hadoop as a critical component in their modern data architecture to build a data lake. A data lake collects and stores data across a wide variety of channels including social media, clickstream data, server logs, customer transactions and interactions, videos, and sensor data from equipment in the field. A data lake cost-effectively scales to collect and retain massive amounts of data over time, and converts all this data into actionable information that can transform your business.
Join Hortonworks and Informatica as we discuss:
- What is a data lake?
- The modern data architecture for a data lake
- How Hadoop fits into the modern data architecture
- Innovative use-cases for a data lake
Data Architecture Strategies: Building an Enterprise Data Strategy – Where to... – DATAVERSITY
The majority of successful organizations in today’s economy are data-driven, and innovative companies are looking at new ways to leverage data and information for strategic advantage. While the opportunities are vast, and the value has clearly been shown across a number of industries in using data to strategic advantage, the choices in technology can be overwhelming. From Big Data to Artificial Intelligence to Data Lakes and Warehouses, the industry is continually evolving to provide new and exciting technological solutions.
This webinar will help make sense of the various data architectures & technologies available, and how to leverage them for business value and success. A practical framework will be provided to generate “quick wins” for your organization, while at the same time building towards a longer-term sustainable architecture. Case studies will also show how successful organizations have built data strategies to support their business goals.
Ontologies for Emergency & Disaster Management – Stephane Fellah
OGC meeting, March 2014
OGC OWS-10 Cross-Community Interoperability
Ontologies for Emergency & Disaster Management
(The application of geospatial linked data)
Apache Hadoop YARN is the modern distributed operating system for the datacenter. It enables the Hadoop compute layer to be a common resource-management platform that can host a wide variety of applications. Multiple organizations are able to leverage YARN to build their applications on top of Hadoop without repeatedly worrying about resource management, isolation, multi-tenancy issues, etc.
In this talk, we'll first cover the current status of Apache Hadoop YARN – how it is faring today in deployments large and small. We will cover different types of YARN deployments, in different environments and at scale.
We'll then move on to the exciting present & future of YARN – features that are further strengthening YARN as the first-class resource-management platform for datacenters running enterprise Hadoop. We'll discuss the current status as well as the future promise of features and initiatives like 10x scheduler throughput improvements, Docker container support on YARN, native support for long-running services (alongside applications) without any changes, seamless application upgrades, fine-grained isolation for multi-tenancy using cgroups on disk & network resources, powerful scheduling features like application priorities and intra-queue preemption across applications, and operational enhancements including insights through Timeline Service V2, a new web UI, and better queue management.
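To ground the "common resource-management platform" point, any framework can request and submit applications through the ResourceManager's REST API. The sketch below uses the documented new-application and submit endpoints; the host name, resource sizes, and payload details are illustrative, and a real submission also needs an ApplicationMaster launch context.

```python
# Sketch: asking YARN's ResourceManager for an application id and submitting
# a skeleton application via the RM REST API. The RM address and resource
# sizes are placeholders; real submissions also need an am-container-spec
# with launch commands and local resources.
import requests

RM = "http://resourcemanager.example.com:8088"

# Step 1: reserve a new application id.
app = requests.post(f"{RM}/ws/v1/cluster/apps/new-application").json()
app_id = app["application-id"]

# Step 2: submit the application definition against that id.
submission = {
    "application-id": app_id,
    "application-name": "demo-app",
    "application-type": "YARN",
    "resource": {"memory": 1024, "vCores": 1},  # container for the AppMaster
}
resp = requests.post(f"{RM}/ws/v1/cluster/apps", json=submission)
resp.raise_for_status()
print("submitted", app_id)
```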
Speakers:
Sunil Govindan, Senior Software Engineer, Hortonworks
Rohith Sharma K S, Senior Software Engineer, Hortonworks
Software Design Patterns – Consider a company migrating to a third-p.pdf – arorastores
Software Design Patterns:
Consider a company migrating to a third-party cloud-based solution from an internally maintained ecosystem of applications utilizing one current-generation database system, as well as a legacy system for older data. They plan to migrate all data to the cloud-based solution in time. But, for now, they are going to transition to the new cloud-based applications and the cloud-based database for new data, but will rely upon the existing and legacy database for older data. The databases have approximately the same functionality, but different interfaces and languages.
What design pattern highlights the most significant challenge associated with integrating the different databases (as well as one way of addressing it)?
What is that challenge?
Briefly, and in English, describe how the pattern teaches that we should approach this problem. In other words, what is the pattern we should follow for the solution?
Solution
Design patterns like the Factory pattern, Singleton pattern, etc. basically provide solutions to general problems which are faced by software developers during the development phase. These patterns do not play any role in data migration.
There are four stages in data migration:
1. Semantic data models, which comprise the dimensional models, semantic models, and mapping to semantic building blocks.
2. Data mapping specifications, which are used to translate source data to target data.
3. KPIs and data lineage, which are useful in establishing the data lineage for the org and other rightful requirements.
4. End-to-end scope of data models, used to standardise data that is loaded in the data warehouse.
Please follow the list of steps provided below while migrating data to the cloud:
1. Assess the requirements and then plan.
2. Disintegrate the dependencies after the initial assessments.
3. Redesign, re-program and reintegrate.
4. Test the new migrated components.
5. Fine-tune and train.
However, there will be technical issues during data migration. Many firms which migrate data to the cloud proceed in a hybrid model, keeping key elements of their infrastructure in-house and under their control while they outsource less sensitive or core components.
Cloud vendors would always expect the customers to provide or jointly develop a virtual image that specifies their basic server configuration, which is offered as a service after being built inside the cloud. It is required that the IT team also have the skillset to create a VM template which includes the infrastructure, application and security that is required by the enterprise.
OUG Scotland 2014 - NoSQL and MySQL - The best of both worlds – Andrew Morgan
Understand how you can get the benefits you're looking for from NoSQL data stores without sacrificing the power and flexibility of the world's most popular open source database - MySQL.
Similar to Adopting a Canonical Data Model - how to apply to an existing environment with web services (SOA and REST)
Fluentd – Making Logging Easy & Effective in a Multi-cloud & Hybrid Environme... – Phil Wilkins
Presentation I gave at Developer Week Europe 2022 on the use of Fluentd in hybrid and distributed use cases.
This builds on previous Fluentd presentations.
London Oracle Developer Meetup, presented by Luis Weir (@luisw19) and myself.
The presentation focuses on APIs and microservices (with a lot of discussion on the latter).
Exploring Career Paths in Cybersecurity for Technical Communicators – Ben Woelk, CISSP, CPTC
Brief overview of career options in cybersecurity for technical communicators. Includes discussion of my career path, certification options, NICE and NIST resources.
Want to move your career forward? Looking to build your leadership skills while helping others learn, grow, and improve their skills? Seeking someone who can guide you in achieving these goals?
You can accomplish this through a mentoring partnership. Learn more about the PMISSC Mentoring Program, where you’ll discover the incredible benefits of becoming a mentor or mentee. This program is designed to foster professional growth, enhance skills, and build a strong network within the project management community. Whether you're looking to share your expertise or seeking guidance to advance your career, the PMI Mentoring Program offers valuable opportunities for personal and professional development.
Watch this to learn:
* Overview of the PMISSC Mentoring Program: Mission, vision, and objectives.
* Benefits for Volunteer Mentors: Professional development, networking, personal satisfaction, and recognition.
* Advantages for Mentees: Career advancement, skill development, networking, and confidence building.
* Program Structure and Expectations: Mentor-mentee matching process, program phases, and time commitment.
* Success Stories and Testimonials: Inspiring examples from past participants.
* How to Get Involved: Steps to participate and resources available for support throughout the program.
Learn how you can make a difference in the project management community and take the next step in your professional journey.
About Hector Del Castillo
Hector is VP of Professional Development at the PMI Silver Spring Chapter, and CEO of Bold PM. He's a mid-market growth product executive and changemaker. He works with mid-market product-driven software executives to solve their biggest growth problems. He scales product growth, optimizes ops and builds loyal customers. He has reduced customer churn 33%, and boosted sales 47% for clients. He makes a significant impact by building and launching world-changing AI-powered products. If you're looking for an engaging and inspiring speaker to spark creativity and innovation within your organization, set up an appointment to discuss your specific needs and identify a suitable topic to inspire your audience at your next corporate conference, symposium, executive summit, or planning retreat.
About PMI Silver Spring Chapter
We are a branch of the Project Management Institute. We offer a platform for project management professionals in Silver Spring, MD, and the DC/Baltimore metro area. Monthly meetings facilitate networking, knowledge sharing, and professional development. For event details, visit pmissc.org.
The Impact of Artificial Intelligence on Modern Society.pdf – ssuser3e63fc
Just a game
Assignment 3
1. What has made Louis Vuitton's business model successful in the Japanese luxury market?
2. What are the opportunities and challenges for Louis Vuitton in Japan?
3. What are the specifics of the Japanese fashion luxury market?
4. How did Louis Vuitton enter into the Japanese market originally? What were the other entry strategies it adopted later to strengthen its presence?
5. Will Louis Vuitton face any new challenges due to the global financial crisis? How does it overcome them?
New Explore Careers and College Majors 2024 – Dr. Mary Askew
Explore Careers and College Majors is a new online, interactive, self-guided career, major and college planning system.
The career system works on all devices!
For more Information, go to https://bit.ly/3SW5w8W
Adopting a Canonical Data Model - how to apply to an existing environment with web services (SOA and REST)
1. ‘How to implement a canonical data model in an existing SOA estate’
19/05/2014 – phil@mp3monster.org – www.mp3monster.org
2. Introduction
• The following deck attempts to address the question: ‘How to implement a canonical data model in an existing SOA estate’
• To address this we need to understand a number of things:
– Assumptions on the current state of affairs
– The value proposition of adopting a canonical model – no point in an adoption approach that doesn’t deliver value (with tangible or intangible benefits)
– The strategies best suited to delivering the goal
– Appreciate the risks we may expose
3. Assumptions
• By ‘SOA environment’ we assume to mean capability-centric services primarily built with SOAP/WSDL/XSD and REST/JSON technologies
• By ‘data model’ we presume to mean data definitions used in middleware rather than underlying application and data warehouse/marts
• Assumption that the existing estate doesn’t have an interface versioning strategy applied across the board
• Services are woven together to deliver larger capabilities by an ESB
• The approach should be vendor agnostic
4. What do we mean by a Canonical Data Model
• The following definition is from Forrester, 2010 (as part of a blog on a modelling conference¹):
“A canonical information model is a model of the semantics and structure of information that adheres to a set of rules agreed upon within a defined context for communicating among a set of applications or parties.”
• The essence of the various definitions is:
– Internally consistent description of data
– Standard terminology and meaning
– Commonly accepted by all providers & consumers involved in orchestrating interactions of any form
– Definitions are largely technology agnostic (although typically not free of the underpinning representation, i.e. XML/XSD or SQL)
¹ http://blogs.forrester.com/mike_gilpin/10-03-15-field_first_annual_canonical_model_management_forum
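To make the definition tangible, here is a small sketch (ours, not from the deck) of what an agreed canonical Customer definition might look like; in practice the contract would typically be captured as XSD or JSON Schema rather than code, and all field names here are illustrative.

```python
# Sketch: a canonical Customer definition agreed across the estate.
# Field names and types are illustrative only.
from dataclasses import dataclass


@dataclass(frozen=True)
class CanonicalAddress:
    line1: str
    city: str
    postcode: str
    country_code: str          # e.g. ISO 3166-1 alpha-2 -- one agreed meaning


@dataclass(frozen=True)
class CanonicalCustomer:
    customer_id: str           # one agreed identifier scheme
    display_name: str
    email: str
    address: CanonicalAddress  # structural consistency: always the same shape
```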
5. Value of a Canonical Model
• Semantic Consistency – interactions have a common meaning, so no problems of your gadget being my widget
– This means mapping data from an event payload or for WS invocations is easy & less prone to mapping errors
• Structural Consistency – the definition of common data items is always the same
– Eliminates risks of transformation errors
– Potential to reduce transformations in an orchestrated sequence of operations – meaning greater throughput
• Reduced Design Effort – choose appropriate definitions rather than create them
– Picking data definitions from a set of models is easier & less error prone than designing from scratch
• Increases the chance of Information Rich integration
– A predefined data definition increases the chance of providing information-rich events, as you’re just populating objects
– Information-rich events raise the chance of plug-and-play integration (event types match, data shared less likely to need changes to get more data)
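The transform-reduction argument is easy to see in code: with a canonical model each system needs only a mapping to and from the canonical shape, rather than a bespoke mapping to every peer. A sketch, reusing the CanonicalCustomer above, with invented application field names:

```python
# Sketch: per-application adapters to/from the canonical model, so the CRM
# and billing systems never need a direct CRM<->billing mapping. Uses the
# CanonicalCustomer/CanonicalAddress dataclasses from the previous sketch;
# the local field names are invented.

def crm_to_canonical(rec: dict) -> CanonicalCustomer:
    """CRM's local record -> canonical form (one of CRM's two transforms)."""
    return CanonicalCustomer(
        customer_id=rec["custNo"],
        display_name=f'{rec["firstName"]} {rec["lastName"]}',
        email=rec["emailAddr"],
        address=CanonicalAddress(
            line1=rec["addr1"], city=rec["town"],
            postcode=rec["postCode"], country_code=rec["ctry"],
        ),
    )


def canonical_to_billing(c: CanonicalCustomer) -> dict:
    """Canonical form -> billing system's local shape."""
    return {
        "account_ref": c.customer_id,
        "name": c.display_name,
        "contact_email": c.email,
        "postal": f"{c.address.line1}, {c.address.city} {c.address.postcode}",
    }
```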
6. Look at a hypothetical integration and how Canonical Model adoption can change it
7. Organic Growth & Non Canonical Model
[Diagram – integration flow without a canonical model. Key: Aggregate, Split, Transform, Store, Tx Endpoint, Endpoint, Pipe (+Filter), Content Route, De/Normalize, Enrich, Canonical Data, App Data. Icons from Hohpe & Woolf, EAIPatterns.com]
• Organic growth
• No canonical model
• Creating need for multiple transforms & related operations
• Some operations may undo a previous operation
• Canonical application models excluded here
8. Same Systems with Canonical Model (systems not canonical conversant)
[Diagram – the same integration with a canonical model; key as on slide 7]
• Greatly simplified as each system is fronted by a transform to/from local representation to canonical
Same Systems with Canonical Model (some systems canonical-conversant)
[Diagram: the same integration re-drawn once more; key and icons as above]
• The number of transformations is reduced
• The middleware becomes purely a routing / pub-sub delivery mechanism
Select or Create your own Canonical Model?
• Industry-standard models cover a wide number of domains, but will not provide a 100% fit all the time
• Creating your own works …
– in a closed, non-SaaS/COTS environment a custom canonical model can deliver benefits:
• a closer match to the service implementations – a performance gain
• alignment to the business's own language
– creating one needs to take into account the lessons learnt designing enterprise application/DB data models
• Selecting a standard model means …
– you leverage accumulated good-practice lessons and the data needs for interoperability
– industry models are likely to be an 80/20 fit, so you will need your own definitions for business-specific concepts, e.g.
• in optical retail you need to extend the standard definition of Item with the Vision Council of America's clinical elements for lens shape & cut (see the composition sketch below)
– model selection(s) need to be done with care
– make sure the model(s) are sufficiently mature
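One way to make such extensions without editing the standard definition itself is composition; the sketch below is hypothetical – 'StandardItem' and the lens fields are invented stand-ins, not the Vision Council's actual definitions:

  // A standard model's Item covers the 80%; business-specific optical
  // attributes are added alongside it by composition, leaving the
  // standard definition untouched and upgrade-safe.
  record StandardItem(String sku, String description, double unitPrice) {}
  record LensAttributes(String shape, String cut, double sphereDioptres) {}  // illustrative clinical fields
  record OpticalRetailItem(StandardItem item, LensAttributes lens) {}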
Technical Strategies / Decisions Needed
Interface/Payload Versioning
• Interfaces and their payloads need a versioning strategy, as they will change over time; it can be applied via
– the URI – works very well for REST
– XSD schema versioning – common, but a problem for REST+JSON
• Need to know how many versions to actively support
– common to keep current + 1
– factors to account for: the rate of change & the interface users' ability to accommodate that rate of change
• Determine the handling approach (a URI-based sketch follows)
– common URL + ESB conditional logic
– separate URLs + re-use of ESB logic
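A minimal sketch of URI-based versioning, assuming a JAX-RS style REST service; the paths and placeholder payloads are illustrative only:

  import javax.ws.rs.GET;
  import javax.ws.rs.Path;
  import javax.ws.rs.Produces;
  import javax.ws.rs.core.MediaType;

  // The version lives in the URI: /v1 and /v2 coexist while consumers migrate.
  @Path("/v1/customers")
  class CustomersV1 {
      @GET
      @Produces(MediaType.APPLICATION_JSON)
      public String list() { return "[{\"customerId\":\"C1\"}]"; }  // placeholder payload
  }

  @Path("/v2/customers")
  class CustomersV2 {
      @GET
      @Produces(MediaType.APPLICATION_JSON)
      public String list() { return "[{\"customerId\":\"C1\",\"legalName\":\"Acme Ltd\"}]"; }
  }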
Versioning Existing Interfaces – Some Options
• If the existing interface/payload has no version
– the ESB can treat the absence as an implicit version 0
– requires conditional routing logic
• If a versioned interface does exist, then
– if the versioning uses the same strategy, new interface URIs are recommended
– if it differs, the URI can be shared and the ESB used to determine the version
– requires conditional routing logic (a routing sketch follows)
• Or simply create a slightly different URI for the replacement services
– increased volume of code
– endpoint users are aware of the change (less desirable)
– the version condition is implicit
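A sketch of that conditional routing, using Apache Camel's Java DSL purely as a stand-in for whatever ESB is in place; the endpoint and header names are assumptions:

  import org.apache.camel.builder.RouteBuilder;

  // One shared entry point; the ESB inspects a version header and routes
  // accordingly, treating an absent header as the implicit version 0.
  class VersionRouting extends RouteBuilder {
      @Override
      public void configure() {
          from("direct:orders")
              .choice()
                  .when(header("payload-version").isNull())
                      .to("direct:orders-v0")      // legacy callers, implicit v0
                  .when(header("payload-version").isEqualTo("1"))
                      .to("direct:orders-v1")
                  .otherwise()
                      .to("direct:orders-latest");
      }
  }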
Transition States
• There will always be a period of transition when adopting a major change such as a canonical model
– so, when passing on or starting a sequence of event(s), what do I assume about downstream capability?
• This can be addressed by one of several strategies:
– late binding – use UDDI or equivalent to discover the service and the interface version available; great if the overhead is not a problem
– assume the latest version (predicated on the ability to transform down to a previous version), with a programme of work providing a proxy to the legacy interface which transforms down (sketched below)
– software-controlled switching of output (not desirable, as it embeds knowledge of consumers into a service)
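A minimal sketch of the 'assume latest, transform down' strategy; the two version shapes are invented for illustration:

  // Producers emit the newest canonical shape; a proxy in front of a
  // legacy consumer transforms the payload down to the version it
  // still understands.
  record CustomerV2(String customerId, String legalName, String countryCode) {}
  record CustomerV1(String customerId, String legalName) {}

  final class LegacyProxy {
      CustomerV1 downgrade(CustomerV2 latest) {
          // v1 never carried countryCode, so it is simply dropped
          return new CustomerV1(latest.customerId(), latest.legalName());
      }
  }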
REST+JSON Question
• Canonical models supporting the middleware are typically XSD based today – good for SOAP/WSDL services, but REST + JSON is more challenging as no schema is required
• However, organisations are starting to offer JSON models, e.g. as part of OASIS, OAGIS
• Could use REST + XML (more like RPC than proper REST)
• Could publish a JSON-mapped representation (tooling is available) with a description via JSON Schema (an IETF draft) – see the sketch below
– safest when offering a special custom service
– still delivers benefit for internal services
• Remember the R in REST is for Resource, and ideally you want resources to be consistent in definition
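As a sketch of the 'JSON-mapped representation' route, the illustrative canonical type from earlier can be serialised with Jackson (assumed here as the mapping library); the resulting shape is what a JSON Schema document would then describe:

  import com.fasterxml.jackson.databind.ObjectMapper;

  // Re-declaring the illustrative canonical type sketched earlier.
  record CanonicalCustomer(String customerId, String legalName, String countryCode) {}

  class JsonMapping {
      public static void main(String[] args) throws Exception {
          // serialise the canonical type to its JSON representation
          var customer = new CanonicalCustomer("C1", "Acme Ltd", "GB");
          System.out.println(new ObjectMapper().writeValueAsString(customer));
          // -> {"customerId":"C1","legalName":"Acme Ltd","countryCode":"GB"}
      }
  }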
Challenges of Abstraction vs Endpoint Needs
• Current interfaces may be geared towards supporting specific platforms, e.g.
– phone, thick client, IoT (Internet of Things, i.e. agent devices such as smart meters)
– e.g. stripping a generic message down to only the necessary elements, as the device can only handle small payloads
• The strategy for this is to add an endpoint-aware transformation layer between the core canonical model & the ESB (sketched below)
– means the core routing / business aspects of the ESB are not impacted, so changes to routing etc. are unaffected
– clients not needing adaptation can talk directly to the canonical layer
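A minimal sketch of such an endpoint-aware transformation, with a smart-meter reading invented as the example payload:

  // The canonical message carries everything; the endpoint-aware layer
  // projects it down to the few elements a constrained device can handle.
  record CanonicalReading(String meterId, double value, String unit,
                          String siteName, String operatorNotes) {}
  record SlimReading(String meterId, double value) {}  // small-payload shape

  final class DeviceFacade {
      SlimReading forSmartMeter(CanonicalReading full) {
          return new SlimReading(full.meterId(), full.value());
      }
  }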
Governance
• All of the preceding considerations will need to be factored into any governance processes…
– Design time
• assurance that the correct approaches identified are being adopted
• adoption is for the right reasons
– Execution time
• the ability to ascertain adoption, efficacy, etc.
• We started out with the declaration that we're working in a SOA context, which should mean
– SOA Governance is in place and can support these governance goals (Open Group: http://www.opengroup.org/soa/source-book/gov/sgvm_artifacts.jpg)
Having understood why & how to adopt a Canonical Model, we can look at execution
Implementation Strategy
• Define the architectural strategies (i.e. engage with the previous rationale and challenges)
• Ensure the groundwork is in place to enable correct development; this could be delivered by
– a reference implementation
– a documentation set with policy & practice
– very detailed requirements (including test definitions)
– support from an architect involved in pair programming
– workshops
Implementation Strategy (continued)
• Start small & grow as …
– knowledge & understanding develop
– principles, ideas and approach are refined
– this helps manage risk & impact
– the initial work can be made 'referenceable'
• Assess & measure
– helps build cost/benefit – ROI insights (both hard and soft factors)
– informs planning & estimation downstream
– ensures implementation quality & sustainability
TOGAF View of Realising a Canonical Model
[Diagram: the TOGAF ADM cycle, annotated with canonical-model activities:]
• a defined objective for how and why a Canonical Model is to be adopted (assuming other principles etc. are already set)
• establish the business direction of travel so suitable model(s), and opportunities for a pilot, can be identified
• determine the key business data structures
• build the Technical Reference Model & Standards Information Base
• look at opportunities for piloting canonical adoption
• as this is not greenfield, a transition strategy is needed
• hands-on support – key to identifying lessons and approaches that accelerate & ease adoption
• apply refinements to the pilot; depending upon scope, plan the next cycle
• set the direction of travel and the scope for change
Activities from an Execution Sequence Perspective
• Identify canonical model(s) & strategies
– which model(s) to use
– approaches to, and impact of, handling the transition
• Develop model knowledge
– interpretation/language
– versioning
– ensure guidance & supporting information is ready
• Determine a non-critical integration programme
– scope to cover various patterns of use and impact
– agree benchmarks to establish value
– develop a detailed implementation plan
• Start development
– ensure testing of the existing interfaces is in place, so no impact can be assured
– develop the initial interfaces inc. interface & e2e tests
• Regression testing
– ensure the different message types & versions are exercised
– check for changes in type within end-to-end execution
– deployment strategy included
• Performance & other pre-prod tests
– canonical models can be heavier, therefore ensure performance is considered
– assess the value of the approach
• Assuming success…
– expand adoption
– knowledge transfer to the wider team etc.
– programme of full adoption
• Establish monitoring (a sketch follows)
– need to determine when legacy interfaces stop being used
– retire interfaces at the appropriate time
• Iterate the development process
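One lightweight way to gather that usage data is sketched below as a JAX-RS request filter counting hits on legacy paths; the 'v1/' prefix and the logging are illustrative assumptions:

  import java.util.concurrent.atomic.AtomicLong;
  import javax.ws.rs.container.ContainerRequestContext;
  import javax.ws.rs.container.ContainerRequestFilter;
  import javax.ws.rs.ext.Provider;

  // Counts calls that still hit legacy (v1) paths, giving hard data on
  // when the old interfaces can safely be retired.
  @Provider
  class LegacyUsageFilter implements ContainerRequestFilter {
      private static final AtomicLong legacyHits = new AtomicLong();

      @Override
      public void filter(ContainerRequestContext ctx) {
          if (ctx.getUriInfo().getPath().startsWith("v1/")) {
              long n = legacyHits.incrementAndGet();
              // in practice this would feed a metrics system rather than stdout
              System.out.println("legacy v1 call #" + n + ": " + ctx.getUriInfo().getPath());
          }
      }
  }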
Reminder & Questions
Remember!
• This has been done before – so make sure you're considering best-practice recommendations (particularly from your preferred vendors)
Questions
Thank you
Editor's Notes
Notes:
Approach:
Need uptake enablement
Tech considerations
- fit within larger dev lifecycle
On the basis that this presentation has been requested – only a minimal personal introduction
If the audience is small, then questions as we go
SOA can mean many things to many people – so let's declare the interpretation
Differentiate the data model from the DB perspective and from the middleware perspective
Some assumptions are made, on the premise that it's an excuse to illustrate some useful points/thinking
This quote comes from a Forrester blog by Mike Gilpin, March 15 2010: http://blogs.forrester.com/mike_gilpin/10-03-15-field_first_annual_canonical_model_management_forum
Alternate definitions:
http://www.theintegrationengineer.com/canonical-data/
http://www.information-management.com/issues/2007_50/10001733-1.html
http://www.eai-ideas.com/architecture-ideas/soa-and-canonical-data-model-cdm
http://soapatterns.org/design_patterns/canonical_schema
http://www.soa-probe.com/2010/09/canonical-data-model.html#more
http://blogs.msdn.com/b/nickmalik/archive/2007/06/12/canonical-model-canonical-schema-and-event-driven-soa.aspx
http://xml.fido.gov/documents/completed/oagi/oagis.htm
To understand what value a canonical data model can bring, it is necessary to determine what changes, and therefore how best to go about adoption
Erl, p62 – Service-Oriented Architecture: Concepts, Technology & Design – 3.4.5 – leverage XML capabilities to richly define the data; the groundwork for intrinsic interoperability
The cost and effort of application design is reduced after the proliferation of standardized XML data
http://blog.digitalml.com/canonical-models-should-be-a-core-component-of-your-api-strategy/
http://blogs.forrester.com/mike_gilpin/12-05-29-canonical_information_models_play_important_role_in_api_layers_increasing_service_reuse
The next couple of slides illustrate the potential value of adopting a canonical model
Integrations get added in a fairly unstructured, 'patch things on' manner
The sort of thing that can happen as an evolution on from point-to-point connectivity
Visualisation of the canonical data structures is not shown
All endpoints are abstracted by a transformation that converts the local data structures to/from a canonical representation (not shown here)
As the diagram shows, the bulk of the activity is transformation in/out bound from endpoints, and then just routing
Eliminating transforms and counter-transforms
The risk of modifying routing/sequencing is greatly reduced
Many COTS products can handle canonical data models.
Custom-developed solutions can be built to be conversant with the canonical data model, eliminating transforms
The middleware moves towards routing considerations
E2e is more efficient
Source the model from OAGI / OASIS / eTOM, or develop your own
Developing your own model will be time-consuming and challenging – a strategy for extensibility is critical
As we are changing an existing estate, we need to handle changes to the interface
A canonical model is typically based on XSD, but REST favours a JSON payload
Tools options to assist:
http://javaoraclesoa.blogspot.co.uk/2012/12/a-reusable-solution-for-conversion.html
http://www.balisage.net/Proceedings/vol7/html/Lee01/BalisageVol7-Lee01.html
https://www.oasis-open.org/resources/topics/rest-json
http://www.jsonschema.net/index.html
Canonical Data Model management should be subject to some form of governance framework
If you're practising full SOA governance then this will already be the case
As we've focused on all the whys and benefits, all the architectural decisioning should have been made (a pre-requisite)
We can focus on the mechanics of converting the architecture into a reality
In terms of groundwork, some form of documentation is necessary to enable growth and informed decisioning by others in the future. Remember, even Agile says:
We value … working software over comprehensive documentation (http://agilemanifesto.org/)
You'll note that the proposed sequence doesn't perfectly follow TOGAF phases A–E, but it does cover all the bases; F–H is followed more tightly
TOGAF recognises it may need to be iterative