The document discusses two approaches to managing domains in a data mesh architecture: the open model and the strict model. The open model gives domain teams the freedom to choose their own tools and data storage, but it relies on disciplined teams to avoid inconsistencies. The strict model predefines domain environments with no customization allowed and places data persistence under central management, ensuring consistency at the cost of a larger platform-engineering effort. Each has pros and cons depending on the organization and use case.
Presentation on Data Mesh: the paradigm shift to a new type of ecosystem architecture, a modern distributed architecture that keeps data domain-specific, treats "data as a product," and enables each domain to handle its own data pipelines.
Data Mesh in Azure using Cloud Scale Analytics (WAF), by Nathan Bijnens
This document discusses moving from a centralized data architecture to a distributed data mesh architecture. It describes how a data mesh shifts data management responsibilities to individual business domains, with each domain acting as both a provider and consumer of data products. Key aspects of the data mesh approach discussed include domain-driven design, domain zones to organize domains, treating data as products, and using this approach to enable analytics at enterprise scale on platforms like Azure.
Know whether cloud-based storage or dedicated storage is best for your business IT infrastructure, depending on your organization's requirements. Check Netmagic's insights.
Data lakes are central repositories that store large volumes of structured, semi-structured, and unstructured data. They are ideal for machine learning use cases and support both SQL-based access and programmatic distributed data processing frameworks. Data lakes can store data in the same format as the source systems or transform it before storing it. They support native streaming and are well suited to storing raw data without a predetermined use case. Data quality and governance practices are crucial to avoid a data swamp. Data lakes enable end users to leverage insights for improved business performance and enable advanced analytics.
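The raw-zone storage pattern described above can be sketched in a few lines. This is a minimal illustration only: a local filesystem stands in for object storage (S3, ADLS, and the like), and the partition scheme and file names are assumptions, not a standard.

```python
from pathlib import Path
import json
import tempfile

# A sketch of landing raw data in a lake: records are kept in their
# source format (here, JSON lines), partitioned by source system and
# ingestion date, with no transformation on write ("schema on read").

def land_raw_record(lake_root: Path, source: str, date: str, record: dict) -> Path:
    """Append one raw record to the lake, preserving its source format."""
    partition = lake_root / "raw" / source / f"ingest_date={date}"
    partition.mkdir(parents=True, exist_ok=True)
    target = partition / "part-0000.jsonl"
    with target.open("a") as f:
        f.write(json.dumps(record) + "\n")
    return target

lake = Path(tempfile.mkdtemp())
path = land_raw_record(lake, "crm", "2024-01-15", {"customer_id": 42, "event": "signup"})
```

Because nothing is transformed on write, the governance practices the summary mentions (catalogs, quality rules, ownership) are what keep such a layout from becoming a data swamp.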
Data and Application Modernization in the Age of the Cloud, by redmondpulver
Data modernization is key to unlocking the full potential of your IT investments, both on premises and in the cloud. Enterprises and organizations of all sizes rely on their data to power advanced analytics, machine learning, and artificial intelligence.
Yet the path to modernizing legacy data systems for the cloud is full of pitfalls that cost time, money, and resources. These issues include high hardware and staffing costs, difficulty moving data and analytical processes to cloud environments, and inadequate support for real-time use cases. These issues delay delivery timelines and increase costs, impacting the return on investment for new, cutting-edge applications.
Watch this webinar in which James Kobielus, TDWI senior research director for data management, explores how enterprises are modernizing their mainframe data and application infrastructures in the cloud to sustain innovation and drive efficiencies. Kobielus will engage John de Saint Phalle, senior product manager at Precisely, in a discussion that addresses the following key questions:
- When should enterprises consider migrating and replicating all their data assets to modern public clouds vs. retaining some on-premises in hybrid deployments?
- How should enterprises modernize their legacy data and application infrastructures to unlock innovation and value in the age of cloud computing?
- What are the key investments that enterprises should make to modernize their data pipelines to deliver better AI/ML applications in the cloud?
- What is the optimal data engineering workflow for building, testing, and operationalizing high-quality modern AI/ML applications in the cloud?
- What role does real-time replication play in migrating data and applications to modern cloud data architectures?
- What challenges do enterprises face in ensuring and maintaining the integrity, fitness, and quality of the data that they migrate to modern clouds?
- What tools and methodologies should enterprise application developers use to refactor and transform legacy data applications that have migrated to modern clouds?
Your Data is Waiting. What are the Top 5 Trends for Data in 2022? (ASEAN), by Denodo
Watch full webinar here: https://bit.ly/3saONRK
COVID-19 has pushed every industry and organization to embrace digital transformation at scale, upending the way many businesses will operate for the foreseeable future. Organizations no longer tolerate monolithic and centralized data architecture; they are embracing flexibility, modularity, and distributed data architecture to help drive innovation and modernize processes.
The pandemic has compelled organizations to accelerate their digital transformation initiatives and look for smarter and more agile ways to manage and leverage their corporate data assets. Data governance has become challenging in the ever-increasing complexity and distributed nature of the data ecosystem. Interoperability, collaboration and trust in data are imperative for a business to succeed. Data needs to be easily accessible and fit for purpose.
In this session, Denodo experts will discuss 5 key trends that are expected to be top of mind for CIOs and CDOs:
- Distributed Data Environments
- Decision Intelligence
- Modern Data Architecture
- Composable Data & Analytics
- Hyper-personalized Experiences
Traditionally, data integration has meant compromise. No matter how rapidly data architects and developers completed a project, speed came at the expense of quality. If they focused instead on delivering a quality project, it would generally drag on for months and exceed its deadline. And if teams pursued both quality and rapid delivery, costs would invariably exceed the budget. Whichever path was chosen, the result was less than desirable. This has led some experts to revisit the scope of data integration, which is the focus of this write-up.
Data Mesh is a decentralized architecture in which the unit of architecture is a domain-driven data set treated as a product. Each product is owned by the domain or team that knows the data most intimately, whether they create it or consume and re-share it, with clearly allocated roles that carry the accountability and responsibility for providing that data as a product. Complexity is abstracted away into a self-serve infrastructure layer so that teams can create these products much more easily.
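The "data as a product" idea above can be made concrete with a small metadata contract: each domain-owned data set carries explicit ownership, schema, and freshness guarantees so consumers can discover and trust it. This is only a sketch; the field names (`domain`, `freshness_sla_hours`, and so on) are illustrative assumptions, not a standard.

```python
from dataclasses import dataclass, field

# A minimal data-product contract: the owning domain publishes not just
# the data but its metadata -- owner, schema, and an SLA -- so other
# domains can consume it as a product rather than a raw table.

@dataclass
class DataProduct:
    name: str
    domain: str                 # owning domain team
    owner: str                  # accountable product owner
    schema: dict                # column name -> type: the published contract
    freshness_sla_hours: int    # how stale the data is allowed to be
    tags: list = field(default_factory=list)

    def describe(self) -> str:
        cols = ", ".join(f"{c}:{t}" for c, t in self.schema.items())
        return f"{self.domain}/{self.name} (owner: {self.owner}) [{cols}]"

orders = DataProduct(
    name="orders",
    domain="sales",
    owner="sales-data-team",
    schema={"order_id": "int", "amount": "decimal", "placed_at": "timestamp"},
    freshness_sla_hours=24,
)
```

In a real mesh, a self-serve platform would register such contracts in a catalog; here the point is only that ownership and guarantees travel with the data set.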
This document discusses Accenture's methodology for migrating enterprise data platforms to the cloud at scale. It involves establishing a transformation office, standing up the target cloud data platform, migrating data and code in waves with change management, updating skills and operating models, implementing new governance, and decommissioning legacy systems. The key steps are developing a business case and migration strategy through discovery, planning the technology architecture and migration approach, and executing the migration while validating data and code through proofs of concept and migration waves.
Lecture4 big data technology foundations, by hktripathy
The document discusses big data architecture and its components. It explains that big data architecture is needed when analyzing large datasets over 100GB in size or when processing massive amounts of structured and unstructured data from multiple sources. The architecture consists of several layers including data sources, ingestion, storage, physical infrastructure, platform management, processing, query, security, monitoring, analytics and visualization. It provides details on each layer and their functions in ingesting, storing, processing and analyzing large volumes of diverse data.
This document discusses Saxo Bank's plans to implement a data governance solution called the Data Workbench. The Data Workbench will consist of a Data Catalogue and Data Quality Solution to provide transparency into Saxo's data ecosystem and improve data quality. The Data Catalogue will be built using LinkedIn's open source DataHub tool, which provides a metadata search and UI. The Data Quality Solution will use Great Expectations to define and monitor data quality rules. The document discusses why a decentralized, domain-driven approach is needed rather than a centralized solution, and how the Data Workbench aims to establish governance while staying lean and iterative.
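The expectation-style quality rules mentioned above can be illustrated in plain Python. To be clear, this is not the Great Expectations API the summary refers to; it is only a sketch of the underlying idea: declarative, named rules evaluated against each batch of data, producing a pass/fail report that a governance process can monitor.

```python
# Expectation-style data quality checks: each rule is a named predicate
# over a batch of rows, and run_checks produces a {name: passed} report.

def expect_not_null(rows, column):
    """Every row must have a non-null value in the given column."""
    return all(r.get(column) is not None for r in rows)

def expect_between(rows, column, low, high):
    """Every value in the column must fall within [low, high]."""
    return all(low <= r[column] <= high for r in rows)

def run_checks(rows, checks):
    """Evaluate each named check against the batch and report results."""
    return {name: check(rows) for name, check in checks.items()}

batch = [
    {"trade_id": 1, "price": 101.5},
    {"trade_id": 2, "price": 99.8},
]
report = run_checks(batch, {
    "trade_id_not_null": lambda rows: expect_not_null(rows, "trade_id"),
    "price_in_range": lambda rows: expect_between(rows, "price", 0, 10_000),
})
```

A failing check would surface in the report as `False`, which is the hook a data quality solution uses to alert the owning domain rather than silently publishing bad data.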
Large Bank Leverages the Denodo Platform as the Foundation for a Shift to a D..., by Denodo
More info here: https://bit.ly/3WtXm5L
Every day, this bank serves customers across 12 U.S. states with reliable, state-of-the art banking services. Increasingly, this has required the bank to seamlessly reach customers and potential customers across multiple channels including in-person, ATM, mobile, and online channels, with digital services running 24/7 under strict levels of speed, availability, and accuracy. To address these needs, the bank planned a digital transformation that included a re-design of the entire data infrastructure.
The bank wanted to move from a traditional, centralized data infrastructure to a modern, decentralized, data mesh infrastructure. The bank leveraged the Denodo Platform as a foundational element for this profound transformation. With the Denodo Platform, the bank established a logical data fabric across the bank’s existing data infrastructure, which enables real-time access to any data source in the organization, without the data consumer needing to know where the source system is located.
The Denodo Platform enabled the bank to meet or exceed all of the company's expectations for a new data-mesh-based infrastructure that could support the bank's vision for a digital transformation, including robust, domain-driven data marketplaces with data catalog capabilities.
Why Data Mesh Needs Data Virtualization (ASEAN), by Denodo
This document provides an agenda and overview for a lunch and learn session on how data virtualization can enable a data mesh architecture. The session will discuss what a data mesh is, how it addresses challenges with centralized data management, and how data virtualization tools allow domains to create and manage their own data products while maintaining governance. It highlights how data virtualization maintains domain autonomy, provides self-serve capabilities, and enables federated computational governance in a data mesh. The presentation will demonstrate Denodo's data virtualization platform and discuss why a data lake alone may not be sufficient for a data mesh, as data virtualization offers more flexibility and reuse.
Cloud Data Management: The Future of Data Storage and Management, by FredReynolds2
Data is the essence of any business, providing the organization, its people, and its customers with timely and historical decision support. The importance of data management cannot be overstated. To maximize the benefits of cloud data management, businesses must first establish a mechanism for separating master data from other data types. Due diligence is required when choosing a data management platform and system, and it is here that the potential of cloud-based data management emerges, raising the significance of these decisions.
The document discusses Microsoft's approach to implementing a data mesh architecture using their Azure Data Fabric. It describes how the Fabric can provide a unified foundation for data governance, security, and compliance while also enabling business units to independently manage their own domain-specific data products and analytics using automated data services. The Fabric aims to overcome issues with centralized data architectures by empowering lines of business and reducing dependencies on central teams. It also discusses how domains, workspaces, and "shortcuts" can help virtualize and share data across business units and data platforms while maintaining appropriate access controls and governance.
Evolving Big Data Strategies: Bringing Data Lake and Data Mesh Vision to Life, by SG Analytics
New data technologies, alongside legacy infrastructure, are driving market innovations like personalized offers, real-time alerts, and predictive maintenance. However, these technical additions, ranging from data lakes to analytics platforms to stream processing and data mesh, have increased the complexity of data architectures. They significantly hamper an organization's ongoing ability to deliver new capabilities while ensuring the integrity of artificial intelligence (AI) models. https://us.sganalytics.com/blog/evolving-big-data-strategies-with-data-lakehouses-and-data-mesh/
AtomicDB is a proprietary software technology that uses an n-dimensional associative memory system instead of a traditional table-based database. This allows information to be stored and related in a way analogous to human memory. The technology does not require extensive programming and can rapidly build and modify information systems to meet evolving needs. It provides significant cost and performance advantages over traditional databases for managing complex, relational data.
Denodo Partner Connect: A Review of the Top 5 Differentiated Use Cases for th..., by Denodo
Watch full webinar here: https://buff.ly/46pRfV7
This Denodo session explores the power of data virtualization, shedding light on its architecture, customer value, and a diverse range of use cases. Attendees will discover how the Denodo Platform enables seamless connectivity to various data sources while effortlessly combining, cleansing, and delivering data through 5 differentiated use cases.
Architecture: Delve into the core architecture of the Denodo Platform and learn how it empowers organizations to create a unified virtual data layer. Understand how data is accessed, integrated, and delivered in a real-time, agile manner.
Value for the Customer: Explore the tangible benefits that Denodo offers to its customers. From cost savings to improved decision-making, discover how the Denodo Platform helps organizations derive maximum value from their data assets.
Five Different Use Cases: Uncover five real-world use cases where Denodo's data virtualization platform has made a significant impact. From data governance to analytics, Denodo proves its versatility across a variety of domains.
- Logical Data Fabric
- Self Service Analytics
- Data Governance
- 360-Degree Views of Entities
- Hybrid/Multi-Cloud Integration
Watch this illuminating session to gain insights into the transformative capabilities of the Denodo Platform.
Businesses of all sizes can benefit from better use of their data to gain insights. Learn how the cloud can help overcome common data challenges and accelerate transformation with cloud technology.
https://www.rapyder.com/cloud-data-analytics-services/
A data-driven organization has an edge over its competitors. Information now flows in from countless sources, driven by the growing adoption of mobility, IoT, and cloud computing. As a result, managing the variety of data types stored across many repositories has become a challenging task.
Today, the enterprise data fabric has emerged as a crucial tactic for sharing diverse, distributed, and dynamic records with frictionless access. The strategy provides a robust alternative to high-cost, low-value integration cycles. It also meets the increasing demand for real-time information sharing.
Three reasons why data virtualization is poised to play a key role in data management:
1) Data management challenges are increasing due to needs for quick response times, large and diverse data sources like social media and sensors, and many data management tools.
2) Data virtualization can address these challenges by providing a unified, secure access layer and delivering data as a service to meet business needs.
3) Data virtualization allows for a hybrid data storage model with data stored in both data warehouses and cheaper storage like Hadoop, and provides a common way to access both through its virtualization layer.
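The unified access layer described in the three points above can be sketched as a small registry that maps logical view names to heterogeneous physical sources, so consumers query one interface regardless of where the data lives. The class and method names here are illustrative assumptions, and plain dicts stand in for a warehouse and a cheaper raw store.

```python
# A toy virtual data layer: consumers query logical view names; the
# layer resolves each name to whichever physical backend holds the data.

class VirtualLayer:
    def __init__(self):
        self._sources = {}

    def register(self, name, fetch_fn):
        """Map a logical view name to a function that reads a physical source."""
        self._sources[name] = fetch_fn

    def query(self, name):
        """Fetch through the virtual layer; the caller never sees the backend."""
        if name not in self._sources:
            raise KeyError(f"no view registered for {name!r}")
        return self._sources[name]()

# Two stand-in backends: a "warehouse" and cheaper "raw" storage.
warehouse = {"customers": [{"id": 1, "name": "Acme"}]}
raw_store = {"clicks": [{"page": "/home", "ts": 1700000000}]}

layer = VirtualLayer()
layer.register("customers", lambda: warehouse["customers"])
layer.register("clicks", lambda: raw_store["clicks"])
```

The hybrid-storage point above is exactly this: `customers` could live in a warehouse and `clicks` in Hadoop or object storage, yet both are reached the same way through the virtualization layer.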
Enabling a Data Mesh Architecture with Data Virtualization, by Denodo
Watch full webinar here: https://bit.ly/3rwWhyv
The Data Mesh architectural design was first proposed in 2019 by Zhamak Dehghani, principal technology consultant at Thoughtworks, a technology company that is closely associated with the development of distributed agile methodology. A data mesh is a distributed, de-centralized data infrastructure in which multiple autonomous domains manage and expose their own data, called “data products,” to the rest of the organization.
Organizations adopt a data mesh architecture when they experience the shortcomings of highly centralized architectures, such as the lack of domain-specific expertise in data teams, the inflexibility of centralized data repositories in meeting the specific needs of different departments within large organizations, and the slowness of centralized data infrastructures in provisioning data and responding to changes.
In this session, Pablo Alvarez, Global Director of Product Management at Denodo, explains how data virtualization is your best bet for implementing an effective data mesh architecture.
You will learn:
- How data mesh architecture not only enables better performance and agility, but also self-service data access
- The requirements for “data products” in the data mesh world, and how data virtualization supports them
- How data virtualization enables domains in a data mesh to be truly autonomous
- Why a data lake is not automatically a data mesh
- How to implement a simple, functional data mesh architecture using data virtualization
Organizations today face massive data growth and must choose between dedicated storage systems or cloud-based storage. There are pros and cons to each. Dedicated storage offers more control over data but requires infrastructure investment, while cloud storage provides scalability and flexibility at a lower cost but with less control. The best choice depends on an organization's unique needs, such as data security, compliance requirements, workload performance needs, and cost factors. The document provides details on how different data types and importance levels may be best suited for different storage technologies.
Govern and Protect Your End User Information, by Denodo
Watch this Fast Data Strategy session with speakers Clinton Cohagan, Chief Enterprise Data Architect, Lawrence Livermore National Lab & Nageswar Cherukupalli, Vice President & Group Manager, Infosys here: https://buff.ly/2k8f8M5
In its recent report “Predictions 2018: A year of reckoning”, Forrester predicts that 80% of firms affected by GDPR will not comply with the regulation by May 2018. Of those noncompliant firms, 50% will intentionally not comply.
Compliance doesn’t have to be this difficult! What if you have an opportunity to facilitate compliance with a mature technology and significant cost reduction? Data virtualization is a mature, cost-effective technology that enables privacy by design to facilitate compliance.
Attend this session to learn:
• How data virtualization provides a compliance foundation with data catalog, auditing, and data security.
• How you can enable single enterprise-wide data access layer with guardrails.
• Why data virtualization is a must-have capability for compliance use cases.
• How Denodo’s customers have facilitated compliance.
The document discusses the need for converged backup solutions that can simplify and consolidate data protection across mixed server environments. It notes that individual vendor solutions often only address specific proprietary platforms. An optimal solution is a cross-platform approach using intelligent converged backup that applies appropriate data protection services based on each data set's criticality. The document then introduces Storage Director by Tributary Systems as a policy-based data management solution that connects any host to any storage technology and applies services to data based on business importance. Storage Director allows for data backup consolidation and virtualization across heterogeneous environments.
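The policy-based routing idea in the summary above can be sketched as a simple policy table keyed by business criticality. This is only an illustration of the concept; the tier names and policy values are assumptions, not Storage Director's actual configuration.

```python
# Policy-based data protection: each data set is routed to a backup
# target, with a copy count, according to its business criticality.

POLICY = {
    "critical":  {"target": "replicated-disk",  "copies": 3},
    "important": {"target": "dedupe-appliance", "copies": 2},
    "archive":   {"target": "tape-or-cloud",    "copies": 1},
}

def route_backup(dataset: str, criticality: str) -> dict:
    """Pick a backup target and copy count from the policy table.

    Unknown criticality levels fall back to the cheapest tier.
    """
    policy = POLICY.get(criticality, POLICY["archive"])
    return {"dataset": dataset, **policy}

plan = route_backup("core-banking-db", "critical")
```

The value of this approach in a mixed server environment is that the policy, not the platform, decides where data goes: any host's data can be matched to any storage technology by its importance alone.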
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor..., by Neo4j
Leonard Jayamohan, Partner & Generative AI Lead, Deloitte
This keynote will reveal how Deloitte leverages Neo4j’s graph power for groundbreaking digital twin solutions, achieving a staggering 100x performance boost. Discover the essential role knowledge graphs play in successful generative AI implementations. Plus, get an exclusive look at an innovative Neo4j + Generative AI solution Deloitte is developing in-house.
Five Different Use Cases: Uncover five real-world use cases where Denodo's data virtualization platform has made a significant impact. From data governance to analytics, Denodo proves its versatility across a variety of domains.
- Logical Data Fabric
- Self Service Analytics
- Data Governance
- 360 degree of Entities
- Hybrid/Multi-Cloud Integration
Watch this illuminating session to gain insights into the transformative capabilities of the Denodo Platform.
All business sizes can benefit from better use of their data to gain insights, how the cloud can help overcome common data challenges and accelerate transformation with the cloud technology
https://www.rapyder.com/cloud-data-analytics-services/
A data-driven organization has an edge over its competitors. Information is sourced from endless sources with the growing popularity of factors, such as mobility, IoT, and cloud computing. As such, it has become quite a challenging task to manage various data types stored across a variety of repositories.
Today, Enterprise Data Fabric has come up as a crucial tactic for sharing of diverse, distributed, and dynamic records and frictionless access. The strategy provides a robust solution for high-cost and low-value integration cycles. It also meets the increasing demand for information sharing on real-time.
Three reasons why data virtualization is poised to play a key role in data management:
1) Data management challenges are increasing due to needs for quick response times, large and diverse data sources like social media and sensors, and many data management tools.
2) Data virtualization can address these challenges by providing a unified, secure access layer and delivering data as a service to meet business needs.
3) Data virtualization allows for a hybrid data storage model with data stored in both data warehouses and cheaper storage like Hadoop, and provides a common way to access both through its virtualization layer.
Enabling a Data Mesh Architecture with Data VirtualizationDenodo
Watch full webinar here: https://bit.ly/3rwWhyv
The Data Mesh architectural design was first proposed in 2019 by Zhamak Dehghani, principal technology consultant at Thoughtworks, a technology company that is closely associated with the development of distributed agile methodology. A data mesh is a distributed, de-centralized data infrastructure in which multiple autonomous domains manage and expose their own data, called “data products,” to the rest of the organization.
Organizations leverage data mesh architecture when they experience shortcomings in highly centralized architectures, such as the lack domain-specific expertise in data teams, the inflexibility of centralized data repositories in meeting the specific needs of different departments within large organizations, and the slow nature of centralized data infrastructures in provisioning data and responding to changes.
In this session, Pablo Alvarez, Global Director of Product Management at Denodo, explains how data virtualization is your best bet for implementing an effective data mesh architecture.
You will learn:
- How data mesh architecture not only enables better performance and agility, but also self-service data access
- The requirements for “data products” in the data mesh world, and how data virtualization supports them
- How data virtualization enables domains in a data mesh to be truly autonomous
- Why a data lake is not automatically a data mesh
- How to implement a simple, functional data mesh architecture using data virtualization
Organizations today face massive data growth and must choose between dedicated storage systems or cloud-based storage. There are pros and cons to each. Dedicated storage offers more control over data but requires infrastructure investment, while cloud storage provides scalability and flexibility at a lower cost but with less control. The best choice depends on an organization's unique needs, such as data security, compliance requirements, workload performance needs, and cost factors. The document provides details on how different data types and importance levels may be best suited for different storage technologies.
Govern and Protect Your End User InformationDenodo
Watch this Fast Data Strategy session with speakers Clinton Cohagan, Chief Enterprise Data Architect, Lawrence Livermore National Lab & Nageswar Cherukupalli, Vice President & Group Manager, Infosys here: https://buff.ly/2k8f8M5
In its recent report “Predictions 2018: A year of reckoning”, Forrester predicts that 80% of firms affected by GDPR will not comply with the regulation by May 2018. Of those noncompliant firms, 50% will intentionally not comply.
Compliance doesn’t have to be this difficult! What if you have an opportunity to facilitate compliance with a mature technology and significant cost reduction? Data virtualization is a mature, cost-effective technology that enables privacy by design to facilitate compliance.
Attend this session to learn:
• How data virtualization provides a compliance foundation with data catalog, auditing, and data security.
• How you can enable single enterprise-wide data access layer with guardrails.
• Why data virtualization is a must-have capability for compliance use cases.
• How Denodo’s customers have facilitated compliance.
The document discusses the need for converged backup solutions that can simplify and consolidate data protection across mixed server environments. It notes that individual vendor solutions often only address specific proprietary platforms. An optimal solution is a cross-platform approach using intelligent converged backup that applies appropriate data protection services based on each data set's criticality. The document then introduces Storage Director by Tributary Systems as a policy-based data management solution that connects any host to any storage technology and applies services to data based on business importance. Storage Director allows for data backup consolidation and virtualization across heterogeneous environments.
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...Neo4j
Leonard Jayamohan, Partner & Generative AI Lead, Deloitte
This keynote will reveal how Deloitte leverages Neo4j’s graph power for groundbreaking digital twin solutions, achieving a staggering 100x performance boost. Discover the essential role knowledge graphs play in successful generative AI implementations. Plus, get an exclusive look at an innovative Neo4j + Generative AI solution Deloitte is developing in-house.
TrustArc Webinar - 2024 Global Privacy SurveyTrustArc
How does your privacy program stack up against your peers? What challenges are privacy teams tackling and prioritizing in 2024?
In the fifth annual Global Privacy Benchmarks Survey, we asked over 1,800 global privacy professionals and business executives to share their perspectives on the current state of privacy inside and outside of their organizations. This year’s report focused on emerging areas of importance for privacy and compliance professionals, including considerations and implications of Artificial Intelligence (AI) technologies, building brand trust, and different approaches for achieving higher privacy competence scores.
See how organizational priorities and strategic approaches to data security and privacy are evolving around the globe.
This webinar will review:
- The top 10 privacy insights from the fifth annual Global Privacy Benchmarks Survey
- The top challenges for privacy leaders, practitioners, and organizations in 2024
- Key themes to consider in developing and maintaining your privacy program
“An Outlook of the Ongoing and Future Relationship between Blockchain Technologies and Process-aware Information Systems.” Invited talk at the joint workshop on Blockchain for Information Systems (BC4IS) and Blockchain for Trusted Data Sharing (B4TDS), co-located with with the 36th International Conference on Advanced Information Systems Engineering (CAiSE), 3 June 2024, Limassol, Cyprus.
Communications Mining Series - Zero to Hero - Session 1DianaGray10
This session provides introduction to UiPath Communication Mining, importance and platform overview. You will acquire a good understand of the phases in Communication Mining as we go over the platform with you. Topics covered:
• Communication Mining Overview
• Why is it important?
• How can it help today’s business and the benefits
• Phases in Communication Mining
• Demo on Platform overview
• Q/A
How to Get CNIC Information System with Paksim Ga.pptxdanishmna97
Pakdata Cf is a groundbreaking system designed to streamline and facilitate access to CNIC information. This innovative platform leverages advanced technology to provide users with efficient and secure access to their CNIC details.
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...James Anderson
Effective Application Security in Software Delivery lifecycle using Deployment Firewall and DBOM
The modern software delivery process (or the CI/CD process) includes many tools, distributed teams, open-source code, and cloud platforms. Constant focus on speed to release software to market, along with the traditional slow and manual security checks has caused gaps in continuous security as an important piece in the software supply chain. Today organizations feel more susceptible to external and internal cyber threats due to the vast attack surface in their applications supply chain and the lack of end-to-end governance and risk management.
The software team must secure its software delivery process to avoid vulnerability and security breaches. This needs to be achieved with existing tool chains and without extensive rework of the delivery processes. This talk will present strategies and techniques for providing visibility into the true risk of the existing vulnerabilities, preventing the introduction of security issues in the software, resolving vulnerabilities in production environments quickly, and capturing the deployment bill of materials (DBOM).
Speakers:
Bob Boule
Robert Boule is a technology enthusiast with PASSION for technology and making things work along with a knack for helping others understand how things work. He comes with around 20 years of solution engineering experience in application security, software continuous delivery, and SaaS platforms. He is known for his dynamic presentations in CI/CD and application security integrated in software delivery lifecycle.
Gopinath Rebala
Gopinath Rebala is the CTO of OpsMx, where he has overall responsibility for the machine learning and data processing architectures for Secure Software Delivery. Gopi also has a strong connection with our customers, leading design and architecture for strategic implementations. Gopi is a frequent speaker and well-known leader in continuous delivery and integrating security into software delivery.
Full-RAG: A modern architecture for hyper-personalizationZilliz
Mike Del Balso, CEO & Co-Founder at Tecton, presents "Full RAG," a novel approach to AI recommendation systems, aiming to push beyond the limitations of traditional models through a deep integration of contextual insights and real-time data, leveraging the Retrieval-Augmented Generation architecture. This talk will outline Full RAG's potential to significantly enhance personalization, address engineering challenges such as data management and model training, and introduce data enrichment with reranking as a key solution. Attendees will gain crucial insights into the importance of hyperpersonalization in AI, the capabilities of Full RAG for advanced personalization, and strategies for managing complex data integrations for deploying cutting-edge AI solutions.
Securing your Kubernetes cluster_ a step-by-step guide to success !KatiaHIMEUR1
Today, after several years of existence, an extremely active community and an ultra-dynamic ecosystem, Kubernetes has established itself as the de facto standard in container orchestration. Thanks to a wide range of managed services, it has never been so easy to set up a ready-to-use Kubernetes cluster.
However, this ease of use means that the subject of security in Kubernetes is often left for later, or even neglected. This exposes companies to significant risks.
In this talk, I'll show you step-by-step how to secure your Kubernetes cluster for greater peace of mind and reliability.
Essentials of Automations: The Art of Triggers and Actions in FMESafe Software
In this second installment of our Essentials of Automations webinar series, we’ll explore the landscape of triggers and actions, guiding you through the nuances of authoring and adapting workspaces for seamless automations. Gain an understanding of the full spectrum of triggers and actions available in FME, empowering you to enhance your workspaces for efficient automation.
We’ll kick things off by showcasing the most commonly used event-based triggers, introducing you to various automation workflows like manual triggers, schedules, directory watchers, and more. Plus, see how these elements play out in real scenarios.
Whether you’re tweaking your current setup or building from the ground up, this session will arm you with the tools and insights needed to transform your FME usage into a powerhouse of productivity. Join us to discover effective strategies that simplify complex processes, enhancing your productivity and transforming your data management practices with FME. Let’s turn complexity into clarity and make your workspaces work wonders!
Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!SOFTTECHHUB
As the digital landscape continually evolves, operating systems play a critical role in shaping user experiences and productivity. The launch of Nitrux Linux 3.5.0 marks a significant milestone, offering a robust alternative to traditional systems such as Windows 11. This article delves into the essence of Nitrux Linux 3.5.0, exploring its unique features, advantages, and how it stands as a compelling choice for both casual users and tech enthusiasts.
Building RAG with self-deployed Milvus vector database and Snowpark Container...Zilliz
This talk will give hands-on advice on building RAG applications with an open-source Milvus database deployed as a docker container. We will also introduce the integration of Milvus with Snowpark Container Services.
Threats to mobile devices are more prevalent and increasing in scope and complexity. Users of mobile devices desire to take full advantage of the features
available on those devices, but many of the features provide convenience and capability but sacrifice security. This best practices guide outlines steps the users can take to better protect personal devices and information.
Introducing Milvus Lite: Easy-to-Install, Easy-to-Use vector database for you...Zilliz
Join us to introduce Milvus Lite, a vector database that can run on notebooks and laptops, share the same API with Milvus, and integrate with every popular GenAI framework. This webinar is perfect for developers seeking easy-to-use, well-integrated vector databases for their GenAI apps.
Cosa hanno in comune un mattoncino Lego e la backdoor XZ?Speck&Tech
ABSTRACT: A prima vista, un mattoncino Lego e la backdoor XZ potrebbero avere in comune il fatto di essere entrambi blocchi di costruzione, o dipendenze di progetti creativi e software. La realtà è che un mattoncino Lego e il caso della backdoor XZ hanno molto di più di tutto ciò in comune.
Partecipate alla presentazione per immergervi in una storia di interoperabilità, standard e formati aperti, per poi discutere del ruolo importante che i contributori hanno in una comunità open source sostenibile.
BIO: Sostenitrice del software libero e dei formati standard e aperti. È stata un membro attivo dei progetti Fedora e openSUSE e ha co-fondato l'Associazione LibreItalia dove è stata coinvolta in diversi eventi, migrazioni e formazione relativi a LibreOffice. In precedenza ha lavorato a migrazioni e corsi di formazione su LibreOffice per diverse amministrazioni pubbliche e privati. Da gennaio 2020 lavora in SUSE come Software Release Engineer per Uyuni e SUSE Manager e quando non segue la sua passione per i computer e per Geeko coltiva la sua curiosità per l'astronomia (da cui deriva il suo nickname deneb_alpha).
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...SOFTTECHHUB
The choice of an operating system plays a pivotal role in shaping our computing experience. For decades, Microsoft's Windows has dominated the market, offering a familiar and widely adopted platform for personal and professional use. However, as technological advancements continue to push the boundaries of innovation, alternative operating systems have emerged, challenging the status quo and offering users a fresh perspective on computing.
One such alternative that has garnered significant attention and acclaim is Nitrux Linux 3.5.0, a sleek, powerful, and user-friendly Linux distribution that promises to redefine the way we interact with our devices. With its focus on performance, security, and customization, Nitrux Linux presents a compelling case for those seeking to break free from the constraints of proprietary software and embrace the freedom and flexibility of open-source computing.
much in the same way as DevOps. Data ownership and responsibility
fall to these domains. They become the foundations of a mesh, resulting
in a domain‑driven distributed architecture.
Data mesh also requires a shift in organizational culture. For many
organizations this means a move from centralized decision‑making
around governance to a federated model built, for example, on
cross‑organizational trust.
Reconsidering how data is distributed
Enterprises are increasingly pursuing data democratization – making
trusted, quality data available to everyone in the organization for smart
decision‑making and, at the same time, increasing productivity and
efficiencies to achieve business outcomes rapidly. Data mesh delivers
this using several principles:
■ Rethinking data as a product
■ Leveraging a domain‑oriented self‑service design
■ Supporting distributed domain‑specific data consumers
A data‑driven world requires
cultural and technical change
Enterprises understand the importance of being a
data‑driven organization. The benefits of intelligence
harvested from big data include hyper‑personalization,
smart decision making, new business opportunities
and faster innovation. But it isn’t as easy as it sounds.
Many enterprises have invested in data platforms, especially large, centralized
data lakes, to achieve the data‑driven dream. Many, however, have been
disappointed by the results. Data lakes don’t scale well to meet changing
organizational and process requirements. In addition, there is often a lack
of alignment between the data lake creators and business teams, making
it difficult to get any tangible value. Data lakes also hold data in a host of
formats, which makes it a colossal task to make the data available for use
while keeping its quality at the necessary level.
It is also important to note that becoming a data‑driven organization requires
cultural change in addition to technological implementation. Shortcomings
in organizational culture have been the main stumbling blocks to being
successful in the digital age.
Data mesh: the next data platform
Data mesh promises to help solve these issues. Instead of one large body
of data, it deconstructs the data estate into distributed services built
around business nodes, or domains, and their capabilities. Because there
is no centralized data function,
data mesh supports decentralized ownership of domain‑related data.
Teams operate independently and autonomously as cross‑functional units,
Introduction Approach Architecture Governance Platform Models Results Why Orange
Rethinking data as a product
Rethinking data as a product is about changing
organizational, architectural, and technological
concepts to get the most out of data, data teams
and data consumers.
Often data is seen as an asset – something valuable an organization
or part of an organization is not willing to part with. However, rethinking
data as a product creates more value by enabling data sharing and
data democratization. Often this approach makes a cultural change
necessary. In the data mesh approach, product teams own, control,
and are accountable for the data they create and share.
Data mesh creates an ecosystem of data products, as opposed to a
large, centralized data lake. The teams responsible for the data include
the producers (data scientists, engineers, and business analysts), while
other users are seen as the customers for the data. With this cross‑functional
composition, teams include business and domain knowledge along with
engineering expertise to realize these data products.
Self‑service approach
For teams to autonomously work and take ownership of their data
products, they require a simple and efficient way of managing the
lifecycle of data and its provisioning. This is where self‑service
infrastructure as a platform comes in. It supports domain autonomy
and allows teams to create and disseminate valuable data by providing
dedicated and highly standardized domain environments. These are
ultimately the nodes of the data mesh. Again, this helps the domain’s
data ownership by underpinning it with secure and governed access.
Self‑service simplifies data access, breaks down silos, and enables the
scaled‑up sharing of live data. The infrastructure as a platform provides
dedicated, standardized domain environments with all necessary
components (such as storage or compute resources) a domain needs
to implement their use case. This ensures that domains can focus on
their business problem without having to manage and maintain the
underlying infrastructure. Domains are tasked with collecting, managing, and curating
data so that business intelligence applications can use it, for example.
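As an illustration, the provisioning flow described above can be sketched as a small service that stamps out a standardized domain environment from a shared template. The template fields and the `provision_domain` function are hypothetical, not part of any specific platform:

```python
from dataclasses import dataclass, field

# A hypothetical, standardized template for a domain environment.
# Every domain gets the same baseline components, so teams can focus
# on their use case instead of managing infrastructure themselves.
@dataclass
class DomainEnvironment:
    domain: str
    storage_buckets: list = field(default_factory=lambda: ["raw", "curated"])
    compute_profile: str = "standard-batch"
    streaming_enabled: bool = False

def provision_domain(name: str, streaming: bool = False) -> DomainEnvironment:
    """Create a pre-configured environment for a new domain (sketch only).

    On a real platform this call would trigger automated infrastructure
    deployments; here it just returns the declarative description.
    """
    return DomainEnvironment(domain=name, streaming_enabled=streaming)

sales = provision_domain("sales", streaming=True)
print(sales.domain, sales.storage_buckets, sales.streaming_enabled)
```

Because every environment is generated from the same template, domains stay highly standardized while still being provisioned on demand.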
Advantages of virtualization
Separating and abstracting the software from the underlying hardware
creates many possibilities, but two are especially important.
The first is that commodity equipment using more open technology can
replace the expensive proprietary hardware that these products used
in the past. The software then runs on that hardware in a virtualized
environment. That makes the costly, inflexible hardware component
cheaper and easier to support, leaving the real value in the software.
The second possibility is the management of that software.
Administrators can control that virtualized software centrally from
a dashboard, bringing the same centralized configuration and control
capabilities to the entire infrastructure. This also makes it a lot simpler
to enable automation and orchestration capabilities with proven IT
efficiency and cost optimization benefits.
Interoperability, standardization, and
governance
Maintaining data standards is imperative to data quality and trust.
Every domain provides standardized interfaces to access its data,
which allows effective collaboration. The data output created can be
helpful to more than one domain. Interoperability, standardization, and
governance allows for efficient cross‑domain collaboration at all levels,
providing more significant innovation potential. It also allows trusted
data to be offered as products across the enterprise.
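To make the idea of standardized interfaces concrete, the sketch below shows a minimal, hypothetical descriptor that every data product could publish so consumers across domains interact with data in a uniform way. The field names and the `is_interoperable` check are illustrative, not a prescribed standard:

```python
from dataclasses import dataclass

# A minimal, hypothetical contract that every data product exposes so
# that cross-domain consumers can rely on the same fields everywhere.
@dataclass
class DataProductDescriptor:
    domain: str          # owning domain
    name: str            # product name, unique within the domain
    schema_version: str  # version of the published schema
    endpoint: str        # standardized access endpoint

def is_interoperable(p: DataProductDescriptor) -> bool:
    """Check that the descriptor carries every field consumers depend on."""
    return all([p.domain, p.name, p.schema_version, p.endpoint])

orders = DataProductDescriptor("sales", "orders", "1.2.0",
                               "https://mesh.example.com/sales/orders")
print(is_interoperable(orders))
```

A descriptor like this is what lets trusted data be offered as a product: any team can discover the endpoint and schema version without bilateral coordination.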
It is essential to understand that this model requires both energy and
commitment. With a decentralized model, it is crucial to establish
governance and common standards that ensure data products are
trusted and interoperable going forward. Moving from a monolith
to a microservices model requires cultural change. It involves
reorganization and changes in how teams and users work together.
Without these changes, you can end up with a fragmented data
system that doesn’t work.
The effort, however, is worth it. The deconstructed data model brings
with it greater business agility, scalability, and accelerated time to
market. It also eliminates process complexities.
Data mesh in depth
Data mesh is an architectural paradigm that
opens up analytical data at scale. It provides an
organizational view of how to structure data, data
platforms, and decentralized teams.
Instead of having a central data lake and central engineering teams, a
data mesh consists of many data nodes (domains) that interact with each
other, but operate independently. It describes a distributed domain‑driven
and self‑service platform approach where data is treated as a product.
The concept is built around domain‑driven data decomposition, where
domains have full ownership of their data. A team includes both deep
business knowledge, such as product managers or domain experts,
and technical expertise, such as data engineers and data scientists.
They are responsible for managing a domain together. This enables the
team to consume, process, and serve data that closely matches the
consumer requirements.
Domains are no longer dependent on engineering teams implementing
their requirements. Instead, they can produce and consume data sets
by themselves, while loosely coupled to other instances within the
organization by following governance standards. In addition, domains can
benefit from each other by consuming the data sets created as required.
Taking responsibility for business cases
Each domain is responsible for domain‑related use cases or is involved in
solving a specific business problem. This approach is designed to ensure
high data quality as the data processing will be done by the team that has
the most knowledge about a use case. In contrast, data lakes, built
on a centralized approach, often exhibit issues because:
1. The producers have the capabilities but are not motivated to fulfill
requirements as their output doesn’t relate to any particular use case.
2. The consumers are motivated but are dependent on the output
of the centralized engineering team for data and data quality.
3. The engineering team is responsible for every implementation
but has no specific domain knowledge.
In a data mesh, these data and data quality drivers are all placed within
each domain.
Data mesh is a decentralized data platform that makes it easier for
organizations to create new use cases and enables faster delivery of
new features. This is made possible because it allows domain teams
to act independently by utilizing the self‑service platforms, with a better
understanding of the use case requirements.
Data governance and infrastructure
In addition, data mesh has central components for data governance and
infrastructure. Both components act as self‑service platforms to efficiently
support product owner workflows and eradicate friction when connecting
to different parts of the infrastructure.
The governance component ensures that domain data is consumable
across the organization. The data catalog holding and providing meta
information about each domain will also need to be publicly accessible.
Global standards are key to ensure interoperability between domains.
Domain API specifications, schemas, members, permissions, and so forth
will need to be provided in a standardized format.
Furthermore, domains can deploy a standard set of compute and storage
resources on a self‑service basis from a central infrastructure component.
This reduces engineering overheads, while allowing the domain to focus on
the actual data processing. The standard set of resources should enable
domains to implement batch or streaming use cases and connect internal or
external data sources. Note that a very high degree of automation is required
here to create pre‑configured and ready‑to‑use domain environments.
Both self‑service components combined enable domains to act
independently and thus more efficiently. In addition, the generation
of new domains and use cases should also be frictionless.
A governance framework
In the context of treating data as a product,
the governance of data in a decentralized data
architecture is crucial. The key focus areas of
data governance include availability, usability,
consistency, data integrity, and data security.
The fact that a data mesh is a distributed domain‑driven architecture and has
a self‑service platform design makes the data governance even more critical.
We have designed a data governance layer that provides the functionalities
needed for all key focus areas of governing data mentioned before, including a
data catalog backend (data discovery API) and a service (domain information
service) covering the whole domain schema evolution and lifecycle. This
includes five steps as follows:
1. Registration
Whenever a domain joins the data mesh, it needs to be initially
registered. In this registration process, all basic information about the
domain is stored: data format, data schema, processed data in terms
of data lineage, processing steps, and other data quality indicators
provided by the domain.
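The registration record described above can be sketched as a simple catalog entry. The field names and the in-memory `catalog` are hypothetical stand-ins for the real data catalog backend:

```python
# Hypothetical sketch of the record a domain submits when it joins the
# mesh, mirroring the information listed above: data format, schema,
# lineage, processing steps, and quality indicators.
catalog = {}

def register_domain(name, data_format, schema, lineage, steps, quality):
    """Store a domain's basic information in the (in-memory) data catalog."""
    catalog[name] = {
        "data_format": data_format,
        "data_schema": schema,
        "data_lineage": lineage,
        "processing_steps": steps,
        "quality_indicators": quality,
    }

register_domain(
    "sales",
    data_format="parquet",
    schema={"order_id": "string", "amount": "decimal"},
    lineage=["crm.orders"],
    steps=["deduplicate", "currency-normalize"],
    quality={"completeness": 0.99},
)
print("sales" in catalog)
```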
2. Domain schema
A domain can change over time regarding the data format or
schema, its behavior, or the internally used data sources. This means
that domain information in the data catalog will need to be kept up
to date. For this reason, each domain must provide an endpoint
where this information can be retrieved (domain schema API). In our
architectural design, the domain information service (DIS) pulls this
data by using the provisioned endpoint of a specific domain
to store schema evolution information, etc. The DIS implements the
logic for registering, updating, and querying schemas within the data
catalog. It represents the service layer for the data discovery API and
executes every request to the data catalog.
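The pull-based sync described above can be sketched minimally: the DIS fetches each domain's schema endpoint and records a new version in the catalog only when the schema has changed. `fetch_schema` stands in for an HTTP GET against the domain schema API; all names here are assumptions.

```python
# Minimal sketch of DIS schema synchronization: pull the current schema
# from a domain's endpoint and keep the full evolution history.

def sync_domain_schema(catalog: dict, domain: str, fetch_schema) -> bool:
    """Return True if a new schema version was stored for the domain."""
    current = fetch_schema(domain)
    versions = catalog.setdefault(domain, [])
    if versions and versions[-1] == current:
        return False            # unchanged: nothing new to record
    versions.append(current)    # append a new version to the history
    return True

catalog = {}
sync_domain_schema(catalog, "orders", lambda d: {"fields": ["id", "total"]})
sync_domain_schema(catalog, "orders", lambda d: {"fields": ["id", "total"]})
sync_domain_schema(catalog, "orders",
                   lambda d: {"fields": ["id", "total", "currency"]})
```

After these three pulls the catalog holds two schema versions for `orders`: the second pull saw no change and recorded nothing.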
3. Data discovery
Domain data is made discoverable by the data discovery API,
which makes use of the DIS. The data discovery API is technically
a backend that exposes and manages the DIS endpoints and all
domain APIs. Here, domain APIs access is controlled and restricted
to secure its data from unauthorized access.
The addressability of the data products can be achieved by
following global standards for access to data via endpoints and
data schema descriptions.
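A discovery lookup with access control, as described above, might look like the following sketch. The registry layout and caller names are assumptions for illustration, not the actual data discovery API.

```python
# Illustrative discovery lookup: resolve a domain to its endpoint, but
# only for callers authorized to access that domain's data.

REGISTRY = {
    "orders": {
        "endpoint": "https://orders.example.com/data",
        "allowed": {"billing", "analytics"},
    },
}

def discover(domain: str, caller: str) -> str:
    entry = REGISTRY.get(domain)
    if entry is None:
        raise KeyError(f"unknown domain: {domain}")
    if caller not in entry["allowed"]:
        raise PermissionError(f"{caller} may not access {domain}")
    return entry["endpoint"]
```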
4. Interoperability
Interoperability and standardization of communication are fundamental
pillars for building distributed systems. It is vital to establish
global standards for data fields and for the metadata of the domain
data, such as data provenance and data lineage.
This increases the quality of the service level objective around the
truthfulness of the data. This information is governed globally by
the DIS and stored in the data catalog.
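Enforcing such a global metadata standard can be as simple as checking that every domain supplies the agreed fields before its entry is stored in the catalog. The required field set below is an illustrative assumption.

```python
# Sketch of enforcing a global metadata standard: reject domain metadata
# that is missing any of the globally agreed fields.

REQUIRED_METADATA = {"provenance", "lineage", "owner"}

def validate_metadata(metadata: dict) -> list:
    """Return the sorted list of missing standard fields (empty = valid)."""
    return sorted(REQUIRED_METADATA - metadata.keys())

missing = validate_metadata({"provenance": "crm"})  # lineage and owner absent
```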
5. Security
Secure access to the data mesh and its individual domains is a
mandatory standard in every data architecture. In our architectural
design, we assume that the data is accessible via REST endpoints.
The access to these endpoints is managed and secured by using
API management services.
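The gateway-side check performed by such an API management layer can be sketched as follows. The static token set is a deliberate simplification; real API management services handle issuance, expiry, and scopes.

```python
# Hedged sketch of gateway authorization: validate a bearer token before
# forwarding a request to a domain's REST endpoint.

VALID_TOKENS = {"token-billing", "token-analytics"}

def authorize(headers: dict) -> bool:
    """True when the request carries a known bearer token."""
    auth = headers.get("Authorization", "")
    return (auth.startswith("Bearer ")
            and auth.removeprefix("Bearer ") in VALID_TOKENS)
```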
Rapid deployment with
infrastructure‑as‑a‑platform
The primary purpose of the infrastructure platform
is to enable domains to immediately start working
on their use cases by utilizing predefined automated
infrastructure deployments.
The infrastructure platform consists of two components. One is the
provisioning service handling requests for new domain environments.
The other is code repositories containing all the automation code (IaC) for
core components and domain environments, and several CI/CD pipelines
for automated deployments. The IaC code for domain environments is
designed to be suitable for every domain.
As a self‑service platform, the provisioning service can be used by domains
to request new environments. The following resource types should be
automatically deployed into a domain’s target environment:
■ Secret/key management tool
■ Workflow management tools
■ Compute resources for large‑scale data processing (such as Spark)
■ Runnable container/serverless application code
■ Persistence backend, such as blob storage, SQL, or NoSQL solutions
■ Stream processing tools
■ Monitoring and alerting solutions
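The provisioning flow above can be sketched as a request that expands into the standard resource set, which the CI/CD pipelines would then deploy from the shared IaC code. Resource names here are illustrative assumptions mirroring the bullet list.

```python
# Sketch of the self-service provisioning request: every domain receives
# the same standard environment, built from the shared IaC templates.

STANDARD_RESOURCES = [
    "secret-management",       # secret/key management tool
    "workflow-orchestrator",   # workflow management
    "spark-cluster",           # large-scale data processing
    "container-runtime",       # runnable container/serverless code
    "blob-storage",            # persistence backend
    "stream-processor",        # stream processing
    "monitoring",              # monitoring and alerting
]

def provision_environment(domain: str, environment: str) -> dict:
    """Build a deployment plan for the CI/CD pipelines to execute."""
    return {
        "domain": domain,
        "environment": environment,
        "resources": list(STANDARD_RESOURCES),  # same set for every domain
    }

plan = provision_environment("customer-orders", "dev")
```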
Where domains fit in
A domain is responsible for the data of a clearly definable problem area.
In doing so, it consumes data from one or more other domains or external
sources, processes it, and provides output data, which in turn can be
consumed by other domains. The domain offers the schema for its output data. One
team should be in charge of both a domain and its data quality.
Furthermore, a well‑defined toolset belongs to every domain, with which
the domain can carry out all necessary work steps. This toolset
is provided by the infrastructure platform and can be requested by every
domain independently.
To avoid every domain data set being accessed differently, an abstraction
of the backend technology will be applied. Therefore, every domain must
implement a REST API – the domain API – to provide the requested data.
Once implemented, domains can register their API in the data discovery
API. The registration triggers a process that:
1. Stores the domain schema within the data catalog
2. Exposes the domain’s API endpoints centrally
in the data discovery API
This makes the domain schema discoverable and accessible to other
domains.
Other domains must utilize the API in question – otherwise, direct access
to the domain data will not be possible. This makes the backend technology
interchangeable and irrelevant to other domains. The domain API
provides endpoints to retrieve the data (e.g. by date) and the current
schema. Create, update, and delete endpoints need not be provided.
Cases where the schema changes over time are handled by the DIS,
which queries the schema regularly and updates the data catalog.
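The read-only contract described above can be sketched with a minimal dispatcher: one route for retrieving data (optionally filtered by date), one for the current schema, and deliberately no write routes. Paths, parameters, and sample data are assumptions for illustration.

```python
# Minimal read-only domain API sketch: data retrieval (filterable by
# date) plus a schema endpoint; create/update/delete routes are absent.

DATA = [
    {"date": "2024-01-01", "order_id": 1},
    {"date": "2024-01-02", "order_id": 2},
]
SCHEMA = {"fields": ["date", "order_id"]}

def handle(path: str, params: dict) -> object:
    if path == "/schema":
        return SCHEMA
    if path == "/data":
        date = params.get("date")
        return [row for row in DATA if date is None or row["date"] == date]
    raise KeyError(f"no such endpoint: {path}")  # no write endpoints
```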
Open vs strict model
Two different approaches to managing domains are
possible in a data mesh: the open and strict models.
In brief, the open model gives domain teams as much freedom as possible.
The strict model supports domain teams in highly‑regulated environments
that cannot be changed. Both approaches have pros and cons, and of
course, hybrid solutions are feasible too. We discuss them in depth below.
Open model
In the open model, domains have no limitations in choosing their tools for
data processing and storing. In addition to the standard toolset deployed
by the infrastructure platform, further resources of every type can be added
by the team by customizing the infrastructure code.
But more importantly, storing and publishing output data is fully managed
by the domain itself. There is no central instance for storing the output
data in a predefined backend technology. Instead, the domain can decide
to use a blob container, SQL database, document store, etc. They can
choose between structure and naming conventions and only need to
make sure to expose their domain API. They have full ownership and
responsibility to ensure consistency between the exposed API and the
actual implementation.
This approach requires reliable and responsible domain teams to avoid
inconsistencies and poor data quality. It gives domains more freedom and
reduces implementation and automation effort within the platform team,
but it only works in organizations with senior‑level domain teams. Because
of this flexibility, it is suited to business users adopting big data and DataOps.
Strict model
The strict model predefines the whole domain environment without any
possibility of changing it. Domains have no access to their infrastructure
code, so they must stick to the standard set of resources. Furthermore,
their persistence layer is under central management.
With this approach, there is an area for each domain with strict
regulations and policies on where and how to store the data. Also, their
exposed domain API is regulated and controlled by a central validation
process. This ensures that domain API and implementation will always
be congruent.
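One way to picture the central validation step is a check that the fields a domain API actually returns match the schema registered in the catalog. This is a sketch under assumed names, not Orange's validation process.

```python
# Sketch of central API/schema congruence validation in the strict model:
# compare a sample record from the domain API with the registered schema.

def api_matches_schema(sample_record: dict, registered_schema: dict) -> bool:
    """True when the API's output fields equal the registered schema fields."""
    return set(sample_record.keys()) == set(registered_schema["fields"])
```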
This strict model requires a lot of implementation and automation effort
within the platform team and presupposes a very sophisticated data
mesh platform. On the other hand, it ensures high data quality and
consistency by design. This highly developed model is targeted towards
research institutions and advanced big data and analytics users.
Going down the data mesh route
Data is increasingly distributed in all enterprises. Now is a good time
for any enterprise that has moved to the cloud and is deploying
microservices to think data mesh. The concept allows for smaller, more
efficient domain components that enhance the user experience and are
key to a data‑driven organization.
Data mesh results
speak for themselves
Data mesh is a burgeoning paradigm in data
architecture that enables enterprises to take control
of large data and improve business outcomes.
IDC maintains that global data will grow 61% to 175 zettabytes by 2025.1
The collection, integration, and governance of this data to gain valuable
business insight is increasingly complex.
For enterprises that require flexible access to their data to accelerate time
to market, data mesh’s democratized approach to data management
provides an ideal solution. The direct benefits for enterprises adopting
this model include:
■ Establishing global data governance guidelines that encourage teams to
produce and deliver high‑quality data in a standardized and reliable format
■ Eliminating the challenges of data availability, making discoverability
and accessibility easier in a secure and interoperable environment
■ Increasing agility with decentralized data operations and a self‑service
infrastructure
■ Allowing teams to operate in a more agile and independent way
to reduce time‑to‑market and deliver new data products faster