Data Management Trends 2022
5 Key Observations by a practitioner
Shailendra Mruthyunjayappa
https://www.linkedin.com/in/mruthyunjayappa/
Oct 2022

1. The dream of creating a centralized “single source of truth” has remained just that, a dream!

Implementing a stable and structured enterprise data repository,
especially in a dynamic and competitive industry, is an impossibility
due to: (a) divergence in the design approach, data architecture,
ownership, security and usage patterns of the operational and
analytical layers; (b) continuously changing, and therefore
continuously failing, integration between the operational and
analytical layers.

On the one hand, building and maintaining a monolithic data
warehouse is infeasible since there are too many moving parts in
the system. On the other hand, creating a pure data lake requires
retaining huge volumes of raw data files and building further data
layers, making it cumbersome for data users to automate use cases.

New decentralized/distributed/federated concepts such as data
fabric, data mesh and data lakehouse are taking off. They can help:
(a) weave together data from multiple distributed/interconnected
sources; (b) adopt a hybrid cloud-based architecture; (c) avoid
multiple data hops and complex integrations; (d) enable federated
data governance; and (e) stay agile and focus on effective and
efficient data management.

The new generation of data leaders is more practical: idealism is
giving way to pragmatism, and the data management world is moving
from big data concepts to distributed-governed data design.

2. Cloud adoption is growing quickly, but there are a lot of “ifs & buts”

For over a decade, cloud has been touted to offer significant
benefits to enterprises, including cost savings, security, scalability,
redundancy, business continuity and maintainability. These benefits
are not yet fully realized, and many limitations and challenges
remain to be solved.

Current limitations and challenges include: (a) insufficient visibility
into cloud usage and performance, leading to difficulty in budgeting
and justifying the cost-performance tradeoff; (b) readiness/
compatibility of applications on cloud platforms; (c) constraints in
real-time integrations across multiple environments; (d) risks of
data loss/leakage, data theft, cybersecurity breaches, etc.;
(e) compliance with privacy requirements and data governance;
(f) increased external dependency risk.

Some of the challenges listed above are amplified further for
enterprise data management and data-driven applications.

The emergence of hybrid cloud architecture (which gives the ability
to combine applications on public cloud, private cloud and
on-premises), third-party multi-cloud monitoring tools (which make it
possible to optimize and control cloud spending), and cloud data
management platforms (to collect, manage, govern, and use
enterprise data in a hybrid cloud environment) is helping
overcome some of these limitations, but these are yet to mature.

3. Data governance has graduated from theory to practice

Every aspiring data-driven organization understands the importance
of data governance. Most of them have embarked on enterprise data
governance programs, yet for over two decades they have realized
little benefit. The main reason for the struggle has been the
misalignment of objectives between the data governance program
leaders and end users. Data governance programs focused on
hypothetical benefits of centralized structure, control and
compliance, without delivering any value to end users.

A good data governance initiative should be built on user
empowerment and efficiency goals. Data governance should be a
practical framework based on a broad system of rules and
processes for consistent usage of data, with clear accountability for
data ownership, quality, usage and security.
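
One way to picture "clear accountability" is a small governance record kept per dataset. The sketch below is purely illustrative: the `DatasetPolicy` class, its fields, and the `accountability_gaps` helper are hypothetical names invented for this example, not part of any governance tool.

```python
from dataclasses import dataclass, field


@dataclass
class DatasetPolicy:
    """Hypothetical governance record for one dataset."""
    name: str
    owner: str            # accountable line-of-business owner
    classification: str   # security label, e.g. "internal" or "restricted"
    allowed_uses: list[str] = field(default_factory=list)
    quality_rules: list[str] = field(default_factory=list)


def accountability_gaps(policy: DatasetPolicy) -> list[str]:
    """Return the accountability areas that are still undefined."""
    gaps = []
    if not policy.owner.strip():
        gaps.append("ownership")
    if not policy.quality_rules:
        gaps.append("quality")
    if not policy.allowed_uses:
        gaps.append("usage")
    if not policy.classification.strip():
        gaps.append("security")
    return gaps
```

A record with an owner, a classification and allowed uses but no quality rules would report `["quality"]`, giving the data owner a concrete gap to close rather than an abstract compliance target.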

Of late, we are seeing organizations define data governance
objectives that focus on efficiency and facilitation of end use cases,
with emphasis on line-of-business ownership and standardized
usage of data. The introduction of data regulations and the added
complexity of overlaying external data have pushed data governance
programs to be more agile and responsive to user, industry and
regulatory requirements.

Data governance success can be found by embracing data silos and
by adopting tooling for distributed data management and shared
data governance.

4. The potential of AI/ML in data management is yet to be realized

When someone refers to AI/ML, you probably imagine fancy use cases
such as lifting business growth, sophisticated user engagement,
advanced risk assessment and management, optimization of
productivity, etc. While these use cases get the most attention from
stakeholders and deliver noticeable impact to the business, several
potential applications of AI/ML in the data management domain are
overlooked.

AI/ML can be used to automate low-level tasks and augment
complex tasks. Automation of low-level tasks could take the form of
anomaly detection, rectification of recurring data issues, and
monitoring and tuning for data volumes and workload balancing.
Augmentation of complex tasks can include exception management,
master data management, data cataloging, data governance, and data
risks and controls.
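
As a minimal sketch of the anomaly detection use case, the function below flags days whose load volume deviates sharply from the norm. It uses a robust statistical score (median and median absolute deviation) as a simple stand-in for a trained model; the function name, the 3.5 threshold and the sample counts are assumptions made for illustration.

```python
from statistics import median


def volume_anomalies(daily_counts: list[int], threshold: float = 3.5) -> list[int]:
    """Return indexes of days whose row count deviates strongly from the median."""
    med = median(daily_counts)
    abs_dev = [abs(count - med) for count in daily_counts]
    mad = median(abs_dev)
    if mad == 0:
        # At least half the values equal the median: flag anything that differs.
        return [i for i, dev in enumerate(abs_dev) if dev > 0]
    # 0.6745 rescales the MAD so the score is comparable to a z-score.
    return [i for i, dev in enumerate(abs_dev) if 0.6745 * dev / mad > threshold]


# Hypothetical row counts for seven daily loads; the fifth load is truncated.
counts = [1000, 1020, 990, 1010, 120, 1005, 995]
print(volume_anomalies(counts))  # [4]
```

A check like this can run after every load as part of ongoing operations, and be swapped for a learned model once the simple baseline has proven its value on one dataset.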

Several of these AI/ML use cases can be built internally with a small
investment, and the benefits can be realized quickly as part of
ongoing operations. Once a use case is proven on a particular dataset
or workflow, it can be standardized and gradually rolled out
enterprise-wide.

Several new-age technologies, such as data observability
platforms and data curation platforms, have embedded AI/ML
capabilities that can help quickly bridge the gaps in the enterprise
data management framework.

5. DataOps is very real and within reach of enterprises

DevOps, which started as a movement to align the objectives of
software development and operations teams, went mainstream within
the space of a few years and has become the prevalent methodology
for software development and maintenance. The principles of DevOps
and Agile methodologies applied to data management, commonly
referred to as DataOps, are fast catching on within the community.

DataOps is intended to help organizations meet business and
regulatory requirements quickly. DataOps helps reduce the
end-to-end cycle time of analytics and insights by breaking down and
iterating on ideation, design, data sourcing, data preparation,
analysis, and deployment, ensuring that high-quality data and usable
insights are delivered to business users swiftly.
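
One common DataOps practice is an automated quality gate that runs on every iteration of a pipeline, in the same way unit tests run in DevOps. The sketch below is an assumption-laden illustration: `check_batch`, the column names and the sample rows are hypothetical, and a real pipeline would call such a check from its orchestration or CI tooling.

```python
def check_batch(rows: list[dict], required: list[str]) -> list[str]:
    """Return readable failures; an empty list means the batch can be published."""
    failures = []
    if not rows:
        failures.append("batch is empty")
    for i, row in enumerate(rows):
        for col in required:
            # Treat absent keys, None and empty strings as missing values.
            if row.get(col) in (None, ""):
                failures.append(f"row {i}: missing {col}")
    return failures


# Hypothetical batch produced by a data preparation step.
batch = [
    {"id": 1, "amount": 10.0},
    {"id": 2, "amount": None},
]
print(check_batch(batch, ["id", "amount"]))  # ['row 1: missing amount']
```

Failing fast on a small, explicit check like this keeps each iteration short, which is where the cycle-time reduction comes from.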

Whether there is a formalized DataOps setup or not, data analytics
teams have always been under pressure to deliver quick results to
business users. By implementing DataOps, organizations can
create an environment where processing time is reduced from days
to hours and from hours to minutes, and bring focus to the
highest-priority requirements.

DataOps leads to a responsive data analytics team and creates an
edge for the business. The benefits are real and tangible.

For more such insights, follow me:
Shailendra Mruthyunjayappa
https://www.linkedin.com/in/mruthyunjayappa/
