Big Data and NoSQL continue to make headlines everywhere. However, most of what has been written about these topics focuses on the hardware, services, and scale-out. But what about a Big Data and NoSQL strategy, one that supports your business strategy? Virtually every major organization considering these data platforms faces the challenge of figuring out the appropriate approach and requirements. This presentation will provide guidance on how to think about and establish realistic Big Data management plans and expectations. We will introduce a framework for evaluating the various choices when it comes to implementing and succeeding with Big Data/NoSQL and demonstrate a sample use case.
Takeaways:
A framework for evaluating Big Data techniques
Deciding on a Big Data platform – How do you know which one is a good fit for you?
How Big Data techniques can complement existing data management practices
The prototyping nature of practicing Big Data techniques
The distinct ways in which utilizing Big Data can generate business value
DataEd Online: Data Architecture and Data Modeling Differences — Achieving a ... (DATAVERSITY)
<!-- wp:paragraph -->
<p>Many can be confused when it comes to data topics. Architecture, models, data — it can seem a bit overwhelming. This program offers a clear explanation of Data Modeling and Data Architecture, with a focus on the power of their interdependence: each makes the other more useful. Data models are a primary means of achieving a shared understanding of specific data challenges; they sit at the intersection of data assets and the organizational response. As documentation, data models are the currency of data coordination, used to verify integration, and a mandated input to any data systems evolution. Ideally, Data Architecture is the sum of the organizational data models, although coverage is rarely complete. Any discussion of architecture should include the complementary role of engineered data models, and developing these models often incorporates both forward and reverse perspectives. Only by working in a coordinated manner can organizations take steps to better understand what they have and what they need to accomplish by employing Data Modeling and Data Architecture.</p>
<!-- /wp:paragraph -->
<!-- wp:paragraph -->
<p>This program's learning objectives include:</p>
<!-- /wp:paragraph -->
<!-- wp:list -->
<ul><li>Understanding the role played by models</li><li>Incorporating the interrelated concepts of architecture/engineering</li><li>What is taught: forward engineering with a goal of building</li><li>What is also needed: reverse engineering with a goal of understanding</li><li>How increasing coordination requirements increase design simplicity</li></ul>
<!-- /wp:list -->
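The forward/reverse distinction in the objectives above can be sketched concretely. Below is a minimal Python illustration using SQLite and a hypothetical two-table schema (the table and column names are invented for this example): forward engineering builds a database from a designed model, while reverse engineering recovers a model description from an existing database's catalog, with the goal of understanding what is already there.

```python
import sqlite3

# Forward engineering: realize a designed (hypothetical) logical model as DDL.
ddl = """
CREATE TABLE customer (
    customer_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL
);
CREATE TABLE orders (
    order_id    INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customer(customer_id)
);
"""
conn = sqlite3.connect(":memory:")
conn.executescript(ddl)

# Reverse engineering: recover a model description from the live catalog.
def reverse_engineer(conn):
    model = {}
    for (table,) in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ):
        # PRAGMA table_info rows are (cid, name, type, notnull, default, pk)
        model[table] = [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]
    return model

print(reverse_engineer(conn))
```

In practice the recovered model would then be compared against the intended architecture to spot drift between design and implementation.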
DataEd Slides: Expressing Data Improvements as Business Outcomes (DATAVERSITY)
Join us and learn how you can better align your Data Management projects with business objectives to justify funding and gain management approval. Failure to successfully monetize Data Management investments sets up an unfortunate loop of fixing symptoms without addressing the underlying problems. As organizations begin to understand that data practices are the root causes of many business problems, they become more willing to make the required investments. Those investments, however, must be approached correctly. The No. 1 reason that data programs fail to deliver is that they do not set or measure specific objectives that are meaningful to management. While there are opportunities to assist at the project level, data improvements are better leveraged at the organization level. An improvable, dedicated data program can only be achieved by repeated application of data practices in service of specific business objectives. Data improvements are rarely accompanied by an ROI calculation, yet ROIs expressed in terms that board and executive management care deeply about are what ensure a data program's viability. Improving organizational execution of specific data practices must lead directly to specific improvements in organizational KPIs. While organizations may not currently be practiced in this ability, it can be learned. This presentation uses a number of specific examples to calculate the business impact of data improvements. Program learning objectives include:
• Coming to grips with the state of practice
• Understanding the need for a comparable baseline measure
• Seeing application in a number of contexts
DataEd Slides: Data Management + Data Strategy = Interoperability (DATAVERSITY)
Few organizations operate without having to exchange data. (Many do it professionally and well!) The larger the data exchange burden (DEB), the greater the organizational overhead incurred. This death by 1,000 cuts must be factored into each organization's calculations. Unfortunately, most organizations do not know whether their DEB is great or small. A somewhat greater number of organizations have organized Data Management practices. Focusing Data Management efforts on increasing interoperability by decreasing DEB friction is a good place to practice.
Learning Objectives:
• Gaining a good understanding of both important topics
• Understanding that data operates only within an intricate web of specific dependencies, and what this means
• Understanding the state of the practice
• Coordination is key: interdependencies and sequencing are necessary but not sufficient
• Practice makes perfect
DataEd Slides: Exorcising the Seven Deadly Data Sins (DATAVERSITY)
The difficulty of implementing a new data strategy often goes under-appreciated, particularly the multi-faceted procedural challenges that need to be met while doing so. Deficiencies in organizational readiness and core competence represent clearly visible problems faced by data managers, but beyond that there are several cultural and structural barriers common to virtually all organizations that must be eliminated in order to facilitate effective management of data. This webinar will discuss these barriers – the titular “Seven Deadly Data Sins” – and in the process will also:
• Elaborate upon the three critical factors that lead to strategy failure
• Demonstrate a two-stage Data Strategy implementation process
• Explore the sources and rationales behind the “Seven Deadly Data Sins,” and recommend solutions
Data-Ed Online Webinar: Monetizing Data Management (DATAVERSITY)
Many data professionals struggle to demonstrate tangible returns on data management investments. In a webinar designed to appeal to both business and IT attendees, your presenter will describe multiple types of value produced through data-centric development and management practices. One of our examples, the healthcare space, offers the unique opportunity to demonstrate additional types of return on investment or value outcomes, namely returns in the form of lives saved through increased rates of bone marrow donor matches. In addition to metrics around increasing revenues or decreasing costs, i.e., investments that directly impact an organization's financial position, these additional statistics of lives saved can be used to justify data management and quality initiatives.
Takeaways:
Learn to think about data differently, in terms of how it can drive organizational needs. Data is not an IT solution but an information solution.
Take a broad view to ensure data sharing across organizational silos
Start small and go for quick wins: Build momentum and support
The first step toward understanding data assets' impact on your organization is understanding what those assets mean to each other. Metadata – literally, data about data – is a practice area required by good systems development, and yet it is also perhaps the most mislabeled and misunderstood Data Management practice. Understanding metadata and its associated technologies as more than just straightforward technological tools can provide powerful insight into the efficiency of organizational practices and enable you to combine practices into sophisticated techniques supporting larger and more complex business initiatives. Program learning objectives include:
• Understanding how to leverage metadata practices in support of business strategy
• Discussing foundational metadata concepts
• Exploring guiding principles for metadata and lessons learned from its practical use in support of strategy
Data-Ed Online: Approaching Data Quality (DATAVERSITY)
Good data is like good water: best served fresh, and ideally well-filtered. Data Management strategies can produce tremendous procedural improvements and increased profit margins across the board, but only if the data being managed is of high quality. Determining how Data Quality should be engineered provides a useful framework for utilizing Data Quality management effectively in support of business strategy. This, in turn, allows for speedy identification of business problems, the delineation between structural and practice-oriented defects in Data Management, and proactive prevention of future issues. Organizations must realize what it means to utilize Data Quality engineering in support of business strategy. This webinar will illustrate how organizations with chronic business challenges often can trace the root of the problem to poor Data Quality. Showing how Data Quality should be engineered provides a useful framework in which to develop an effective approach. This, in turn, allows organizations to more quickly identify business problems as well as data problems caused by structural issues versus practice-oriented defects and prevent these from re-occurring.
Learning Objectives:
Help you understand foundational Data Quality concepts based on the DAMA Guide to the Data Management Body of Knowledge (DAMA DMBOK), as well as guiding principles, best practices, and steps for improving Data Quality at your organization
Demonstrate how chronic business challenges for organizations are often rooted in poor Data Quality
Share case studies illustrating the hallmarks and benefits of Data Quality success
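As a concrete, entirely hypothetical illustration of the Data Quality engineering the abstract above describes, the sketch below profiles a few invented records against two common quality dimensions, completeness and validity. The field names and the age rule are assumptions for illustration, not material from the webinar.

```python
# Hypothetical record set; the third record carries a validity defect.
records = [
    {"id": 1, "email": "ann@example.com", "age": 34},
    {"id": 2, "email": "", "age": 29},
    {"id": 3, "email": "bob@example.com", "age": -5},
]

def completeness(records, field):
    """Share of records where the field is present and non-empty."""
    filled = sum(1 for r in records if r.get(field))
    return filled / len(records)

def validity(records, field, rule):
    """Share of records whose field value passes the business rule."""
    valid = sum(1 for r in records if rule(r.get(field)))
    return valid / len(records)

email_completeness = completeness(records, "email")
age_validity = validity(records, "age", lambda a: isinstance(a, int) and 0 <= a <= 120)
print(f"email completeness: {email_completeness:.0%}, age validity: {age_validity:.0%}")
```

Tracking such measures over time is what separates one-off cleanup from engineered, preventive Data Quality management.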
DAS Slides: Building a Data Strategy — Practical Steps for Aligning with Busi... (DATAVERSITY)
Developing a Data Strategy for your organization can seem like a daunting task. The opportunity in getting it right can be significant, however, as data drives many of the key initiatives in today’s marketplace from digital transformation, to marketing, to customer centricity, population health, and more. This webinar will help de-mystify data strategy and data architecture and will provide concrete, practical ways to get started.
Data-Ed: Show Me the Money: The Business Value of Data and ROI (Data Blueprint)
This webinar originally aired on Tuesday, December 11, 2012. It is part of Data Blueprint's ongoing webinar series on data management with Dr. Aiken.
Sign up for future sessions at http://www.datablueprint.com/webinar-schedule.
Abstract:
Failure to successfully monetize data management investments sets up an unfortunate loop of fixing symptoms without addressing the underlying problems. As organizations begin to understand poor data management practices as the root causes of many of their problems, they become more willing to make the required investments in our profession. This presentation uses specific examples to illustrate the costs of poor data management. Join us and learn how you can apply similar tactics at your organization to justify funding and gain management approval.
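The cost illustrations the abstract mentions can start as back-of-the-envelope arithmetic. The sketch below is a hypothetical example (every figure is invented for illustration): it compares the annual cost of manually correcting bad records before and after a root-cause fix, then expresses the fix as a first-year ROI.

```python
# All figures hypothetical, for illustration only.
records_per_year = 120_000
error_rate = 0.04              # 4% of records currently need manual correction
cost_per_correction = 25.0     # staff time per bad record, in dollars
remediation_cost = 60_000.0    # one-time cost of fixing the root cause
error_rate_after = 0.005       # residual error rate after the fix

annual_cost_before = records_per_year * error_rate * cost_per_correction
annual_cost_after = records_per_year * error_rate_after * cost_per_correction
annual_savings = annual_cost_before - annual_cost_after

# First-year ROI: net benefit divided by the investment.
roi_year_one = (annual_savings - remediation_cost) / remediation_cost
print(f"annual savings: ${annual_savings:,.0f}, first-year ROI: {roi_year_one:.0%}")
```

The point is not the specific numbers but the habit: expressing a data fix in the financial terms management already uses.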
Slides: How AI Makes Analytics More Human (DATAVERSITY)
People think AI makes analytics less human, replacing human decision making. But the truth is, AI actually makes analytics more human. Augmented analytics are helping organizations finally break through the low levels of adoption and limitations typical of 2nd generation visualization tools.
Most business problems cannot be solved purely by algorithms or machine learning — they require human interaction and perspective. Uniting precedent-based machine learning systems with natural human intuition and curiosity is the foundation of 3rd generation BI and democratizing data across an enterprise.
It is a natural progression to enhance your data ecosystem by deploying a platform with augmented intelligence that works alongside users in the pursuit of surfacing new insights, automating tasks, and supporting natural language interaction. All of these work as accelerators for achieving active intelligence and Data Literacy.
Real-World Data Governance: Metadata to Empower Data Stewards - Introducing t... (DATAVERSITY)
Metadata is the most valuable tool of the Data Steward. Where stewards get their metadata and how they participate in the process of delivering core metadata are issues organizations have been struggling with for years. The Operational Metadata Store, or OMS, may be the answer.
The traditional Operational Data Store (ODS) is a database designed to integrate data from numerous sources, support business operations, and feed that data back into the operational systems. In this Real-World Data Governance webinar, Bob Seiner and a panel of industry pundits hold a lively discussion on the practicality of creating the ODS using metadata as the data, drawing on metadata from a variety of existing sources to operationalize your data stewards.
The session will focus on:
Identifying the most significant metadata for your organization
Identifying existing sources of metadata – known and hidden
Identifying when that metadata will be most useful to your data stewards
Defining a lifecycle that encourages data steward participation
Delivering a model that incorporates all of the above
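The bullet points above can be sketched as a toy OMS. In this minimal Python illustration (source names, asset names, and attributes are all hypothetical), metadata from a known source (the database catalog) and a hidden source (a steward's spreadsheet) is merged into one store, with provenance retained so stewards can see where each fact came from.

```python
from collections import defaultdict

class MetadataStore:
    """Toy Operational Metadata Store: merges metadata from many sources."""

    def __init__(self):
        self._store = defaultdict(dict)  # asset name -> merged metadata

    def ingest(self, source, asset, attributes):
        # Record each attribute together with the source that supplied it,
        # so provenance survives the merge.
        for key, value in attributes.items():
            self._store[asset][key] = {"value": value, "source": source}

    def lookup(self, asset):
        return dict(self._store[asset])

oms = MetadataStore()
# Known source: the database catalog.
oms.ingest("db_catalog", "customer", {"columns": ["customer_id", "name"]})
# Hidden source: a spreadsheet maintained by a steward.
oms.ingest("steward_sheet", "customer", {"owner": "Sales Ops", "pii": True})

print(oms.lookup("customer")["owner"]["value"])  # prints "Sales Ops"
```

A production OMS would add the lifecycle concerns from the list above (refresh schedules, steward sign-off), but the core idea is just this merge-with-provenance.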
In this session we will discuss Data Governance, mainly around that fantastic platform Power BI (but also around on-prem concerns).
How do you avoid dataset hell? What are the best practices for sharing queries? Who is the famous Data Steward, and what is their role in a department or in the whole company? How do you choose the right person?
Keywords: Power Query, Data Management Gateway, Power BI Admin Center, Data Stewardship, SharePoint 2013, eDiscovery
Level 200
Data-Ed Online Webinar: Data Architecture Requirements (DATAVERSITY)
Data architecture is foundational to an information-based operational environment: it organizes data assets so they can be used in your business strategy to create real business value. Even so, data architectures are still used ineffectively. Data architecture can be used to inform, clarify, understand, and resolve aspects of a variety of business problems commonly encountered in organizations. Rather than showing how to architect data, your presenter, Dr. Peter Aiken, will show how to use data architecture to solve business problems. The goal is for you to be able to envision a number of uses for data architectures that will maximize your organization's competitive advantage.
Takeaways:
• How to utilize data architecture to address a broad variety of organizational challenges and how to utilize data architectures in support of business strategy
• Understanding foundational data architecture concepts based on the DAMA DMBOK
• Data architecture guiding principles & best practices
Slides: How Automating Data Lineage Improves BI Performance (DATAVERSITY)
BI landscapes are becoming increasingly complex with the surge in adoption of cloud technologies. Your BI group may have one foot in legacy systems and the other in more modern cloud-based systems, and this alone makes managing and understanding your data virtually impossible.
From needing to understand the impact of a change in a source system from the ETL through to reporting, to finding the source of a reporting error that an end-user questioned you on, to quickly responding to auditors’ demands – these recurring daily BI tasks and more turn into weeks-long projects.
Join us for our upcoming webinar where you’ll learn:
• How to enable your BI group to fix problems sooner for quicker access to accurate data
• The advantages of moving from manual to automated data lineage
• Use cases for BI and analytics groups in a variety of industries, including finance and insurance
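Once lineage is captured, the two recurring questions above (what does a source change impact, and where did a reporting error come from) reduce to graph traversal. A minimal Python sketch with hypothetical pipeline names:

```python
# Hypothetical source -> target lineage edges across a small BI pipeline.
lineage = {
    "crm.orders":     ["etl.stg_orders"],
    "etl.stg_orders": ["dwh.fact_sales"],
    "dwh.fact_sales": ["bi.revenue_report"],
}

def downstream(node, edges):
    """Impact analysis: everything affected by a change to `node`."""
    seen, stack = set(), [node]
    while stack:
        for child in edges.get(stack.pop(), []):
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen

def upstream(node, edges):
    """Root-cause analysis: every source feeding `node` (traverse reversed edges)."""
    reversed_edges = {}
    for src, targets in edges.items():
        for t in targets:
            reversed_edges.setdefault(t, []).append(src)
    return downstream(node, reversed_edges)

print(sorted(downstream("crm.orders", lineage)))  # everything a source change touches
```

Automation earns its keep by keeping `lineage` current from parsed ETL and report definitions instead of hand-maintained documentation.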
Data Systems Integration & Business Value Pt. 2: Cloud (DATAVERSITY)
Certain systems are more data focused than others. Usually their primary focus is on accomplishing integration of disparate data. In these cases, failure is most often attributable to the adoption of a single pillar (silver bullet). The three webinars in the Data Systems Integration and Business Value series are designed to illustrate that good systems development more often depends on at least three DM disciplines (pie wedges) in order to provide a solid foundation.
DataEd Slides: Data Management vs. Data Strategy (DATAVERSITY)
Organizations across most industries make some attempt to utilize Data Management and data strategies. While most organizations have both concepts implemented, they must understand their required interoperability to fully achieve their goals.
Learning Objectives
• Gaining a good understanding of both important topics
• Understanding that data operates only within an intricate web of specific dependencies, and what this means
• Understanding the state of the practice
• Coordination is key: interdependencies and sequencing are necessary but not sufficient
• Practice makes perfect
RWDG Slides: Data Architecture Is Data Governance (DATAVERSITY)
Data Architecture and Data Governance are the same thing! Aren’t they?
Most people would say that this line of thinking is absurd — or even worse. There is NO WAY that they are the same thing. Or are they?
This RWDG webinar with Bob Seiner and his special guest Anthony Algmin looks at the disciplines of Data Governance and Data Architecture and explores how much they are the same … and how they are different. The speakers will let you draw your own conclusion, but they will get you thinking about whether Data Architecture and Data Governance are two sides of the same coin.
In this webinar, Bob and Anthony will discuss:
• What is meant by the saying two sides of the same coin … and how it relates
• The similarities between Data Architecture and Data Governance
• The differences between the two
• How to use Data Architecture to sell Data Governance … and the other way around
• Deciding if the two disciplines are the same … or different
Getting (Re)Started with Data Stewardship (DATAVERSITY)
In order to find value in your organization's data assets, heroic data stewards are tasked with saving the day — every single day! Adhering to the organizational Data Governance (DG) framework, they work to ensure that data is captured right the first time, validated through appropriately automated means, and integrated into business processing. Whether it's data profiling or in-depth root cause analysis, data stewards ensure the organization's mission-critical data is reliably coordinated. This program will walk through this framework and highlight important facets of a data steward's role.
RWDG Slides: Building Data Governance Through Data Stewardship (DATAVERSITY)
Data stewards play an important role in Data Governance solutions. That is why it is critical that organizations get data stewardship right when setting up their program. The data is governed by people. Some people will even tell you that the discipline should be called people governance.
Bob Seiner has a lot to say on this subject. In this RWDG webinar, Bob shares the reasons why you must build your Data Governance program through the stewardship of the data. There is no governance without formal accountability for data. People become stewards when their relationship to data is formalized. It is the only way.
This webinar will focus on:
• The definition of data stewardship that MUST be adopted
• The critical role stewardship plays in governing data
• What it means to formalize accountability
• Why everybody in the organization is a data steward
• How to build Data Governance through stewardship
Data Systems Integration & Business Value Pt. 3: Warehousing (Data Blueprint)
Certain systems are more data focused than others. Usually their primary focus is on accomplishing integration of disparate data. In these cases, failure is most often attributable to the adoption of a single pillar (silver bullet). The three webinars in the Data Systems Integration and Business Value series are designed to illustrate that good systems development more often depends on at least three DM disciplines (pie wedges) in order to provide a solid foundation.
Integrating data across systems has been a perpetual challenge. Unfortunately, the current technology-focused solutions have not helped IT to improve its dismal project success statistics. Data warehouses, BI implementations, and general analytical efforts achieve the same levels of success as other IT projects – approximately one-third are considered successes when measured against price, schedule, or functionality objectives. The first step is determining the appropriate analysis approach to the data system integration challenge. The second step is understanding the strengths and weaknesses of various approaches; it turns out that proper analysis at this stage makes actual technology selection far more accurate. Only when these are accomplished can proper matching between problem and capabilities be achieved as the third step and true business value be delivered.
Was Big Data worth it? We were promised a data revolution when Big Data and Hadoop exploded onto the scene – but those technologies brought with them ungoverned, underexploited, complex environments that didn’t solve the analytical problems they were supposed to. All is not lost, however. This webcast explores three important things we’ve learned from Big Data that can be applied to every kind of data environment: modern approaches to data that exploit the flexibility and power of Big Data without losing the governance and management our businesses need.
For over four decades, IT strategy has been about the alignment of technology with the needs of the "customer," be it an organization, business, end user, or device. The most important part of system acquisition is deciding what to build or buy, as it is better to deliver no solution at all than to deliver the wrong solution. But there are two distinct dimensions to getting requirements right and ensuring that they, and the IT solution that results, not only align with the business as it is, but are built in such a way that they can sustain that alignment in a cost-effective and time-efficient manner: (1) narrow requirements, which focus on the short-term needs of specific parts, functions, or processes of the business; and (2) broad requirements, which focus on a comprehensive, enterprise-wide approach with holistic and longer-range objectives like simplicity, suppleness, and total cost of ownership. We typically call these "Systems Analysis and Design" and "Enterprise Architecture," respectively. Ideally, organizations should be able to do both well and effectively balance the inevitable tradeoffs between them. Sadly, in the vast majority of organizations, that is not yet the case.
Professor Kappelman will present the results of a ground-breaking study from the Society for Information Management (SIM) Enterprise Architecture Working Group that developed and validated measures for these two distinct types of requirements capabilities. Findings include:
• Empirical validation that there is, in fact, a difference between requirement capabilities in a narrow or individual system context (i.e., Systems Analysis and Design within the bounds of a specific development project), and requirements capabilities in a broad or enterprise context (i.e., Enterprise Architecture regarding how those individual systems fit together in an enterprise-wide strategic design).
• Strong evidence that requirements capabilities overall are immature, with narrow activities more mature than the corresponding broad enterprise capabilities.
• Solid evidence, based on fifteen years of studies, that software development capabilities are generally maturing, but are still fairly immature.
This research provides requirements engineers, software designers, software developers, and other IT practitioners with tools to assess their own requirements engineering and software development capabilities and compare them with those of their peers. Suggestions for improvement are also made.
Data-Ed Webinar: Data Modeling Fundamentals (DATAVERSITY)
Every organization produces and consumes data. Because data is so important to day-to-day operations, data trends are hitting the mainstream, and businesses are adopting buzzwords such as Big Data, NoSQL, and data scientist to seek solutions for their fundamental issues. Few realize that the success of any solution, regardless of platform or technology, relies on the data model supporting it. Data modeling is not an optional task for an organization’s data effort; it is a vital activity that supports the solutions driving your business.
This webinar will address fundamental data modeling methodologies, as well as trends around the practice of data modeling itself. We will discuss abstract models and entity frameworks, as well as the general shift from data modeling being segmented to becoming more integrated with business practices.
Learning Objectives:
How are anchor modeling, data vault, etc. different and when should I apply them?
Integrating data models with business models, and the value this creates
Application development (Data first, code first, object first)
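The first objective above can be made concrete in a few lines. The following is a hypothetical illustration (not from the webinar) of the Data Vault idea: stable business keys live in a hub, descriptive attributes are versioned in satellites, and a change becomes a new insert rather than an update. All names and values here are invented for the example.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Hub:                      # stable business key only
    hub_key: str                # surrogate key
    business_key: str           # e.g., customer number
    load_ts: str

@dataclass(frozen=True)
class Satellite:                # descriptive, versioned attributes
    hub_key: str
    attributes: dict
    load_ts: str

# A flat (3NF-style) record mixes key and attributes in one row:
flat_customer = {"customer_no": "C-1001", "name": "Acme", "segment": "B2B"}

# Data Vault splits it: the hub holds the key, satellites hold history.
hub = Hub(hub_key="h1", business_key="C-1001", load_ts="2016-01-01")
sat_v1 = Satellite(hub_key="h1",
                   attributes={"name": "Acme", "segment": "B2B"},
                   load_ts="2016-01-01")
sat_v2 = Satellite(hub_key="h1",
                   attributes={"name": "Acme", "segment": "Enterprise"},
                   load_ts="2016-06-01")  # a change is a new row, not an update

def current_view(hub, sats):
    """Join the hub to its latest satellite to reconstruct the flat view."""
    latest = max((s for s in sats if s.hub_key == hub.hub_key),
                 key=lambda s: s.load_ts)
    return {"customer_no": hub.business_key, **latest.attributes}

print(current_view(hub, [sat_v1, sat_v2]))
# Full history remains queryable — the design choice that distinguishes
# Data Vault from a plain normalized model.
```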
Data-Ed Online: Approaching Data Quality (DATAVERSITY)
Good data is like good water: best served fresh, and ideally well-filtered. Data Management strategies can produce tremendous procedural improvements and increased profit margins across the board, but only if the data being managed is of high quality. Determining how Data Quality should be engineered provides a useful framework for utilizing Data Quality management effectively in support of business strategy. This, in turn, allows for speedy identification of business problems, the delineation between structural and practice-oriented defects in Data Management, and proactive prevention of future issues. Organizations must realize what it means to utilize Data Quality engineering in support of business strategy. This webinar will illustrate how organizations with chronic business challenges often can trace the root of the problem to poor Data Quality. Showing how Data Quality should be engineered provides a useful framework in which to develop an effective approach. This, in turn, allows organizations to more quickly identify business problems as well as data problems caused by structural issues versus practice-oriented defects and prevent these from re-occurring.
Learning Objectives:
Help you understand foundational Data Quality concepts based on the DAMA Guide to Data Management Book of Knowledge (DAMA DMBoK), as well as guiding principles, best practices, and steps for improving Data Quality at your organization
Demonstrate how chronic business challenges for organizations are often rooted in poor Data Quality
Share case studies illustrating the hallmarks and benefits of Data Quality success
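The structural-versus-practice distinction drawn above can be sketched as code. This is a minimal, hypothetical rule engine (field names, rules, and thresholds are invented): a missing field is a structural defect in how data is captured, while a present-but-invalid value is a practice-oriented defect in how it is entered.

```python
import re

# Assumed expectations for the example, not from the webinar:
REQUIRED_FIELDS = {"customer_id", "email"}             # structural expectation
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")   # practice expectation

def assess(record):
    """Return (defect_type, detail) pairs for one record."""
    issues = []
    for field in REQUIRED_FIELDS - record.keys():
        issues.append(("structural", f"missing field: {field}"))
    email = record.get("email")
    if email is not None and not EMAIL_RE.match(email):
        issues.append(("practice", f"malformed email: {email}"))
    return issues

records = [
    {"customer_id": "1", "email": "a@example.com"},   # clean
    {"customer_id": "2", "email": "not-an-email"},    # practice defect
    {"customer_id": "3"},                             # structural defect
]
report = {r["customer_id"]: assess(r) for r in records}
print(report)
```

Separating the two categories matters because the remedies differ: structural defects call for schema or capture changes, practice defects for training and validation at entry.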
DAS Slides: Building a Data Strategy – Practical Steps for Aligning with Business Goals (DATAVERSITY)
Developing a Data Strategy for your organization can seem like a daunting task. The opportunity in getting it right can be significant, however, as data drives many of the key initiatives in today’s marketplace from digital transformation, to marketing, to customer centricity, population health, and more. This webinar will help de-mystify data strategy and data architecture and will provide concrete, practical ways to get started.
Data-Ed: Show Me the Money: The Business Value of Data and ROI (Data Blueprint)
This webinar originally aired on Tuesday, December 11, 2012. It is part of Data Blueprint's ongoing webinar series on data management with Dr. Aiken.
Sign up for future sessions at http://www.datablueprint.com/webinar-schedule.
Abstract:
Failure to successfully monetize data management investments sets up an unfortunate loop of fixing symptoms without addressing the underlying problems. As organizations begin to understand poor data management practices as the root causes of many of their problems, they become more willing to make the required investments in our profession. This presentation uses specific examples to illustrate the costs of poor data management. Join us and learn how you can apply similar tactics at your organization to justify funding and gain management approval.
Slides: How AI Makes Analytics More Human (DATAVERSITY)
People think AI makes analytics less human, replacing human decision making. But the truth is, AI actually makes analytics more human. Augmented analytics are helping organizations finally break through the low levels of adoption and limitations typical of 2nd generation visualization tools.
Most business problems cannot be solved purely by algorithms or machine learning — they require human interaction and perspective. Uniting precedent-based machine learning systems with natural human intuition and curiosity is the foundation of 3rd generation BI and democratizing data across an enterprise.
It is a natural flow to enhance your data eco-system by deploying a platform with augmented intelligence to work alongside users in the pursuit of surfacing new insights, automating tasks, and supporting natural language interaction. All work as accelerators for achieving active intelligence and Data Literacy.
Real-World Data Governance: Metadata to Empower Data Stewards – Introducing t... (DATAVERSITY)
Metadata is the most valuable tool of the Data Steward. Where the stewards get their metadata and how they participate in the process of delivering core metadata is an issue organizations have been struggling with for years. The Operational Metadata Store or OMS may be the answer.
The traditional Operational Data Store or ODS is a database designed to integrate data from numerous sources that supports business operations and then feeds that data back into the operational systems. This Real-World Data Governance webinar with Bob Seiner and a panel of industry pundits will hold a lively discussion on the practicality of creating the ODS using metadata as the data, utilizing the metadata from a variety of existing sources to operationalize your data stewards.
The session will focus on:
Identifying the most significant metadata for your organization
Identifying existing sources of metadata – known and hidden
Identifying when that metadata will be most useful to your data stewards
Defining a lifecycle that encourages data steward participation
Delivering a model that incorporates all of the above
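The steps above can be sketched as a toy Operational Metadata Store: metadata treated as data, integrated from several sources, and queryable by stewards. This is a hypothetical illustration only; the asset names, fields, and functions are invented, not a product API.

```python
# Metadata records integrated from multiple existing sources (invented data):
oms = [
    {"asset": "crm.customers", "source": "data dictionary", "steward": "j.doe",
     "definition": "Active and prospective customer master records"},
    {"asset": "crm.customers", "source": "ETL tool", "steward": "j.doe",
     "lineage": "loaded nightly from sales_ops.leads"},
    {"asset": "fin.invoices", "source": "data dictionary", "steward": "a.smith",
     "definition": "Issued customer invoices"},
]

def assets_for_steward(store, steward):
    """What a given steward is formally accountable for, across all sources."""
    return sorted({m["asset"] for m in store if m["steward"] == steward})

def describe(store, asset):
    """Merge everything known about one asset from every contributing source."""
    merged = {}
    for m in store:
        if m["asset"] == asset:
            merged.update({k: v for k, v in m.items() if k != "asset"})
    return merged

print(assets_for_steward(oms, "j.doe"))
print(describe(oms, "crm.customers"))
```

The operational point is the second function: a steward gets one merged answer even though the definition came from a dictionary and the lineage from an ETL tool.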
In this session we will discuss Data Governance, mainly around the Power BI platform (but also around on-premises concerns).
How do you avoid dataset hell? What are the best practices for sharing queries? Who is the famous Data Steward, and what is their role in a department or across the whole company? How do you choose the right person?
Keywords: Power Query, Data Management Gateway, Power BI Admin Center, Data Stewardship, SharePoint 2013, eDiscovery
Level 200
Data-Ed Online Webinar: Data Architecture Requirements (DATAVERSITY)
Data architecture is foundational to an information-based operational environment. It is data architecture that organizes data assets so they can be used in your business strategy to create real business value. Even though this is important, data architectures are still being used ineffectively. The various uses of data architecture are referenced to inform, clarify, understand, and resolve aspects of a variety of business problems commonly encountered in organizations. As opposed to showing how to architect data, your presenter Dr. Peter Aiken will show how to use data architecture to solve business problems. The goal is for you to be able to envision a number of uses for data architectures that will maximize your organization’s competitive advantage.
Takeaways:
• How to utilize data architecture to address a broad variety of organizational challenges and how to utilize data architectures in support of business strategy
• Understanding foundational data architecture concepts based on the DAMA DMBOK
• Data architecture guiding principles & best practices
Slides: How Automating Data Lineage Improves BI Performance (DATAVERSITY)
BI landscapes are becoming increasingly complex with the surge in adoption of cloud technologies. Your BI group may have one foot in legacy systems and the other in more modern cloud-based systems, and this alone makes managing and understanding your data virtually impossible.
From needing to understand the impact of a change in a source system from the ETL through to reporting, to finding the source of a reporting error that an end-user questioned you on, to quickly responding to auditors’ demands – these recurring daily BI tasks and more turn into weeks-long projects.
Join us for our upcoming webinar where you’ll learn:
• How to enable your BI group to fix problems sooner for quicker access to accurate data
• The advantages of moving from manual to automated data lineage
• Use cases for BI and analytics groups in a variety of industries, including finance and insurance
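The impact-analysis task described above is, at its core, a graph walk. Here is a minimal hypothetical sketch (asset names and the tiny graph are invented) of what automated lineage lets a BI group compute instantly instead of tracing by hand:

```python
from collections import deque

# Edges point downstream: source asset -> assets built from it (invented names).
lineage = {
    "src.orders":    ["stg.orders"],
    "stg.orders":    ["dw.fact_sales"],
    "dw.fact_sales": ["rpt.revenue_dashboard", "rpt.audit_extract"],
}

def downstream_impact(graph, changed_asset):
    """Everything that could break if `changed_asset` changes (breadth-first)."""
    impacted, queue = set(), deque([changed_asset])
    while queue:
        for dep in graph.get(queue.popleft(), []):
            if dep not in impacted:
                impacted.add(dep)
                queue.append(dep)
    return sorted(impacted)

print(downstream_impact(lineage, "src.orders"))
```

Walking the same edges in reverse answers the complementary question an auditor or end user asks: where did this report number come from?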
Data Systems Integration & Business Value Pt. 2: Cloud (DATAVERSITY)
Certain systems are more data focused than others. Usually their primary focus is on accomplishing integration of disparate data. In these cases, failure is most often attributable to the adoption of a single pillar (silver bullet). The three webinars in the Data Systems Integration and Business Value series are designed to illustrate that good systems development more often depends on at least three DM disciplines (pie wedges) in order to provide a solid foundation.
DataEd Slides: Data Management vs. Data Strategy (DATAVERSITY)
Organizations across most industries make some attempt to utilize Data Management and data strategies. While most organizations have both concepts implemented, they must understand their required interoperability to fully achieve their goals.
Learning Objectives
• Gaining a good understanding of both important topics
• Understanding that Data Management and data strategy operate with intricately interdependent, distinct intents, and what this means
• Understanding the state of the practice
• Coordination is key: each discipline is necessary but insufficient on its own, and sequencing matters
• Practice makes perfect
RWDG Slides: Data Architecture Is Data Governance (DATAVERSITY)
Data Architecture and Data Governance are the same thing! Aren’t they?
Most people would say that this line of thinking is absurd — or even worse. There is NO WAY that they are the same thing. Or are they?
This RWDG webinar with Bob Seiner and his special guest Anthony Algmin looks at the disciplines of Data Governance and Data Architecture and explores how much they are the same … and how they are different. The speakers will let you draw your own conclusion, but they will get you thinking about whether Data Architecture and Data Governance are two sides of the same coin.
In this webinar, Bob and Anthony will discuss:
• What is meant by the saying two sides of the same coin … and how it relates
• The similarities between Data Architecture and Data Governance
• The differences between the two
• How to use Data Architecture to sell Data Governance … and the other way around
• Deciding if the two disciplines are the same … or different
Getting (Re)Started with Data Stewardship (DATAVERSITY)
In order to find value in your organization’s data assets, heroic data stewards are tasked with saving the day — every single day! Adhering to the organizational Data Governance (DG) framework, they work to ensure that data is captured right the first time, validated through appropriately automated means, and integrated into business processing. Whether it’s data profiling or in-depth root cause analysis, data stewards ensure the organization’s mission-critical data is reliably coordinated. This program will approach this framework and punctuate important facets of a data steward’s role.
RWDG Slides: Building Data Governance Through Data Stewardship (DATAVERSITY)
Data stewards play an important role in Data Governance solutions. That is why it is critical that organizations get data stewardship right when setting up their programs. Data is governed by people; some will even tell you that the discipline should be called people governance.
Bob Seiner has a lot to say on this subject. In this RWDG webinar, Bob shares the reasons why you must build your Data Governance program through the stewardship of the data. There is no governance without formal accountability for data. People become stewards when their relationship to data is formalized. It is the only way.
This webinar will focus on:
• The definition of data stewardship that MUST be adopted
• The critical role stewardship plays in governing data
• What it means to formalize accountability
• Why everybody in the organization is a data steward
• How to build Data Governance through stewardship
Data Systems Integration & Business Value Pt. 3: Warehousing (Data Blueprint)
Certain systems are more data focused than others. Usually their primary focus is on accomplishing integration of disparate data. In these cases, failure is most often attributable to the adoption of a single pillar (silver bullet). The three webinars in the Data Systems Integration and Business Value series are designed to illustrate that good systems development more often depends on at least three DM disciplines (pie wedges) in order to provide a solid foundation.
For more training on AWS, visit: https://www.qa.com/amazon
AWS Loft | London - Deep Dive: Amazon DynamoDB by Dean Bryen, Solutions Architect, 18 April 2016
This session will begin with an introduction to non-relational (NoSQL) databases and compare them with relational (SQL) databases. We will also explain the fundamentals of Amazon DynamoDB, a fully managed NoSQL database service. Learn the fundamentals of DynamoDB and see the new DynamoDB console first-hand as we discuss common use cases and benefits of this high-performance key-value and JSON document store.
AWS re:Invent 2016: How DataXu scaled its Attribution System to handle billio... (Amazon Web Services)
“Attribution” is the marketing term of art for allocating full or partial credit to individual advertisements that eventually lead to a purchase, sign up, download, or other desired consumer interaction. We'll share how we use DynamoDB at the core of our attribution system to store terabytes of advertising history data. The system is cost effective and dynamically scales from 0 to 300K requests per second on demand with predictable performance and low operational overhead.
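As a rough sketch of the marketing concept (not DataXu's actual system, whose internals are not described here), attribution boils down to splitting one conversion's credit across the impressions that preceded it. Two common models, with invented ad names:

```python
def last_touch(impressions):
    """All credit goes to the final impression before the conversion."""
    return {ad: (1.0 if i == len(impressions) - 1 else 0.0)
            for i, ad in enumerate(impressions)}

def linear(impressions):
    """Equal partial credit to every impression on the path."""
    share = 1.0 / len(impressions)
    return {ad: share for ad in impressions}

# One user's impression history leading up to a purchase (invented):
path = ["banner_A", "video_B", "search_C"]
print(last_touch(path))
print(linear(path))
```

The storage challenge the talk addresses follows directly: every model needs the full per-user impression history, which at ad-tech volume is terabytes of hot key-value data.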
AWS re:Invent 2016: Deep Dive on Amazon DynamoDB (DAT304) (Amazon Web Services)
Explore Amazon DynamoDB capabilities and benefits in detail and learn how to get the most out of your DynamoDB database. We go over best practices for schema design with DynamoDB across multiple use cases, including gaming, AdTech, IoT, and others. We explore designing efficient indexes, scanning, and querying, and go into detail on a number of recently released features, including JSON document support, DynamoDB Streams, and more. We also provide lessons learned from operating DynamoDB at scale, including provisioning DynamoDB for IoT.
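To illustrate the composite-key schema design the session covers, the sketch below mimics a DynamoDB partition-key/sort-key query in plain Python, with no AWS calls; the gaming-style item layout and key names are assumptions made up for the example.

```python
# Items sharing a partition key ("pk") are stored together; the sort key
# ("sk") orders them and supports range/prefix conditions (invented data):
items = [
    {"pk": "GAME#42", "sk": "SCORE#2016-01-03", "points": 120},
    {"pk": "GAME#42", "sk": "SCORE#2016-01-05", "points": 340},
    {"pk": "GAME#42", "sk": "SCORE#2016-02-01", "points": 90},
    {"pk": "GAME#99", "sk": "SCORE#2016-01-04", "points": 55},
]

def query(table, pk, sk_prefix=""):
    """Mimics Query: exact match on the partition key, prefix on the sort key."""
    return [i for i in table
            if i["pk"] == pk and i["sk"].startswith(sk_prefix)]

# All January scores for game 42 in one targeted read — no full-table scan:
jan_scores = query(items, "GAME#42", "SCORE#2016-01")
print(jan_scores)
```

The design lesson is that access patterns drive the key choice: because date is embedded in the sort key, the common query needs only one partition, which is what makes DynamoDB reads predictable at scale.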
SQL to NoSQL Best Practices with Amazon DynamoDB - AWS July 2016 Webinar Se... (Amazon Web Services)
Applications have traditionally stored data in a relational database management system (RDBMS) and have used a Structured Query Language (SQL) to retrieve and update that data. The growth of “internet scale” apps, such as e-commerce, social media, and mobile apps, and the rise of big data have increased data throughput demands beyond the range of traditional relational databases. Non-relational (NoSQL) databases enable your application to scale more cost effectively, even for extraordinarily high demand. Amazon DynamoDB is a fully managed NoSQL database service that lets you focus on your app so you don’t have to worry about hardware acquisition or database management, and lets you scale down your costs for off-peak periods. In this webinar, we’ll describe common database tasks, then compare and contrast SQL with equivalent DynamoDB operations.
Learning Objectives:
• Why consider the switch from SQL to NoSQL?
• Benefits of Amazon’s NoSQL database service
• Common SQL database operations and their DynamoDB equivalents
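As a rough cheat sheet for the last objective, the sketch below pairs common SQL statements with the boto3 DynamoDB calls that play an analogous role. It is illustrative only: the call snippets are strings, the `users` table is invented, and elided arguments (`...`) stand for details that vary by schema.

```python
# SQL operation -> roughly equivalent boto3 DynamoDB call (illustrative):
sql_to_dynamodb = {
    "CREATE TABLE users (...)":                 "client.create_table(TableName='users', ...)",
    "INSERT INTO users VALUES (...)":           "table.put_item(Item={...})",
    "SELECT * FROM users WHERE id = 7":         "table.get_item(Key={'id': 7})",
    "UPDATE users SET name = 'x' WHERE id = 7": "table.update_item(Key={'id': 7}, ...)",
    "DELETE FROM users WHERE id = 7":           "table.delete_item(Key={'id': 7})",
    "SELECT ... WHERE id = 7 AND ts > :t":      "table.query(KeyConditionExpression=...)",
}

for sql, ddb in sql_to_dynamodb.items():
    print(f"{sql:45} -> {ddb}")
```

The pattern to notice: every DynamoDB operation targets a key, which is exactly the constraint that lets the service scale horizontally where an arbitrary SQL `WHERE` clause cannot.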
Smart Data Webinar: Advances in Natural Language Processing II - NL Generation (DATAVERSITY)
Need more than visualization?
Generate custom narrative docs from data today.
Technology for natural language generation (NLG) has advanced from the production of restricted-domain question-answering and simulation systems to the delivery of general purpose data- or model-driven narratives that are virtually indistinguishable from human-generated correspondence.
From sports to stock reports, you’ve probably read a machine-generated report in the past year without realizing that the “author” was a machine.
Participants in this webinar will learn how modern approaches have progressed beyond pattern matching and table-driven text selection to algorithms that consider context and tone. We will also present examples of commercially available NLG APIs to help participants experiment with NLG in their own applications right away.
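To make the progression beyond fixed templates concrete, here is a minimal, hypothetical sketch of data-driven text selection, in the spirit of machine-written stock reports: the wording is chosen from the data (direction and magnitude of a price move) rather than filled into a single template. The thresholds and phrasing are invented; real NLG systems additionally model context and tone, as discussed above.

```python
def describe_stock(ticker, prev_close, close):
    """Generate one narrative sentence from two data points."""
    change = (close - prev_close) / prev_close * 100
    if abs(change) < 0.5:
        verb = "was little changed"          # flat day
    elif change > 0:
        verb = "surged" if change > 3 else "rose"
    else:
        verb = "plunged" if change < -3 else "fell"
    return f"{ticker} {verb} {abs(change):.1f}% to close at {close:.2f}."

print(describe_stock("ACME", 100.0, 104.2))
print(describe_stock("GLOBEX", 50.0, 49.6))
```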
Implementing Big Data, NoSQL, & Hadoop - Bigger Is (Usually) Better (DATAVERSITY)
From its widespread formal business practice to the scope of casual popular awareness, “Big Data” has a tendency to live up to its name. Featured in countless headlines, journal articles, and industry reviews, Big Data metrics and methods such as NoSQL and Hadoop have taken up plenty of the spotlight as of late. However, most of what has been written about these topics is focused on the hardware, services, and scale-out involved with them, a misguided focus that ignores the critical questions driving any shift in corporate strategy: what can Big Data do for you? Which approach to it best fits your organization? And perhaps most importantly, what is required on your end in order to spur a successful implementation process?
In the interest of answering these and other questions, this webinar will:
Provide guidance on how to think about and establish realistic Big Data management plans and expectations for generating business value, as well as on the means by which big data can complement existing data management practices
Introduce a framework for evaluating the various choices when it comes to implementing and succeeding with Big Data/NoSQL
Elaborate upon the prototyping nature of practicing Big Data techniques
Show how to demonstrate a sample use case
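An evaluation framework of the kind introduced here often reduces to weighted-criteria scoring. The sketch below is a hypothetical illustration of that idea, not the webinar's actual framework; the criteria, weights, platforms, and scores are invented placeholders.

```python
# What matters to *your* organization (weights sum to 1) — invented:
criteria_weights = {
    "fits_existing_skills":   0.3,
    "operational_maturity":   0.3,
    "analytical_flexibility": 0.2,
    "total_cost":             0.2,
}

# Candidate platforms scored 1 (poor) .. 5 (strong) per criterion — invented:
platform_scores = {
    "Hadoop cluster":    {"fits_existing_skills": 2, "operational_maturity": 3,
                          "analytical_flexibility": 5, "total_cost": 2},
    "NoSQL document DB": {"fits_existing_skills": 4, "operational_maturity": 4,
                          "analytical_flexibility": 3, "total_cost": 4},
}

def weighted_fit(scores, weights):
    """Overall fit: weighted sum of per-criterion scores."""
    return sum(scores[c] * w for c, w in weights.items())

ranked = sorted(platform_scores,
                key=lambda p: weighted_fit(platform_scores[p], criteria_weights),
                reverse=True)
print({p: round(weighted_fit(platform_scores[p], criteria_weights), 2)
       for p in ranked})
```

The value of the exercise is less the final number than the forced conversation about which criteria carry weight for your organization.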
DataEd Slides: Data Management Best Practices (DATAVERSITY)
It is clear that Data Management best practices exist, and so does a useful process for improving existing Data Management practices. The question arises: since we understand the goal, how does one design a process for achieving it? This approach combines the DM BoK and the CMMI/DMM, providing organizations with the opportunity to benefit from the best of both. The approach permits organizations to understand current Data Management practices, strengths to leverage, and remediation opportunities. In a nutshell, it describes what must be done at the programmatic level to achieve better data use.
Big Data, NoSQL, NewSQL & The Future of Data Management (Tony Bain)
It is an exciting and interesting time to be involved in data. More influential change has occurred in database management in the last 18 months than in the previous 18 years. New technologies such as NoSQL and Hadoop, along with radical redesigns of existing technologies like NewSQL, will dramatically change how we manage data moving forward.
These technologies bring with them possibilities both in the scale of data retained and in how this data can be utilized as an information asset. The ability to leverage Big Data to drive deep insights will become a key competitive advantage for many organisations in the future.
Join Tony Bain as he takes us through the high-level drivers behind these technology changes, their relevance to the enterprise, and an overview of the possibilities a Big Data strategy can start to unlock.
Bigger and Better: Employing a Holistic Strategy for Big Data toward a Strong Value-Adding Proposition (marcus evans IT Network)
by Patrick Hadley, Australian Bureau of Statistics at the Australian CIO Summit 2014
Sentara Linked Data Workshop - Sept 10, 2012 (3 Round Stones)
One-day workshop for Sentara Healthcare on using a Linked Data approach for enterprise architecture. Topics include: Open Government Data initiatives; a demo of the Weather Health Web application; leveraging open data from NIH, NLM, NOAA, EPA, and HHS; and Callimachus Enterprise, a Linked Data Management System for the enterprise.
Big Data brings big promise and also big challenges, the most important being the ability to deliver value to business stakeholders who are not data scientists!
Data-Ed Webinar: Data Quality Success Stories (DATAVERSITY)
Organizations must realize what it means to utilize data quality management in support of business strategy. This webinar will illustrate how organizations with chronic business challenges often can trace the root of the problem to poor data quality. Showing how data quality should be engineered provides a useful framework in which to develop an effective approach. This in turn allows organizations to more quickly identify business problems as well as data problems caused by structural issues versus practice-oriented defects and prevent these from re-occurring.
Takeaways:
• Understanding foundational data quality concepts based on the DAMA DMBOK
• Utilizing data quality engineering in support of business strategy
• Case studies illustrating data quality success
• Data quality guiding principles & best practices
• Steps for improving data quality at your organization
Data-Ed Slides: Data-Centric Strategy & Roadmap - Supercharging Your Business (DATAVERSITY)
In many organizations and functional areas, data has pulled even with money in terms of what makes the proverbial world go ‘round. As businesses struggle to cope with the 21st century’s newfound data flood, it is more important than ever before to prioritize data as an asset that directly supports business imperatives. However, while organizations across most industries make some attempt to address data opportunities (e.g. Big Data) and data challenges (e.g. data quality), the results of these efforts frequently fall far below expectations. At the root of many of these failures is poor organizational data management—which fortunately is a remediable problem.
This webinar will cover three lessons, each illustrated with examples, that will help you establish realistic goals and benchmarks for data management processes and communicate their value to both internal and external decision makers:
- How organizational thinking must change to include value-added data management practices
- The importance of walking before you run with data-focused initiatives
- Prioritizing specification and data governance over “silver bullet” analytical tools
Big Data with Hadoop and HDInsight. This is an intro to the technology: if you are new to Big Data or have only just heard of it, this presentation will help you learn a little more about it.
Architecture, Products, and Total Cost of Ownership of the Leading Machine Le... (DATAVERSITY)
Organizations today need a broad set of enterprise data cloud services with key data functionality to modernize applications and utilize machine learning. They need a comprehensive platform designed to address multi-faceted needs by offering multi-function data management and analytics to solve the enterprise’s most pressing data and analytic challenges in a streamlined fashion.
In this research-based session, I’ll discuss what the components are in multiple modern enterprise analytics stacks (i.e., dedicated compute, storage, data integration, streaming, etc.) and focus on total cost of ownership.
A complete machine learning infrastructure for a first modern use case at a midsize-to-large enterprise will cost anywhere from $3 million to $22 million. Take this data point with you as you plan what will be, for most companies, the highest-spend and highest-return item of the next several years.
Data at the Speed of Business with Data Mastering and Governance (DATAVERSITY)
Do you ever wonder how data-driven organizations fuel analytics, improve customer experience, and accelerate business productivity? They are successful by governing and mastering data effectively so they can get trusted data to those who need it faster. Efficient data discovery, mastering and democratization is critical for swiftly linking accurate data with business consumers. When business teams can quickly and easily locate, interpret, trust, and apply data assets to support sound business judgment, it takes less time to see value.
Join data mastering and data governance experts from Informatica—plus a real-world organization empowering trusted data for analytics—for a lively panel discussion. You’ll hear more about how a single cloud-native approach can help global businesses in any economy create more value—faster, more reliably, and with more confidence—by making data management and governance easier to implement.
What is data literacy? Which organizations, and which workers in those organizations, need to be data-literate? There are seemingly hundreds of definitions of data literacy, along with almost as many opinions about how to achieve it.
In a broader perspective, companies must consider whether data literacy is an isolated goal or one component of a broader learning strategy to address skill deficits. How does data literacy compare to other types of skills or “literacy” such as business acumen?
This session will position data literacy in the context of other worker skills as a framework for understanding how and where it fits and how to advocate for its importance.
Building a Data Strategy – Practical Steps for Aligning with Business Goals (DATAVERSITY)
Developing a Data Strategy for your organization can seem like a daunting task – but it’s worth the effort. Getting your Data Strategy right can provide significant value, as data drives many of the key initiatives in today’s marketplace – from digital transformation, to marketing, to customer centricity, to population health, and more. This webinar will help demystify Data Strategy and its relationship to Data Architecture and will provide concrete, practical ways to get started.
Uncover how your business can save money and find new revenue streams.
Driving profitability is a top priority for companies globally, especially in uncertain economic times. It's imperative that companies reimagine growth strategies and improve process efficiencies to help cut costs and drive revenue – but how?
By leveraging data-driven strategies layered with artificial intelligence, companies can achieve untapped potential and help their businesses save money and drive profitability.
In this webinar, you'll learn:
- How your company can leverage data and AI to reduce spending and costs
- Ways you can monetize data and AI and uncover new growth strategies
- How different companies have implemented these strategies to achieve cost optimization benefits
Data Catalogs Are the Answer – What is the Question? (DATAVERSITY)
Organizations with governed metadata made available through their data catalog can answer questions their people have about the organization’s data. These organizations get more value from their data, protect their data better, gain improved ROI from data-centric projects and programs, and have more confidence in their most strategic data.
Join Bob Seiner for this lively webinar where he will talk about the value of a data catalog and how to build the use of the catalog into your stewards’ daily routines. Bob will share how the tool must be positioned for success and viewed as a must-have resource that is a steppingstone and catalyst to governed data across the organization.
In this webinar, Bob will focus on:
-Selecting the appropriate metadata to govern
-The business and technical value of a data catalog
-Building the catalog into people’s routines
-Positioning the data catalog for success
-Questions the data catalog can answer
Because every organization produces and propagates data as part of their day-to-day operations, data trends are becoming more and more important in the mainstream business world’s consciousness. For many organizations in various industries, though, comprehension of this development begins and ends with buzzwords: “Big Data,” “NoSQL,” “Data Scientist,” and so on. Few realize that all solutions to their business problems, regardless of platform or relevant technology, rely to a critical extent on the data model supporting them. As such, data modeling is not an optional task for an organization’s data effort, but rather a vital activity that facilitates the solutions driving your business. Since quality engineering/architecture work products do not happen accidentally, the more your organization depends on automation, the more important the data models driving the engineering and architecture activities of your organization. This webinar illustrates data modeling as a key activity upon which so much technology and business investment depends.
Specific learning objectives include:
- Understanding what types of challenges require data modeling to be part of the solution
- How automation requires standardization, derivable via data modeling techniques
- Why only a working partnership between data and the business can produce useful outcomes
Analytics play a critical role in supporting strategic business initiatives. Despite the obvious value to analytic professionals of providing the analytics for these initiatives, many executives question the economic return of analytics as well as data lakes, machine learning, master data management, and the like.
Technology professionals need to calculate and present business value in terms business executives can understand. Unfortunately, most IT professionals lack the knowledge required to develop comprehensive cost-benefit analyses and return on investment (ROI) measurements.
This session provides a framework to help technology professionals research, measure, and present the economic value of a proposed or existing analytics initiative, no matter what form the business benefit takes. The session will provide practical advice on ROI formulas, how to calculate ROI, and how to collect the necessary information.
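As a rough illustration of the kind of arithmetic such a session covers, the standard ROI and net-present-value formulas can be sketched in a few lines of Python. The dollar figures below are invented for the example, not taken from the session:

```python
def roi(total_benefit: float, total_cost: float) -> float:
    """Return on investment: net benefit expressed as a fraction of cost."""
    return (total_benefit - total_cost) / total_cost

def npv(cash_flows: list[float], discount_rate: float) -> float:
    """Net present value of yearly cash flows (year 0 first)."""
    return sum(cf / (1 + discount_rate) ** t for t, cf in enumerate(cash_flows))

# Hypothetical analytics initiative: $500k up-front cost,
# $250k/year in benefits for three years.
cost = 500_000.0
yearly_benefit = 250_000.0
print(f"Simple ROI: {roi(3 * yearly_benefit, cost):.0%}")
print(f"NPV at 8%: ${npv([-cost] + [yearly_benefit] * 3, 0.08):,.0f}")
```

Run as-is, this prints a simple ROI of 50% and an NPV of roughly $144k; discounting matters, which is why a bare ROI figure alone can overstate the business case.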
How a Semantic Layer Makes Data Mesh Work at Scale (DATAVERSITY)
Data Mesh is a trending approach to building a decentralized data architecture by leveraging a domain-oriented, self-service design. However, the pure definition of Data Mesh lacks a center of excellence or central data team and doesn’t address the need for a common approach for sharing data products across teams. The semantic layer is emerging as a key component to supporting a Hub and Spoke style of organizing data teams by introducing data model sharing, collaboration, and distributed ownership controls.
This session will explain how data teams can define common models and definitions with a semantic layer to decentralize analytics product creation using a Hub and Spoke architecture.
Attend this session to learn about:
- The role of a Data Mesh in the modern cloud architecture.
- How a semantic layer can serve as the binding agent to support decentralization.
- How to drive self service with consistency and control.
Enterprise data literacy. A worthy objective? Certainly! A realistic goal? That remains to be seen. As companies consider investing in data literacy education, questions arise about its value and purpose. While the destination – having a data-fluent workforce – is attractive, we wonder how (and if) we can get there.
Kicking off this webinar series, we begin with a panel discussion to explore the landscape of literacy, including expert positions and results from focus groups:
- why it matters,
- what it means,
- what gets in the way,
- who needs it (and how much they need),
- what companies believe it will accomplish.
In this engaging discussion about literacy, we will set the stage for future webinars to answer specific questions and feature successful literacy efforts.
The Data Trifecta – Privacy, Security & Governance Race from Reactivity to Re... (DATAVERSITY)
Change is hard, especially in response to negative stimuli, or what is perceived as negative stimuli. So organizations need to reframe how they think about data privacy, security, and governance, treating them as value centers to 1) ensure enterprise data can flow where it needs to, 2) prevent internal and external threats rather than merely react to them, and 3) comply with data privacy and security regulations.
Working together, these roles can accelerate faster access to approved, relevant and higher quality data – and that means more successful use cases, faster speed to insights, and better business outcomes. However, both new information and tools are required to make the shift from defense to offense, reducing data drama while increasing its value.
Join us for this panel discussion with experts in these fields as they discuss:
- Recent research about where data privacy, security and governance stand
- The most valuable enterprise data use cases
- The common obstacles to data value creation
- New approaches to data privacy, security and governance
- Their advice on how to shift from a reactive to resilient mindset/culture/organization
You’ll be educated, entertained and inspired by this panel and their expertise in using the data trifecta to innovate more often, operate more efficiently, and differentiate more strategically.
Emerging Trends in Data Architecture – What’s the Next Big Thing? (DATAVERSITY)
With technological innovation and change occurring at an ever-increasing rate, it’s hard to keep track of what’s hype and what can provide practical value for your organization. Join this webinar to see the results of a recent DATAVERSITY survey on emerging trends in Data Architecture, along with practical commentary and advice from industry expert Donna Burbank.
Data Governance Trends – A Look Backwards and Forwards (DATAVERSITY)
As DATAVERSITY’s RWDG series hurtles into its 12th year, this webinar takes a quick look behind us, evaluates the present, and predicts the future of Data Governance. Based on webinar numbers, hot Data Governance topics have evolved over the years from policies and best practices, roles and tools, data catalogs and frameworks, to supporting data mesh and fabric, artificial intelligence, virtualization, literacy, and metadata governance.
Join Bob Seiner as he reflects on the past and what has and has not worked, while sharing examples of enterprise successes and struggles. In this webinar, Bob will challenge the audience to stay a step ahead by learning from the past and blazing a new trail into the future of Data Governance.
In this webinar, Bob will focus on:
- Data Governance’s past, present, and future
- How trials and tribulations evolve to success
- Leveraging lessons learned to improve productivity
- The great Data Governance tool explosion
- The future of Data Governance
Data Governance Trends and Best Practices To Implement Today (DATAVERSITY)
Would you share your bank account information on social media? How about shouting your social security number on the New York City subway? We didn’t think so either – that’s why data governance is consistently top of mind.
In this webinar, we’ll discuss the common Cloud data governance best practices – and how to apply them today. Join us to uncover Google Cloud’s investment in data governance and learn practical and doable methods around key management and confidential computing. Hear real customer experiences and leave with insights that you can share with your team. Let’s get solving.
Topics that you will hear addressed in this webinar:
- Understanding the basics of Cloud Incident Response (IR) and anticipated data governance trends
- Best practices for key management and applying data governance to your day-to-day
- The next wave of Confidential Computing and how to get started, including a demo
It is a fascinating, explosive time for enterprise analytics.
It is from the position of analytics leadership that the enterprise mission will be executed and company leadership will emerge. The data professional is absolutely sitting on the performance of the company in this information economy and has an obligation to demonstrate the possibilities and originate the architecture, data, and projects that will deliver analytics. After all, no matter what business you’re in, you’re in the business of analytics.
The coming years will be full of big changes in enterprise analytics and data architecture. William will kick off the fifth year of the Advanced Analytics series with a discussion of the trends winning organizations should build into their plans, expectations, vision, and awareness now.
Too often I hear the question “Can you help me with our data strategy?” Unfortunately, for most, this is the wrong request because it focuses on the least valuable component: the data strategy itself. A more useful request is: “Can you help me apply data strategically?” Yes, at early maturity phases the process of developing strategic thinking about data is more important than the actual product! Trying to write a good (much less perfect) data strategy on the first attempt is generally not productive – particularly given the widespread acceptance of Mike Tyson’s truism: “Everybody has a plan until they get punched in the face.” This program refocuses efforts on learning how to iteratively improve the way data is strategically applied. This will permit data-based strategy components to keep up with agile, evolving organizational strategies. It also contributes to three primary organizational data goals. Learn how to improve the following:
- Your organization’s data
- The way your people use data
- The way your people use data to achieve your organizational strategy
This will help in ways never imagined. Data are your sole non-depletable, non-degradable, durable strategic assets, and they are pervasively shared across every organizational area. Addressing existing challenges programmatically includes overcoming necessary but insufficient prerequisites and developing a disciplined, repeatable means of improving business objectives. This process (based on the theory of constraints) is where the strategic data work really occurs as organizations identify prioritized areas where better assets, literacy, and support (data strategy components) can help an organization better achieve specific strategic objectives. Then the process becomes lather, rinse, and repeat. Several complementary concepts are also covered, including:
- A cohesive argument for why data strategy is necessary for effective data governance
- An overview of prerequisites for effective strategic use of data strategy, as well as common pitfalls
- A repeatable process for identifying and removing data constraints
- The importance of balancing business operation and innovation
Who Should Own Data Governance – IT or Business? (DATAVERSITY)
The question is asked all the time: “What part of the organization should own your Data Governance program?” The typical answers are “the business” and “IT (information technology).” Another answer to that question is “Yes.” The program must be owned and reside somewhere in the organization. You may ask yourself if there is a correct answer to the question.
Join this new RWDG webinar with Bob Seiner where Bob will answer the question that is the title of this webinar. Determining ownership of Data Governance is a vital first step. Figuring out the appropriate part of the organization to manage the program is an important second step. This webinar will help you address these questions and more.
In this session Bob will share:
- What is meant by “the business” when it comes to owning Data Governance
- Why some people say that Data Governance in IT is destined to fail
- Examples of IT-positioned Data Governance success
- Considerations for answering the question in your organization
- The final answer to the question of who should own Data Governance
It is clear that Data Management best practices exist, and so does a useful process for improving existing Data Management practices. The question arises: Since we understand the goal, how does one design a process for Data Management goal achievement? This program describes what must be done at the programmatic level to achieve better data use and a way to implement this as part of your data program. The approach combines DMBoK content and CMMI/DMM processes – permitting organizations to benefit from the best of both. It also permits organizations to understand:
- Their current Data Management practices
- Strengths that should be leveraged
- Remediation opportunities
MLOps – Applying DevOps to Competitive Advantage (DATAVERSITY)
MLOps is a practice for collaboration between Data Science and operations to manage the production machine learning (ML) lifecycle. As an amalgamation of “machine learning” and “operations,” MLOps applies DevOps principles to ML delivery, enabling the delivery of ML-based innovation at scale to result in:
Faster time to market of ML-based solutions
More rapid rate of experimentation, driving innovation
Assurance of quality, trustworthiness, and ethical AI
MLOps is essential for scaling ML. Without it, enterprises risk struggling with costly overhead and stalled progress. Several vendors have emerged with offerings to support MLOps: the major offerings are Microsoft Azure ML and Google Vertex AI. We looked at these offerings from the perspective of enterprise features and time-to-value.
Observability Concepts EVERY Developer Should Know – DeveloperWeek Europe (Paige Cruz)
Monitoring and observability aren’t traditionally found in software curriculums, and many of us cobble this knowledge together from whatever vendor or ecosystem we were first introduced to and whatever is part of our current company’s observability stack.
While the dev and ops silo continues to crumble, many organizations still relegate monitoring and observability to the purview of ops, infra, and SRE teams. This is a mistake – achieving a highly observable system requires collaboration up and down the stack.
I, a former op, would like to extend an invitation to all application developers to join the observability party, and will share foundational concepts to build on.
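The concept list itself isn’t reproduced here, but one foundational idea such talks typically start from – emitting structured, machine-parseable logs rather than free-form text – can be sketched with the Python standard library alone. The field names below are illustrative, not taken from the talk:

```python
import json
import logging

class JSONFormatter(logging.Formatter):
    """Render each log record as one JSON object per line,
    so an observability pipeline can parse it without regexes."""
    def format(self, record: logging.LogRecord) -> str:
        return json.dumps({
            "ts": record.created,        # UNIX timestamp of the event
            "level": record.levelname,
            "logger": record.name,
            "msg": record.getMessage(),
        })

handler = logging.StreamHandler()
handler.setFormatter(JSONFormatter())
log = logging.getLogger("checkout")
log.addHandler(handler)
log.setLevel(logging.INFO)

log.info("order placed")  # emits a single JSON line to stderr
```

Once logs are structured like this, any backend can index and query them by field, which is the first step toward the collaboration up and down the stack the talk calls for.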
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -... (DanBrown980551)
Do you want to learn how to model and simulate an electrical network from scratch in under an hour?
Then welcome to this PowSyBl workshop, hosted by Rte, the French Transmission System Operator (TSO)!
During the webinar, you will discover the PowSyBl ecosystem as well as handle and study an electrical network through an interactive Python notebook.
PowSyBl is an open source project hosted by LF Energy, which offers a comprehensive set of features for electrical grid modelling and simulation. Among other advanced features, PowSyBl provides:
- A fully editable and extendable library for grid component modelling;
- Visualization tools to display your network;
- Grid simulation tools, such as power flows, security analyses (with or without remedial actions) and sensitivity analyses;
The framework is mostly written in Java, with a Python binding so that Python developers can access PowSyBl functionalities as well.
What you will learn during the webinar:
- For beginners: discover PowSyBl's functionalities through a quick general presentation and the notebook, without needing any expert coding skills;
- For advanced developers: master the skills to efficiently apply PowSyBl functionalities to your real-world scenarios.
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do... (UiPathCommunity)
💥 Speed, accuracy, and scaling – discover the superpowers of GenAI in action with UiPath Document Understanding and Communications Mining™:
See how to accelerate model training and optimize model performance with active learning
Learn about the latest enhancements to out-of-the-box document processing – with little to no training required
Get an exclusive demo of the new family of UiPath LLMs – GenAI models specialized for processing different types of documents and messages
This is a hands-on session specifically designed for automation developers and AI enthusiasts seeking to enhance their knowledge in leveraging the latest intelligent document processing capabilities offered by UiPath.
Speakers:
👨🏫 Andras Palfi, Senior Product Manager, UiPath
👩🏫 Lenka Dulovicova, Product Program Manager, UiPath
SAP Sapphire 2024 - ASUG301 building better apps with SAP Fiori (Peter Spielvogel)
Building better applications for business users with SAP Fiori.
• What is SAP Fiori and why it matters to you
• How a better user experience drives measurable business benefits
• How to get started with SAP Fiori today
• How SAP Fiori elements accelerates application development
• How SAP Build Code includes SAP Fiori tools and other generative artificial intelligence capabilities
• How SAP Fiori paves the way for using AI in SAP apps
The New Frontiers of AI in RPA with UiPath Autopilot™ (UiPathCommunity)
In this free online event, organized by the Italian UiPath Community, you can explore the new features of Autopilot, the tool that integrates Artificial Intelligence into the development and use of automations.
📕 Together we will look at some examples of how Autopilot is used across different tools of the UiPath Suite:
Autopilot for Studio Web
Autopilot for Studio
Autopilot for Apps
Clipboard AI
GenAI applied to Document Understanding
👨🏫👨💻 Speakers:
Stefano Negro, UiPath MVPx3, RPA Tech Lead @ BSP Consultant
Flavio Martinelli, UiPath MVP 2023, Technical Account Manager @UiPath
Andrei Tasca, RPA Solutions Team Lead @NTT Data
Epistemic Interaction - tuning interfaces to provide information for AI support (Alan Dix)
Paper presented at SYNERGY workshop at AVI 2024, Genoa, Italy. 3rd June 2024
https://alandix.com/academic/papers/synergy2024-epistemic/
As machine learning integrates deeper into human-computer interactions, the concept of epistemic interaction emerges, aiming to refine these interactions to enhance system adaptability. This approach encourages minor, intentional adjustments in user behaviour to enrich the data available for system learning. This paper introduces epistemic interaction within the context of human-system communication, illustrating how deliberate interaction design can improve system understanding and adaptation. Through concrete examples, we demonstrate the potential of epistemic interaction to significantly advance human-computer interaction by leveraging intuitive human communication strategies to inform system design and functionality, offering a novel pathway for enriching user-system engagements.
Welcome to ViralQR, your best QR code generator (ViralQR)
Welcome to ViralQR, your best QR code generator available on the market!
At ViralQR, we design static and dynamic QR codes. Our mission is to make business operations easier and customer engagement more powerful through the use of QR technology. Be it a small-scale business or a huge enterprise, our easy-to-use platform provides multiple choices that can be tailored according to your company's branding and marketing strategies.
Our Vision
We are here to make the process of creating QR codes easy and smooth, enhancing customer interaction and making business run more fluidly. We strongly believe in the power of QR codes to transform how businesses interact with their customers, and we are committed to making that technology accessible and usable far and wide.
Our Achievements
Ever since its inception, we have successfully served many clients by providing QR codes for marketing, service delivery, and feedback collection across various industries. Our platform has been recognized for its ease of use and powerful features, which have helped businesses create QR codes.
Our Services
At ViralQR, we offer a comprehensive suite of services that caters to your needs:
Static QR Codes: Create free static QR codes. These QR codes are able to store significant information such as URLs, vCards, plain text, emails and SMS, Wi-Fi credentials, and Bitcoin addresses.
Dynamic QR codes: These also have all the advanced features but are subscription-based. They can directly link to PDF files, images, micro-landing pages, social accounts, review forms, business pages, and applications. In addition, they can be branded with CTAs, frames, patterns, colors, and logos to enhance your branding.
Pricing and Packages
Additionally, ViralQR offers a 14-day free trial, an excellent opportunity for new users to get a feel for the platform. From there, one can easily subscribe and experience the full power of dynamic QR codes. The subscription plans are not only meant for business; they are priced flexibly so that virtually every business can afford to benefit from our service.
Why choose us?
ViralQR provides services for marketing, advertising, catering, retail, and more. QR codes can be placed on fliers, packaging, merchandise, and banners, or substitute for cash and cards in a restaurant or coffee shop. By integrating QR codes into your business, you can improve customer engagement and streamline operations.
Comprehensive Analytics
ViralQR subscribers receive detailed analytics and tracking tools that give a clear view of QR code performance. Our analytics dashboard shows aggregate views and unique views, as well as detailed information about each impression, including time, device, browser, and estimated location by city and country.
Thank you for choosing ViralQR; we offer nothing but the best in QR code services to meet your business’s diverse needs!
Elevating Tactical DDD Patterns Through Object Calisthenics (Dorra BARTAGUIZ)
After immersing yourself in the blue book and its red counterpart, attending DDD-focused conferences, and applying tactical patterns, you're left with a crucial question: How do I ensure my design is effective? Tactical patterns within Domain-Driven Design (DDD) serve as guiding principles for creating clear and manageable domain models. However, achieving success with these patterns requires additional guidance. Interestingly, we've observed that a set of constraints initially designed for training purposes remarkably aligns with effective pattern implementation, offering a more ‘mechanical’ approach. Let's explore together how Object Calisthenics can elevate the design of your tactical DDD patterns, offering concrete help for those venturing into DDD for the first time!
PHP Frameworks: I want to break free (IPC Berlin 2024) (Ralf Eggert)
In this presentation, we examine the challenges and limitations of relying too heavily on PHP frameworks in web development. We discuss the history of PHP and its frameworks to understand how this dependence has evolved. The focus will be on providing concrete tips and strategies to reduce reliance on these frameworks, based on real-world examples and practical considerations. The goal is to equip developers with the skills and knowledge to create more flexible and future-proof web applications. We'll explore the importance of maintaining autonomy in a rapidly changing tech landscape and how to make informed decisions in PHP development.
This talk aims to encourage a more independent use of PHP frameworks, moving towards more flexible and future-proof PHP development.
Accelerate your Kubernetes clusters with Varnish Caching (Thijs Feryn)
A presentation about the usage and availability of Varnish on Kubernetes. This talk explores the capabilities of Varnish caching and shows how to use the Varnish Helm chart to deploy it to Kubernetes.
This presentation was delivered at K8SUG Singapore. See https://feryn.eu/presentations/accelerate-your-kubernetes-clusters-with-varnish-caching-k8sug-singapore-28-2024 for more details.
UiPath Test Automation using UiPath Test Suite series, part 4 (DianaGray10)
Welcome to UiPath Test Automation using UiPath Test Suite series, part 4. In this session, we will cover an overview of Test Manager along with the SAP heatmap.
The UiPath Test Manager overview with SAP heatmap webinar offers a concise yet comprehensive exploration of the role of a Test Manager within SAP environments, coupled with the utilization of heatmaps for effective testing strategies.
Participants will gain insights into the responsibilities, challenges, and best practices associated with test management in SAP projects. Additionally, the webinar delves into the significance of heatmaps as a visual aid for identifying testing priorities, areas of risk, and resource allocation within SAP landscapes. Through this session, attendees can expect to enhance their understanding of test management principles while learning practical approaches to optimize testing processes in SAP environments using heatmap visualization techniques.
What will you get from this session?
1. Insights into SAP testing best practices
2. Heatmap utilization for testing
3. Optimization of testing processes
4. Demo
Topics covered:
Execution from the test manager
Orchestrator execution result
Defect reporting
SAP heatmap example with demo
Speaker:
Deepak Rai, Automation Practice Lead, Boundaryless Group and UiPath MVP
UiPath Test Automation using UiPath Test Suite series, part 3 (DianaGray10)
Welcome to UiPath Test Automation using UiPath Test Suite series part 3. In this session, we will cover desktop automation along with UI automation.
Topics covered:
UI automation Introduction,
UI automation Sample
Desktop automation flow
Pradeep Chinnala, Senior Consultant Automation Developer @WonderBotz and UiPath MVP
Deepak Rai, Automation Practice Lead, Boundaryless Group and UiPath MVP
DevOps and Testing slides at DASA Connect (Kari Kakkonen)
Slides by me and Rik Marselis from the DASA Connect conference on 30.5.2024. We discuss what testing is, then what agile testing is, and finally what testing in DevOps looks like. We also ran a lovely workshop with the participants, exploring different ways to think about quality and testing in different parts of the DevOps infinity loop.
Securing your Kubernetes cluster: a step-by-step guide to success! (KatiaHIMEUR1)
Today, after several years of existence, an extremely active community and an ultra-dynamic ecosystem, Kubernetes has established itself as the de facto standard in container orchestration. Thanks to a wide range of managed services, it has never been so easy to set up a ready-to-use Kubernetes cluster.
However, this ease of use means that the subject of security in Kubernetes is often left for later, or even neglected. This exposes companies to significant risks.
In this talk, I'll show you step-by-step how to secure your Kubernetes cluster for greater peace of mind and reliability.
Data-Ed Webinar: A Framework for Implementing NoSQL, Hadoop
1. A Framework for Implementing NoSQL, Hadoop
Date: June 9, 2015
Time: 2:00 PM ET / 11:00 AM PT
Presenters: Peter Aiken, Ph.D. & Josh Bartels
• Big Data could know us better than we know ourselves – Dan Gardner
• We'll see this as the time in history when the world's information was transformed from an inert, passive state and put into a unified system that brings that information alive – Michael Nielsen
• We now have a chance to become the center of our own knowledge universe, one that constantly reconfigures itself to match our needs – Michael S. Malone
• Today a street stall in Mumbai can access more information, maps, statistics, academic papers, price trends, futures markets, and data than a U.S. President could only a few decades ago – Juan Enriquez
• Not everything that can be counted counts, and not everything that counts can be counted – Albert Einstein
• Soon we will salt the oceans, the land, and the sky with uncounted numbers of sensors invisible to the eyes but visible to one another – Esther Dyson
• We've reached a tipping point in history: today more data is being manufactured by machines, servers, and cell phones than by people – Michael E. Driscoll
• Every century, a new technology – steam power, electricity, atomic energy, or microprocessors – has swept away the old world with a vision of a new one. Today, we seem to be entering the era of Big Data – Michael Coren
Copyright 2015 by Data Blueprint
3. Steven MacLauchlan
• 10 years of experience in Application
Development and Data Modeling with a
focus on Healthcare solutions.
• Delivers tailored data management
solutions that provide focus on data’s
business value while enhancing clients’
overall capability to manage data
• Certified Data Management Professional (CDMP)
• Computer Science degree from Virginia Commonwealth
University
• Most recent focus: Understanding emerging
data modeling trends and how these can
best be leveraged for the Enterprise.
4. Get Social With Us!
Live Twitter Feed
Join the conversation!
Follow us:
@datablueprint
@paiken
Ask questions and submit
your comments: #dataed
Like Us on Facebook
www.facebook.com/
datablueprint
Post questions and comments
Find industry news, insightful
content
and event updates.
Join the Group
Data Management &
Business Intelligence
Ask questions, gain insights
and collaborate with fellow
data management
professionals
5. Peter Aiken, Ph.D.
• 30+ years in data management
• Repeated international recognition
• Founder, Data Blueprint (datablueprint.com)
• Associate Professor of IS (vcu.edu)
• DAMA International (dama.org)
• 9 books and dozens of articles
• Experienced w/ 500+ data
management practices
• Multi-year immersions:
– US DoD
– Nokia
– Deutsche Bank
– Wells Fargo
– Walmart
– …
• DAMA International President 2009-2013
• DAMA International Achievement Award 2001 (with Dr. E. F. "Ted" Codd)
• DAMA International Community Award 2005
• Author of Monetizing Data Management: Unlocking the Value in Your Organization's Most Important Asset (with Juanita Billings, foreword by John Bottega) and The Case for the Chief Data Officer: Recasting the C-Suite to Leverage Your Most Valuable Asset (with Michael Gorman)
6. Josh Bartels
• Data management consultant and
leader
– Over 10 years of experience
– Multiple industries (Finance, Defense,
Insurance)
• Certifications
– Certified Data Management
Professional (CDMP)
– Project Manager (PMP)
– Data Vault 2.0 Practitioner (CDVP2)
• Education
– Masters in Business Administration
– Masters in Information Systems
• Current Efforts
– Focus on the creation of and migration to
new data platforms for clients in the
financial and insurance industries.
7. Presented by Peter Aiken, Ph.D., Josh Bartels, Steven MacLauchlan
A Framework for Implementing
NoSQL, Hadoop
Demystifying Big Data 2.0: Developing the Right
Approach for Implementing Big Data Techniques
8. A Framework for Implementing NoSQL, Hadoop
Demystifying Big Data 2.0: Developing the Right Approach for Implementing Big Data Techniques
• Big Data Context
– We are using the wrong vocabulary to discuss this topic
• More Precise Definitions
– Framework
– Non-von Neumann Architectures
– Hadoop/NoSQL
• Big Data
– Historical Perspective
• Big Data Approach
– Crawl, Walk, Run
• Framework Examples
– Social
– Operational BWB
• Takeaways and Q&A
Tweeting now at: #dataed
9. A Framework for Implementing NoSQL, Hadoop
Demystifying Big Data 2.0: Developing the Right Approach for Implementing Big Data Techniques
• Big Data Context
– We are using the wrong vocabulary to discuss this topic
• More Precise Definitions
– Framework
– Non-von Neumann Architectures
– Hadoop/NoSQL
• Big Data
– Historical Perspective
• Big Data Approach
– Crawl, Walk, Run
• Framework Examples
– Social
– Operational BWB
• Takeaways and Q&A
Tweeting now at: #dataed
10. Myth #1: Big Data has a clear definition
Fact:
• The term is used so often
and in many contexts that
its meaning has become
vague and ambiguous
• Industry experts and
scientists often disagree
http://articles.washingtonpost.com/2013-08-16/opinions/41416362_1_big-data-data-crunching-marketing-analytics
11. Big Data (has something to do with Vs – doesn't it?)
• Volume
– Amount of data
• Velocity
– Speed of data in and out
• Variety
– Range of data types and sources
(2001, Doug Laney)
• Variability
– Many options or variable interpretations confound analysis
(2011, ISRC)
• Vitality
– A dynamically changing Big Data environment in which analysis and predictive models must continually be updated as changes occur, to seize opportunities as they arrive
(2011, CIA)
• Virtual
– Scoping the discussion to only include online assets
(2012, Courtney Lambert)
• Value/Veracity
(Stuart Madnick, John Norris Maguire Professor of Information Technology, MIT Sloan School of Management & Professor of Engineering Systems, MIT School of Engineering)
12. Defining Big Data
• Big Data are high-volume, high-velocity, and/or high-variety
information assets that require new forms of processing to
enable enhanced decision making, insight discovery
and process optimization.
– Gartner 2012
• Big data refers to datasets whose size is beyond the ability of
typical database software tools to capture, store, manage, and analyze.
– IBM 2012
• An all-encompassing term for any collection of data sets so large and complex that it
becomes difficult to process using on-hand data management tools or traditional data
processing applications
– Wikipedia 2014
• Shorthand for advancing trends in technology that open the door to a new approach
to understanding the world and making decisions.
– NY Times 2012
• The broad range of new and massive data types that have appeared over the last
decade
– Tom Davenport 2014
• Data of a very large size, typically to the extent that its manipulation and management present significant logistical challenges
– Oxford English Dictionary 2014
• Big data is about putting the "I" back into IT.
– Peter Aiken 2007
13. Big Data Techniques
• New techniques available to impact the productivity (by an order of magnitude) of any analytical insight cycle that complement, enhance, or replace conventional (existing) analysis methods
• Big data techniques are currently characterized by:
– Continuous, instantaneously
available data sources
– Non-von Neumann
Processing (defined later in the presentation)
– Capabilities approaching
or past human comprehension
– Architecturally enhanceable
identity/security capabilities
– Other tradeoff-focused data processing
• So a good question becomes "where in our existing architecture
can we most effectively apply Big Data Techniques?"
14. Big Data technologies, by themselves, are a one-legged stool
Governance is the major means of preventing overreliance on one-legged stools!
15. The Big Data Landscape
Copyright Dave Feinleib, bigdatalandscape.com
18. Myth #2: Everyone should invest in Big Data
Fact:
• Not every company will
benefit from Big Data
• It depends on your size
and your ability
– Local pizza shop vs.
state-wide or national
chain
19. Big Data can create significant financial value across sectors
• Some (not all)
companies can
take advantage
of Big Data to
create value if
they want to
compete
20. A Framework for Implementing NoSQL, Hadoop
Demystifying Big Data 2.0: Developing the Right Approach for Implementing Big Data Techniques
• Big Data Context
– We are using the wrong vocabulary to discuss this topic
• More Precise Definitions
– Framework
– Non-von Neumann Architectures
– Hadoop/NoSQL
• Big Data
– Historical Perspective
• Big Data Approach
– Crawl, Walk, Run
• Framework Examples
– Social
– Operational BWB
• Takeaways and Q&A
Tweeting now at: #dataed
21. Big Data = Big Spending
• Enterprises are spending wildly on Big Data but don’t
know if it’s worth it yet (Business Insider, 2012)
• Big Data Technology Spending Trend:
• 83% increase over the next 3 years (worldwide):
– 2012: $28 billion
– 2013: $34 billion
– 2016: $232 billion
• Caution:
– Don’t fall victim to SOS (Shiny Object
Syndrome)
– A lot of money is being invested but
is it generating the expected return?
– Gartner Hype Cycle suggests results are going to be disappointing
http://www.businessinsider.com/enterprise-big-data-spending-2012-11#ixzz2cdT8shhe
http://www.inc.com/kathleen-kim/big-data-spending-to-increase-for-it-industry.html
http://www.gartner.com/DisplayDocument?id=2195915&ref=clientFriendlyUrl
22. Who wrote this … ?
• In considering any new
subject, there is
frequently a tendency
first to overrate what
we find to be already
interesting or
remarkable, and
secondly - by a sort of
natural reaction - to
undervalue the true
state of the case.
• Augusta Ada King,
Countess of Lovelace - aka
Ada Lovelace, publisher of
the first computing program
23. Gartner Five-phase Hype Cycle
http://www.gartner.com/technology/research/methodologies/hype-cycle.jsp
Peak of Inflated Expectations: Early publicity produces a number of
success stories—often accompanied by scores of failures. Some
companies take action; many do not.
Trough of Disillusionment: Interest wanes as experiments and implementations fail to deliver. Producers of the
technology shake out or fail. Investments continue only if the surviving providers improve their products to the
satisfaction of early adopters.
Technology Trigger: A potential technology breakthrough kicks things off. Early proof-of-concept stories and media interest
trigger significant publicity. Often no usable products exist and commercial viability is unproven.
Slope of Enlightenment: More instances of how the technology can benefit the
enterprise start to crystallize and become more widely understood. Second- and third-
generation products appear from technology providers. More enterprises fund pilots;
conservative companies remain cautious.
Plateau of Productivity: Mainstream adoption starts to
take off. Criteria for assessing provider viability are more
clearly defined. The technology’s broad market
applicability and relevance are clearly paying off.
24. Gartner Hype Cycle
"A focus on big data is not a substitute for the
fundamentals of information management."
25. 2012 Big Data in Gartner’s Hype Cycle
26. 2013 Big Data in Gartner’s Hype Cycle
27. 2014 Big Data in Gartner’s Hype Cycle
28. Big Data Gartner Hype Cycle
29. Myth #3: Big Data is innovative
Fact:
• Big Data techniques are
innovative
• ROI and insights depend
on the size of the business
and the amount of data
used and produced, e.g.
– Local pizza place vs. Papa
John’s
– Retail
30. My Barn must pass a foundation inspection
• Before further construction can proceed
• No IT equivalent in most organizations
31. Frameworks
• A system of ideas
for guiding
analyses
• A means of
organizing project
data
• A decision-making framework for data integration priorities
• A means of
assessing
progress
32. "There’s now a blurring between the storage world and the memory world"
• Faster processors outstripped
not only the hard disk, but main
memory
– Hard disk too slow
– Memory too small
• Flash drives remove both
bottlenecks
– Combined, Apple and Yahoo have spent more than $500 million to date
• Make it look like traditional
storage or more system
memory
– Minimum 10x improvements
– Dragonstone server is 3.2 TB flash
memory (Facebook)
• Bottom line - new capabilities!
33. Non-von Neumann Processing/Efficiencies
• von Neumann
bottleneck
(computer science)
– "An inefficiency inherent in
the design of any von
Neumann machine that
arises from the fact that
most computer time is
spent in moving
information between
storage and the central
processing unit rather than
operating on it"
[http://encyclopedia2.thefreedictionary.com/von+Neumann+bottleneck]
• Michael Stonebraker
– Ingres (Berkeley/MIT)
– Modern database
processing is
approximately 4%
efficient
• Many big data
architectures are
attempts to address
this, but:
– Zero sum game
– Trade characteristics
against each other
• Reliability
• Predictability
– Google/MapReduce/
Bigtable
– Amazon/Dynamo
– Netflix/Chaos Monkey
– Hadoop
– McDipper
• Big data techniques
exploit non-von
Neumann processing
35. One of Data Blueprint's Big Data Clusters
36. Analytics Insight Cycle
• Existing knowledge base
– Things are happening
– Sensemaking techniques address "what" is happening
• Patterns/objects, hypotheses emerge
– What can be observed?
• Operationalizing
– The dots can be repeatedly connected
– "Big Data" contributions are shown in orange
• Margaret Boden's computational creativity
– Exploratory
– Combinational
– Transformational
(Figure: volume, variety, and velocity feed pattern/object emergence and potential/actual insights through an analytical bottleneck; exploitable insight feeds back into the existing knowledge base)
37. Big Data: Two prominent use cases
• A sandwich offers a good analogy for combining big data and existing technologies
• Landing Zone (less expensive)
– Especially useful in cases where data is highly disposable
• Archiving/Offloading (less need for structure)
– "Cold" transactional and analytic data
• Existing data architectural processing forms the contents, sandwiched between and complemented by the landing zone and archival capabilities
Adapted from Nancy Kopp:
http://ibmdatamag.com/2013/08/relishing-the-big-data-burger/
38. What is NoSQL?
• Commonly interpreted as "Not Only SQL"
• Broad class of database management technologies that
provide a mechanism for storage and retrieval of data that
doesn’t follow traditional relational database methodology.
• Motivations
– Simplicity of design
– Horizontal scaling
– Finer control over availability of the data.
• The data structures used by NoSQL databases differ from
those used in relational databases, making some
operations faster in NoSQL
and others faster in relational
databases.
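The schema-flexibility point above can be sketched in a few lines. This is a toy illustration of the document-store idea behind many NoSQL systems, not the API of any particular product; the `put`/`get` helpers and the order documents are invented for the example.

```python
# Minimal document-store sketch: records in the same collection
# need not share a schema, unlike rows in a relational table.
orders = {}  # key -> document (a plain dict standing in for JSON)

def put(key, document):
    """Store a document under a key; no schema is enforced."""
    orders[key] = document

def get(key):
    """Retrieve a document by key -- the fast path NoSQL optimizes for."""
    return orders.get(key)

# Two documents with different shapes coexist in one collection.
put("order:1", {"customer": "ACME", "total": 120.50})
put("order:2", {"customer": "Globex", "items": ["widget", "gadget"], "rush": True})

print(get("order:2")["items"])  # ['widget', 'gadget']
```

The trade-off the slide mentions is visible even here: key lookups are trivial, but a query across all documents (e.g., every customer with a rush order) would require scanning, which is where relational systems often win.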
39. What is Hadoop?
• A data storage and processing system that runs on clusters of commodity servers.
• Able to store any kind of data in its native format.
• Perform a wide variety of analyses and transformations.
• Store terabytes, and even petabytes, of data
inexpensively.
• Handles hardware and system failures automatically,
without losing data or interrupting data analyses.
• Critical components of Hadoop:
– HDFS: the Hadoop Distributed File System is the storage system for a Hadoop cluster, responsible for the distribution of data across the servers.
– MapReduce: the inner workings of Hadoop that allow for distributed and parallel analytical job execution.
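The HDFS/MapReduce split above can be illustrated with a toy word count, the canonical MapReduce example. Hadoop would run the map and reduce phases across a cluster; this single-process sketch only shows the shape of the programming model, and the sample lines are invented.

```python
from collections import defaultdict

def map_phase(lines):
    # Map: emit a (word, 1) pair for every word -- runs in parallel in Hadoop.
    for line in lines:
        for word in line.split():
            yield (word.lower(), 1)

def shuffle(pairs):
    # Shuffle: group all values by key, as the framework does between phases.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    # Reduce: aggregate each key's group -- also parallel across keys in Hadoop.
    return {key: sum(values) for key, values in groups.items()}

lines = ["big data big insights", "big clusters"]
counts = reduce_phase(shuffle(map_phase(lines)))
print(counts["big"])  # 3
```

Because each map call sees only its own input split and each reduce call sees only one key's group, the framework can scatter the work across as many servers as the data requires, which is exactly the incremental-scaling argument made below.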
40. Why NoSQL? Why Hadoop?
• Large number of users (read: the internet)
• Rapid app development and deployment
• Large number of mission critical writes (sensors/etc)
• Small, continuous reads and writes, especially where
“Consistency” is less important (social networks)
• Hadoop solves the hard scaling problems caused by large
amounts of complex data.
• As the amount of data in a cluster grows,
new servers can be added to a Hadoop
cluster incrementally and inexpensively
to store and analyze it.
41. Hadoop Use Cases in the Real World
• Risk Modeling
• Customer Churn Analysis
• Recommendation Engine
• Ad Targeting
• Point of Sale Transaction Analysis
• Social Sentiment on Social Media
• Analyzing network data to predict failure
• Threat analysis
• Trade Surveillance
43. Some Big Data Limitations
• Data analysis struggles with the social
– Your brain is excellent at social cognition - people can
• Mirror each other’s emotional states
• Detect uncooperative behavior
• Assign value to things through emotion
– Data analysis measures the quantity of social
interactions but not the quality
• Map interactions with co-workers you see during work days
• Can't capture devotion to childhood friends seen annually
– When making (personal) decisions about social
relationships, it’s foolish to swap the amazing machine
in your skull for the crude machine on your desk
• Data struggles with context
– Decisions are embedded in sequences and contexts
– Brains think in stories - weaving together multiple
causes and multiple contexts
– Data analysis is pretty bad at
• Narratives / Emergent thinking / Explaining
• Data creates bigger haystacks
– More data leads to more statistically significant
correlations
– Most are spurious and deceive us
– Falsity grows exponentially with the amount of data we collect
• Big data has trouble with big problems
– For example: the economic stimulus debate
– No one has been persuaded by data to switch sides
• Data favors memes over masterpieces
– Detect when large numbers of people take an instant
liking to some cultural product
– Products are hated initially because they are unfamiliar
• Data obscures values
– Data is never raw; it’s always structured according to
somebody’s predispositions and values
44. Myth #4: Big Data is just another IT project
Copyright 2013 by Data Blueprint
Fact:
• Big Data is not your typical IT
project
– Does not answer typical IT questions
– Trend analysis, agile, actionable, etc.
– Fundamentally different approach
• Big Data Projects are exploratory
• Big Data enables new capabilities
• Big Data can be a disruptive
technology
• It might sound simple but that
doesn’t mean it’s easy
• Beware of SOS (Shiny Object
Syndrome)
48. ("Whereas of the Plague")
Plague Peak
When is it happening?
49. Black Rats or Rattus rattus
Why is it happening?
50. What will happen?
51. Formalizing Data Management
• Defend the Realm:
The authorized history of MI5
by Christopher Andrew
• World War I
• 1914
• At war with much
of Europe
• 14,000,000 Germans living
in the United Kingdom
• How to efficiently and
effectively manage
information on that many
individuals?
• The Security Service is responsible for "protecting
the UK against threats to national security from
espionage, terrorism and sabotage, from the activities
of agents of foreign powers, and from actions intended
to overthrow or undermine parliamentary democracy by
political, industrial or violent means."
52. “As a final thought, how about a machine that
would send, via closed-circuit television, visual and
oral information needed immediately at high-level
conferences or briefings? Let’s say that a group of
senior officers are contemplating a covert action
program for Afghanistan. Things go well until
someone asks “Well, just how many schools are
there in the country, and what is the literacy rate?”
No one in the room knows. (Remember, this is an
imaginary situation). So the junior member present
dials a code number into a device at one end of the
table. Thirty seconds later, on the screen overhead,
a teletype printer begins to hammer out the
required data. Before the meeting is over, the group
has been given, through the same method, the
names of countries that have airlines into
Afghanistan, a biographical profile of the Soviet
ambassador there, and the Pakistani order of battle
along the Afghanistan frontier. Neat, no?”
• Predicted use of
not just
computing in the
intelligence
community
• Also forecast
predictive
analytics
• Accompanying
privacy
challenges
53. A Framework for Implementing NoSQL, Hadoop
Demystifying Big Data 2.0: Developing the Right Approach for Implementing Big Data Techniques
• Big Data Context
– We are using the wrong vocabulary to discuss this topic
• More Precise Definitions
– Framework
– Non-von Neumann Architectures
– Hadoop/NoSQL
• Big Data
– Historical Perspective
• Big Data Approach
– Crawl, Walk, Run
• Framework Examples
– Social
– Operational BWB
• Takeaways and Q&A
Tweeting now at: #dataed
54. http://articles.washingtonpost.com/2013-08-16/opinions/41416362_1_big-data-data-crunching-marketing-analytics
Myth #6: Big Data provides all the Answers
Fact:
• Big Data does not mean the end of
scientific theory
• Be careful or you’ll end up with
spurious correlations
– Don’t just go fishing for correlations and
hope they will explain the world
• To get to the WHY of things, you
need ideas, hypotheses and theories
• Having more data does not
substitute for thinking hard,
recognizing anomalies and exploring
deep truths
• You need the right approach
56. Identify business opportunity
• How can data be leveraged in exploring?
– External marketplace: analyze opportunities and threats
– Internal efficiencies: analyze strengths and weaknesses
57. Example: 2012 Olympic Summer Games
1. Volume: 845 million Facebook users averaging 15+ TB of data/day
2. Velocity: 60 GB of data per second
3. Variety: 8.5 billion devices connected
4. Variability: Sponsor data, athlete data, etc.
5. Vitality: Data Art project "Emoto"
6. Virtual: Social media
58. Based on my 6 V analysis, do I need a Big Data solution, or does my current BI solution address my business opportunity?
– Do the 6 Vs indicate general Big Data characteristics?
– What are the limitations of my current BI environment? (Technology constraint)
– What are my budgetary restrictions? (Financial constraint)
– What is my current Big Data knowledge base? (Knowledge constraint)
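One hedged way to make the 6 V screening concrete is a crude scoring pass over the six dimensions. The 0-3 ratings, the equal weighting, and the cut-off below are invented purely for illustration; they are no substitute for working through the technology, financial, and knowledge constraint questions above.

```python
# Crude 6-V screening sketch: rate each V from 0 (low) to 3 (high) and
# flag Big Data candidacy when the total clears an (arbitrary) threshold.
DIMENSIONS = ["volume", "velocity", "variety", "variability", "vitality", "virtual"]

def needs_big_data(ratings, threshold=9):
    """ratings: dict mapping a V dimension to a 0-3 score; unrated Vs count 0."""
    score = sum(ratings.get(d, 0) for d in DIMENSIONS)
    return score >= threshold  # threshold of 9 is an illustrative assumption

print(needs_big_data({"volume": 3, "velocity": 3, "variety": 2, "variability": 1}))  # True
print(needs_big_data({"volume": 1, "velocity": 1}))  # False
```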
59. • MUST have both Foundational and Technical practice expertise
61. • Data Strategy
• Data Governance
• Data Architecture
• Data Education
62. • Data Quality
• Data Integration
• Data Platforms
• BI/Analytics
63. • Needs to be actionable
• Generally well understood by business
• Document what has been learned
64. • Perfect results are not necessary
• Reiterate and refine
• Iterative process to reach decision point
• Use as feedback for next exploration
66. Myth #7: You need Big Data for Insights
Fact:
• Distinction between Big Data and
doing analytics
– Big Data is defined by the technology stack
that you use
– Big Data is used for predictive and
prescriptive analytics
• Use existing data for reporting, figure
out bottlenecks and optimize current
business model
• Understand how your data is structured, architected, and stored
67. A Framework for Implementing NoSQL, Hadoop
Demystifying Big Data 2.0: Developing the Right Approach for Implementing Big Data Techniques
• Big Data Context
– We are using the wrong vocabulary to discuss this topic
• More Precise Definitions
– Framework
– Non-von Neumann Architectures
– Hadoop/NoSQL
• Big Data
– Historical Perspective
• Big Data Approach
– Crawl, Walk, Run
• Framework Examples
– Social
– Operational BWB
• Takeaways and Q&A
Tweeting now at: #dataed
68. Social Sentiment Analysis
• One of the burgeoning areas
for use of Big Data / Hadoop
platforms.
• Allows for the landing of multiple sources of unstructured data (Twitter, Facebook, LinkedIn, etc.)
• Data that can then be analyzed with algorithms looking for keywords that determine positive/negative feedback
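A deliberately crude sketch of the keyword approach just described. Production sentiment pipelines use trained models over far larger vocabularies; the word lists and the scoring rule here are illustrative assumptions only.

```python
# Toy keyword-based sentiment scorer: count positive vs. negative
# keywords in a post and classify by the difference.
POSITIVE = {"love", "great", "excellent", "happy"}   # illustrative word lists,
NEGATIVE = {"hate", "terrible", "broken", "slow"}    # not a real lexicon

def sentiment(text):
    words = set(text.lower().split())
    score = len(words & POSITIVE) - len(words & NEGATIVE)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

print(sentiment("I love this great product"))   # positive
print(sentiment("terrible and slow service"))   # negative
```

Even this toy shows why the landing-zone pattern fits here: the raw posts can be kept in their native form and re-scored whenever the keyword lists (or a better model) change.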
69. Operational Use
• Utilize real-time pricing data from multiple sources to dynamically update the pricing for books in the Amazon Marketplace
• Ingest data from multiple sources, looking for real-time changes in price
• Apply a predictive model to determine the best price point and set the price of the books on the marketplace
• Increased conversion rate, but created a race-to-the-bottom situation if not monitored
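The repricing loop described above can be sketched as follows. The undercut margin and the floor price that guards against the race to the bottom are invented numbers for illustration, not Amazon's or any seller's actual rule, and a trivial min-based rule stands in for the predictive model.

```python
# Sketch of a dynamic repricing rule: price just below the cheapest
# competitor, but never below a floor that protects margin.
FLOOR = 4.00  # illustrative: never price below cost + minimum margin

def reprice(competitor_prices, undercut=0.01):
    """Return a new listing price from currently observed competitor prices."""
    cheapest = min(competitor_prices)
    return max(round(cheapest - undercut, 2), FLOOR)

print(reprice([12.99, 11.50, 13.25]))  # 11.49
print(reprice([3.10, 3.50]))           # 4.0 -- the floor stops the race to the bottom
```

The monitoring caveat in the slide maps directly onto the floor: two unconstrained undercutting bots will chase each other downward until one of them hits a limit like this.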
70. Healthcare Example: Patient Data
• Clinical data:
– Diagnosis/prognosis/treatment
– Genetic data
• Patient demographic data
• Insurance data:
– Insurance provider
– Claims data
• Prescriptions & pharmacy information
• Physical fitness data
– Activity tracking through
smartphone apps & social media
• Health history
• Medical research data
71. http://www.forbes.com/sites/xerox/2013/09/27/big-data-boosts-customer-loyalty-no-really/
Retail Example: Loyalty Programs & Big Data
• Companies need to understand current wants and needs AND
predict future tendencies
• Customer -> Repeat Customer -> Brand Advocate
• Customer loyalty programs & retention strategies
– Track what is being purchased and how often
– Coupons based on purchasing history
– Targeted communications, campaigns & special offers
– Social media for additional interactions
– Personalize consumer interactions
• Customer purchase history influences
product placements
– Retailers rapidly respond to consumer demands
– Product placements, planogram optimization, etc.
72. References
• The Human Face of Big Data, Rick Smolan & Jennifer Erwitt, First Edition (November 20, 2012)
• McKinsey: Big Data: The next frontier for innovation, competition and productivity
(http://www.mckinsey.com/insights/business_technology/
big_data_the_next_frontier_for_innovation?p=1)
• The Washington Post: Five Myths about Big Data (http://articles.washingtonpost.com/
2013-08-16/opinions/41416362_1_big-data-data-crunching-marketing-analytics)
• Gartner: Gartner’s 2013 Hype Cycle for Emerging Technologies Maps Out Evolving
Relationship Between Humans and Machines (http://www.gartner.com/newsroom/id/
2575515)
• The New York Times | Opinion Pages: What Data Can’t Do (http://www.nytimes.com/
2013/02/19/opinion/brooks-what-data-cant-do.html?_r=1&)
• CIO.com: Five Steps for How to Better Manage Your Data (http://www.cio.com.au/article/
429681/five_steps_how_better_manage_your_data/)
• Business Insider: Enterprises Aren't Spending Wildly on 'Big Data' But Don't Know If It's Worth It Yet (http://www.businessinsider.com/enterprise-big-data-spending-2012-11#ixzz2cdT8shhe)
• Inc.com: Big Data, Big Money: IT Industry to Increase Spending (http://www.inc.com/
kathleen-kim/big-data-spending-to-increase-for-it-industry.html)
• Forbes: Big Data Boosts Customer Loyalty. No, Really. (http://www.forbes.com/sites/xerox/
2013/09/27/big-data-boosts-customer-loyalty-no-really/)
73. Data Management Maturity
July 14, 2015 @ 2:00 PM ET/11:00 AM PT
Trends in Data Modeling
August 11, 2015 @ 2:00 PM ET/11:00 AM PT
Sign up here:
www.datablueprint.com/webinar-schedule
or www.dataversity.net
Upcoming Events
74. 10124 W. Broad Street, Suite C
Glen Allen, Virginia 23060
804.521.4056
75. Potential Tradeoffs:
• CAP theorem: consistency, availability, and partition (fault) tolerance
• Small datasets can be both consistent & available
• ACID: Atomicity, Consistency, Isolation, Durability
• BASE: Basic Availability, Soft-state, Eventual consistency
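The BASE side of the trade-off can be illustrated with two toy replicas that accept writes independently and converge when they sync. The last-write-wins merge by version number is one simple (and lossy) reconciliation strategy, shown only to make "eventual consistency" concrete; real systems such as Dynamo-style stores use richer schemes.

```python
# Toy eventual consistency: replicas accept writes independently
# (staying available during a partition) and reconcile afterwards.
replica_a, replica_b = {}, {}

def write(replica, key, value, version):
    """Each replica accepts the write locally; no cross-replica coordination."""
    replica[key] = (version, value)

def sync(r1, r2):
    """Merge replicas, keeping the higher-versioned value (last-write-wins)."""
    for key in set(r1) | set(r2):
        best = max(r1.get(key, (0, None)), r2.get(key, (0, None)))
        r1[key] = r2[key] = best

write(replica_a, "cart", "book", version=1)
write(replica_b, "cart", "book+pen", version=2)  # concurrent write elsewhere
sync(replica_a, replica_b)
print(replica_a["cart"][1])  # book+pen -- the replicas have converged
```

Between the two writes and the sync, readers of the two replicas see different carts: that temporary divergence is exactly the consistency being traded away for availability under partition.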
77. http://www.mckinsey.com/insights/business_technology/big_data_the_next_frontier_for_innovation?p=1
5 Ways in which Data creates Business Value
1. Information is transparent
and usable at much higher
frequency
2. Expose variability and
boost performance
3. Narrow segmentation of
customers and more
precisely tailored products
or services
4. Sophisticated analytics and
improved decision-making
5. Improved development of
the next generation of
products and services
78. • We are at an inflection point: The
sheer volume of data generated,
stored, and mined for insights has
become economically relevant to
businesses, government, and
consumers (McKinsey)
• We believe the same important
principles still apply:
– What problem are you trying to solve for
your business? Your solution needs to fit
your problem
– Doing data for (big) data’s sake is not going
to solve any problems
– Risk of spending a lot of money on chasing
Big Data that will realize little to no returns -
especially at this hype cycle stage
http://www.mckinsey.com/insights/business_technology/big_data_the_next_frontier_for_innovation?p=1
Why the Big Deal about Big Data?
80. Takeaways: Big Data Context
• Technology continues to evolve at
increasing speeds
• Big Data is here
– We have the potential to
create insights
• Spend wisely & strategically:
– Big Data is not going to solve
all your problems.
• Fact:
– Big Data is not for everyone
• Fact:
– Lack of a clear definition
• Hype Cycle:
– Current: Peak of Inflated Expectations
– Soon: Trough of Disillusionment
81. Takeaways: Big Data Challenges Today
• Fact: Big Data techniques are innovative but
“Big Data” is not
• Challenges are both foundational and
technical, today as well as in 1600s
• Technology continues to advance rapidly (4
Vs)
• Challenges associated with Big Data are not
new:
– Well-known foundational data management issues
– Need to align data and business with rapidly
changing environment
– Duplicity, accessibility, availability
– Foundational business issues
82. Takeaways (Approach): Crawl, Walk, Run
• Crawl:
– Identify business opportunity and
determine whether you truly need
a Big Data solution
• Walk:
– Apply a combination of
foundational and technical data
management practices.
Document your insights and
make sure they are actionable
• Run:
– Recycle and explore. Staying
agile allows you to be exploratory.
83. Takeaways (Design Principles): Foundational & Technical
• Foundational data management
principles still apply
• Beware of SOS (Shiny Object
Syndrome)
• You must have a data strategy before
you can have a Big Data strategy
• Fact: You don’t need Big Data to gain
insights
• Big Data integration requirements evolve
from your strategy
• Fact: Bigger Data is not always better
84. Takeaways: In Summary
• Big data techniques are innovative
but “Big Data” is not
• Big Data characteristics: 6 Vs
– Volume, Velocity, Variety, Variability, Vitality,
Virtual
• Approach: Crawl-Walk-Run
• Big Data challenges require solutions
that are based on foundational and
technical data management practices
• Beware of SOS (Shiny Object
Syndrome):
– Spend wisely and strategically
– Big Data is not going to solve all your
problems
85. Foundational Practice: Data Strategy
• Your data strategy must
align to your organizational
business strategy and
operating model
• As the marketplace becomes
more data-driven, a data-focused
business strategy is an
imperative
• You must have a data strategy
before you can have a Big
Data strategy
86. Data Strategy Considerations
• What are the questions that
you cannot answer today?
• Is there a direct reliance on
understanding customer
behavior to drive revenue?
• Do you have information
overload and are you trying to
find the signal in the noise?
• Which is more important:
– Establishing value from current
data assets/data reporting?
– Exploring Big Data
opportunities?
87. Foundational Practice: Data Architecture
• Common vocabulary expressing
integrated requirements ensuring
that data assets are stored,
arranged, managed, and used in
systems in support of
organizational strategy [Aiken
2010]
• Most organizations have data
assets that do not support
their strategies
• Big question:
– How can organizations more
effectively use their information
architectures to support
strategy implementation?
88. Data Architecture Considerations
• Does your current architecture for
BI and analytics support Big Data?
• Are you getting enough value out of
your current architecture?
• Can you easily integrate and share
information across your
organization?
• Do you struggle to extract the value
from your data because it is too
cumbersome to navigate and
access?
• Are you confident your data is
organized to meet the needs of
your business?
89. Technical Practice: Data Integration
• A data-centric
organization requires
unified data
• Integrating data across
organizational silos
creates new insights
• It is also the biggest
challenge
• Big Data techniques can
be used to complement
existing integration efforts
90. Data Integration Considerations
• The complexity of your data
integration challenge depends on
the questions you’re trying to
answer
• Integration requirements for Big
Data are dependent on the types of
questions you’re asking:
– Integration here may be fuzzy rather
than discrete
– Integration is domain-based (based on
time, customer concept, geographic
distribution)
• Those requirements should evolve
from your strategy
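The "fuzzy rather than discrete" point above can be sketched in code. The following is an illustrative Python example (not from the presentation) of a fuzzy, similarity-based join across two hypothetical organizational silos, using only the standard library's `difflib`; all record names and the 0.6 threshold are invented for illustration.

```python
from difflib import SequenceMatcher

# Hypothetical customer records from two organizational silos;
# names are illustrative only.
crm_customers = ["Acme Corporation", "Globex Inc", "Initech LLC"]
billing_customers = ["ACME Corp.", "Globex Incorporated", "Umbrella Co"]

def fuzzy_match(name, candidates, threshold=0.6):
    """Return the best candidate whose similarity clears the threshold,
    or None -- a 'fuzzy' join, unlike a discrete key-based join."""
    best, best_score = None, 0.0
    for cand in candidates:
        score = SequenceMatcher(None, name.lower(), cand.lower()).ratio()
        if score > best_score:
            best, best_score = cand, score
    return best if best_score >= threshold else None

for name in crm_customers:
    print(name, "->", fuzzy_match(name, billing_customers))
```

Note how "Initech LLC" yields no match: a fuzzy join accepts that some records simply will not integrate, which is why such integration should be driven by the questions you are asking rather than by a demand for a complete discrete mapping.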
91. Technical Practice: Data Quality
• Quality is driven by fit for purpose
considerations
• Big Data quality is different – it follows
the BASE model rather than ACID:
– Basically Available
– Soft state
– Eventual consistency
• Directional accuracy is the goal
• Focus on your most important data
assets and ensure your solutions
address the root cause of any quality
issues – so that your data is correct
when it is first created
• Experience has shown that
organizations can never get in front of
their data quality issues if they only use
the ‘find-and-fix’ approach
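The BASE semantics mentioned above can be illustrated with a toy sketch (this is not a real distributed store, just an invented two-replica model): each replica accepts writes independently ("basically available"), the replicas may disagree in the interim ("soft state"), and they converge only when a sync runs ("eventual consistency") — which is why directional accuracy, not exactness, is the goal.

```python
# Toy illustration of BASE semantics (hypothetical, simplified):
# two replicas accept writes independently and converge on sync.

class Replica:
    def __init__(self):
        self.data = {}      # key -> (value, version)
        self.version = 0

    def write(self, key, value):
        self.version += 1
        self.data[key] = (value, self.version)

    def read(self, key):
        value, _ = self.data.get(key, (None, 0))
        return value

def sync(a, b):
    """Anti-entropy pass: naive last-writer-wins merge of both replicas."""
    for key in set(a.data) | set(b.data):
        va = a.data.get(key, (None, 0))
        vb = b.data.get(key, (None, 0))
        winner = va if va[1] >= vb[1] else vb
        a.data[key] = b.data[key] = winner

r1, r2 = Replica(), Replica()
r1.write("customer:42", "gold tier")
print(r2.read("customer:42"))   # prints None: stale read before sync
sync(r1, r2)
print(r2.read("customer:42"))   # prints gold tier: consistent after sync
```

A read between the write and the sync returns stale data; a quality program built on find-and-fix exactness would flag that as an error, whereas a BASE-oriented program accepts it as the normal operating state.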
92. Data Quality Considerations
• Big Data analysis aims to be
predictive
• What are the questions you
are trying to answer?
– What level of accuracy are you
looking for?
– What confidence levels?
– Example: Do I need to know
exactly what the customer is
going to buy, or just the range of
products he or she will choose
from?
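The exact-purchase-versus-range question above can be made concrete with a small, invented example: using hypothetical purchase counts, predicting the single most likely product carries lower empirical confidence than predicting the most likely product category.

```python
# Hypothetical purchase history (invented data): each entry is
# (product, category). Compare confidence in predicting the exact
# product vs. merely its category.
from collections import Counter

purchases = [
    ("running shoes", "footwear"), ("trail shoes", "footwear"),
    ("sandals", "footwear"), ("water bottle", "accessories"),
    ("running shoes", "footwear"),
]

products = Counter(p for p, _ in purchases)
categories = Counter(c for _, c in purchases)

item, n_item = products.most_common(1)[0]
cat, n_cat = categories.most_common(1)[0]

print(f"P(next = {item!r}) ~ {n_item / len(purchases):.0%}")   # 40%
print(f"P(next in {cat!r}) ~ {n_cat / len(purchases):.0%}")    # 80%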
93. Technical Practice: Data Platforms
• Do you want to measure
critical operational process
performance?
• No one data platform can
answer all your questions. This
is commonly misunderstood
and often leads to very
expensive, bloated and
ineffective data platforms.
• Understanding the questions
that need to be asked and how
to build the right data platform
or how to optimize an existing
one
Copyright 2013 by Data Blueprint
93
94. Data Platforms Considerations
• Commonalities between most big data
stacks with file storage, columnar store,
querying engine, etc.
• Big data stack generally looks the same
until you get into appliances
– Algorithms are built into appliance
themselves, e.g. Netezza, Teradata,
etc.)
• Ask these questions:
– Do you want insights on your
customer’s behavior?
– Do you need real-time customer
transactional information?
– Do you need historical data or just
access to the latest transactions?
– Where do you go to find the single
version of the truth about your
customers?
Copyright 2013 by Data Blueprint
94