Organizations run their day-to-day businesses on transactional applications and databases. On the other hand, they glean insights and make critical decisions using analytical databases and business intelligence tools.
Transactional workloads are assigned to database engines designed and tuned for high transactional throughput. Meanwhile, the big data generated by all those transactions requires analytics platforms that can load, store, and analyze large volumes of data at high speed, providing timely insights to the business.
Thus, conventional information architectures require two different database architectures and platforms: online transaction processing (OLTP) platforms to handle transactional workloads and online analytical processing (OLAP) engines to perform analytics and reporting.
Today, a particular focus and interest of operational analytics is streaming data ingest and analysis in real time. Some refer to operational analytics as hybrid transactional/analytical processing (HTAP), translytical, or hybrid operational analytic processing (HOAP). We'll address whether this model is a way to create efficiencies in our environments.
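As a quick illustration of the translytical idea, the sketch below uses an in-memory SQLite database purely as a stand-in for an HTAP engine: one engine takes the transactional write and immediately serves an analytical aggregate over the same live table, with no ETL hop between separate OLTP and OLAP platforms. The table and column names are hypothetical.

```python
import sqlite3

# One engine (SQLite standing in for a translytical store) handles both
# the transactional write path and the analytical read path.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE orders (
        order_id   INTEGER PRIMARY KEY,
        customer   TEXT,
        amount     REAL,
        created_at TEXT
    )
""")

# OLTP path: a short, ACID transaction as the order arrives.
with conn:
    conn.execute(
        "INSERT INTO orders (customer, amount, created_at) VALUES (?, ?, ?)",
        ("c-1001", 42.50, "2024-01-15T10:30:00"),
    )

# OLAP path: an aggregate over the live data, with no nightly batch.
top_customers = conn.execute("""
    SELECT customer, SUM(amount) AS revenue
    FROM orders
    GROUP BY customer
    ORDER BY revenue DESC
    LIMIT 10
""").fetchall()
print(top_customers)
```

Real translytical engines make this pattern work at scale by pairing a fast write path with a scan-friendly read path, a design the later slides return to.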
1. Assessing New Databases: Translytical Use Cases
Presented by: William McKnight
"#1 Global Influencer in Big Data" Thinkers360
President, McKnight Consulting Group
A 2-time Inc. 5000 Company
@williammcknight
www.mcknightcg.com
(214) 514-1444
Second Thursday of Every Month, at 2:00 ET
With William McKnight
2. McKnight Consulting Group Vendor Offerings
Enterprises | Analysts | Vendors
• Keynote/Webinar Presentations – Online & in-person. Great turnouts.
• White Paper Development – Use our unique voice to talk about a theme important to you and tie your product to it.
• Benchmark Services – Performance, ease of use, functionality, TCO. We've done 40+ benchmarks: TPCs and others; databases (analytical, operational) and related (lake, integration, APIs, etc.). Impactful.
• Day-in-the-Life Report – We go from zero to production and document the steps, creating comfort for the buyer to take the next step.
• Teardown – Comparing and grading vs. the competition across 50 +/- factors. Ideal for building product roadmaps.
• Competitive Education – We teach vendor competitive teams about the competition with half-day to one-day hands-on workshops per competitor.
• Technical Specification Development – e.g., deployment guide, best practices guide, reference architecture.
• Test Drives for demonstrations/booths – We build real-world, relatable test drives/demos you can use to show off features or performance.
• *NEW* McKnight Enterprise Contribution Ranking Report – We take an industry and assess market leaders against critical capabilities of the market. Industries available for prioritizing research.
• Total Addressable Market Report
3. OLTP vs OLAP
OLTP
• Process business interactions as they occur
• Support limited query
• Focus on IUD (insert/update/delete) of individual transactions
• Low latency and high throughput needed
• ACID compliance
• Normalized data model
OLAP
• Analytics/complex analysis
• Offload of processing from OLTP
• Dimensional data model
• Light data modification from source
• Complex queries, frequently long-running
• Large data accumulation
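To ground the modeling contrast in the two columns above, here is a small sketch (SQLite again, purely for illustration; every schema name is hypothetical): a normalized model tuned for individual transactions alongside a dimensional star model tuned for scans and aggregates.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# OLTP: normalized model, optimized for insert/update/delete of
# individual transactions.
conn.executescript("""
    CREATE TABLE customer (
        customer_id INTEGER PRIMARY KEY,
        name        TEXT NOT NULL
    );
    CREATE TABLE sale (
        sale_id     INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customer(customer_id),
        product_id  INTEGER NOT NULL,
        quantity    INTEGER NOT NULL,
        sold_at     TEXT NOT NULL
    );
""")

# OLAP: dimensional (star) model, optimized for wide scans and aggregates.
conn.executescript("""
    CREATE TABLE dim_date    (date_key INTEGER PRIMARY KEY, year INT, month INT, day INT);
    CREATE TABLE dim_product (product_key INTEGER PRIMARY KEY, category TEXT, name TEXT);
    CREATE TABLE fact_sales (
        date_key    INTEGER REFERENCES dim_date(date_key),
        product_key INTEGER REFERENCES dim_product(product_key),
        quantity    INTEGER,
        revenue     REAL
    );
""")

# Typical OLTP access: a point write inside a short transaction.
with conn:
    conn.execute("INSERT INTO customer (name) VALUES (?)", ("Ada",))

# Typical OLAP access: a long-running aggregate across the fact table.
conn.execute("""
    SELECT d.year, p.category, SUM(f.revenue)
    FROM fact_sales f
    JOIN dim_date d    ON d.date_key = f.date_key
    JOIN dim_product p ON p.product_key = f.product_key
    GROUP BY d.year, p.category
""")
```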
4. Capability Requirements
• Analytics on live data, recent data, and historical data
• Real-time analytics calculated from across data domains
• Pre-calculated data
• Live analytics usable operationally
• A seamless platform
• Operational SLAs
5. Analytics Defined
• Analytics is the process of utilizing data to enhance business processes.
• Analytics goes deeper than simple knowledge; it has depth.
• There are analytic projects and…
• There are analytics added to projects.
7. Benefits of Real-Time Analytics
• Speed to Insight
• Customer Experience
• Operational Excellence
• Deeper Understanding
8. Translytical Use Cases
• Portfolio Management
• Wealth Management
• Fraud Analytics
• Risk Management
• Algorithmic Trading
• Crypto Exchange
• SC/IoT Analytics
• Real-Time Customer Experience
• Network Telemetry
• Geolocation Analysis
• Field Support Optimization
• Ad Optimization & Ad Serving
• Streaming Media Quality Analytics
• Real-Time Recommendations
• Video Games
• Telemetry Processing
• IoT & Smart Meter Analytics
• Predictive Maintenance
• Geospatial Tracking
9. Next Best Offer/Touch
• Need to incorporate not only analytics through last night, but also today, all morning, the last hour, and the last second into the screen render
• Need to incorporate not just the user's data but all users' data
– Need to correlate the user to other users instantly
• Only AI can operate at the needed scale
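A toy sketch of the point this slide makes, with plain dictionaries standing in for the feature store (all names are hypothetical): batch features computed "through last night" are blended with events from the last few seconds at render time, and the live signal wins.

```python
from collections import defaultdict

# Hypothetical stores: batch features computed overnight, plus events
# still arriving in the current session.
batch_features = {"user-42": {"fav_category": "electronics", "ltv": 812.0}}
session_events = defaultdict(list)

def record_event(user_id: str, event: dict) -> None:
    """Capture an in-session event (last hour / last second)."""
    session_events[user_id].append(event)

def next_best_offer(user_id: str) -> str:
    """Blend historical and live signals at screen-render time."""
    profile = batch_features.get(user_id, {})
    recent = session_events.get(user_id, [])
    # The live signal wins: a category browsed seconds ago outranks
    # last night's batch profile.
    for event in reversed(recent):
        if event.get("type") == "view":
            return f"offer:{event['category']}"
    return f"offer:{profile.get('fav_category', 'default')}"

record_event("user-42", {"type": "view", "category": "cameras"})
print(next_best_offer("user-42"))  # offer:cameras, not offer:electronics
```

In a real system the dictionaries would be tables in a translytical store, so the same engine that records the click also answers the render-time query.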
10. Financial Market
• Billions of API Requests Daily
• Need 5-10ms Average Query Response
• Data to include:
– Real-time and historical stock prices
– Cryptocurrencies, Forex, Commodities, Currencies, Premium Data
• Front-Office Traders Need Real-Time Analysis
11. Healthcare
• Genomic medicine
• Virtual visits
• Tele-health and AI Triage
• AI Diagnostics
• Robotics Automating Lab Work
12. Retailer
• Better, personalized product recommendations for consumers based on session data, historical order data, and trending products.
• Continuous and automatic retraining of the recommendation (ML) engine.
• Near real-time data integration from the retail application to the analytical platform.
• Identify potential compliance issues with customer data, classify and tag sensitive data with labels, and track how sensitive data is used from the data source to the reports.
• Integrate other systems, such as the SAP ERP and the email and instant messaging platforms, with the analytical solution to get a full 360° view of business operations and to improve customer satisfaction.
• Save on operational costs while offering the best customer experience, even during peak seasons such as Black Friday, Thanksgiving, Christmas, and Mother's Day.
13. Metaverse
• VR chairs, vests, scent generators, and better directional sound systems
• Avatars as fully virtual agents
• Surgical implants to the metaverse
14. Transportation
• Driverless and autonomous vehicles
• Floating or vertical warehouses delivering packages
• Urban transportation
• Airbus drone-like pop-up concept
15. Cameras and Audio Recording
• Cameras Will Be Abundant
• Person's Profile Will Be Evident
• Third-Party Analytics
• AI Will Decide …
16. Manufacturing
• Real-Time Dashboards
• Variety of data sources
• When they ingest data, they must recalculate the entire dataset because business rules change over time
• Cross-matching survey results at the team and individual level
• Need to know what impact various dimensions, such as product quality, support, cost, and more, have on their NPS score
• Processes that formerly required 10 steps are streamlined down to just one
17. Asset Management
• End-to-end asset visibility
• Needed one place to discover all assets in the environment
– With instant context around risk, vulnerability, threat assessment, and threat detection
• 100 billion events per day
– Devices, firewalls, IoT, multi-tenant, ServiceNow, and network traffic
18. Security Surveillance
• Goal to view all sites in a single, cloud-based package
– And offer analytics from video data
• Real-Time Insights
• Biggest challenge was scalability
19. Finance: Embedded
• Started with what was easy: prototype, ingest data, do basic reports
– Required replica sets
• Performance constraints on writes to PostgreSQL
• Had to do a bulk load of the data, and it was so time-consuming that certain data was skipped
20. eSports
• Need to offer real-time and historical live streaming data to analyze trends and performance across all genres, games, events, and channels
• Need to work with thousands of time series data points in complex multi-gigabyte aggregated queries
• Analytics speed is the top priority
• Understand spikes in viewership
21. Data Architecture Needs for Translytical Workloads
• Fast Streaming Ingest (millions of events/second); see the ingest sketch below
• Low Latency
• High Concurrency (thousands of concurrent users)
• Unlimited Storage
• Pipelines
• Transactional Consistency
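As a rough illustration of the ingest requirement above, the following sketch consumes a stream and micro-batches inserts into a database. It assumes the kafka-python package, a local broker, and an illustrative "events" topic; SQLite stands in for the translytical store, which in practice would be the scalable engine itself.

```python
# A minimal ingest sketch, assuming kafka-python and any DB-API 2.0
# connection; topic, table, and file names are illustrative. A real
# deployment would add tuning, retries, and exactly-once semantics.

import json
import sqlite3  # stand-in for the translytical database
from kafka import KafkaConsumer  # pip install kafka-python

conn = sqlite3.connect("translytical_db.sqlite")
conn.execute("CREATE TABLE IF NOT EXISTS events (ts TEXT, payload TEXT)")

consumer = KafkaConsumer(
    "events",                      # hypothetical topic name
    bootstrap_servers="localhost:9092",
    value_deserializer=lambda b: json.loads(b.decode("utf-8")),
)

batch = []
for msg in consumer:
    batch.append((msg.value.get("ts"), json.dumps(msg.value)))
    if len(batch) >= 1000:         # micro-batch to sustain high event rates
        conn.executemany("INSERT INTO events VALUES (?, ?)", batch)
        conn.commit()              # transactional consistency per batch
        batch.clear()
```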
22. Data Architecture Ill-fit for Translytical
[Architecture diagram: logs (apps, web, devices), user tracking, operational metrics, offload data, and sensors feed raw and processed data topics (JSON, Avro); transactional/context data sits in an OLTP/ODS; ETL, or EL with the T in Spark, serves batch and low-latency applications and files; data reaches the data warehouse and data lake via reach-through, ETL/ELT, or stream processing, with queues and in-database analytics along the way.]
25. NoSQL for Operational Big Data
More data model flexibility
– Web Services as a data model
– No “schema first” requirement; load first
Faster time to insight from data acquisition
Relaxed ACID
– Eventual consistency
– Willing to trade consistency for availability
– ACID would crush things like storing clicks on Google
Low upfront software and development costs
Fault-tolerant redundancy
Linear Scaling to “webscale”
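A minimal sketch of the load-first, relaxed-consistency style, assuming pymongo and a local MongoDB instance; database, collection, and field names are illustrative.

```python
# Load-first sketch: no "schema first" requirement, and a relaxed
# write concern (acknowledged by the primary only) that trades
# consistency for availability. Assumes pymongo (pip install pymongo).

from pymongo import MongoClient, WriteConcern

client = MongoClient("mongodb://localhost:27017")

# w=1: acknowledge on the primary only, not a majority of replicas.
clicks = client["ops"].get_collection("clicks", write_concern=WriteConcern(w=1))

# Differently shaped documents land in the same collection; the
# schema is interpreted at read time, not enforced at load time.
clicks.insert_one({"user": "u1", "page": "/home"})
clicks.insert_one({"user": "u2", "page": "/cart", "items": 3, "ab_test": "B"})
```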
27. Single Product Architectures
• Single Table Storage for Transactions and Analytics
– Fast insert/update/delete (IUD) and query
– Simplified Data Architecture
– Reduced Data Movement
• Rowstore + Columnstore
28. Columnstore
• SingleStore uses two storage types internally: an in-memory rowstore and a disk-based columnstore
• Columnstore: disk-based, compressed, and optimized for analytical scans (a hedged DDL sketch follows)
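A hedged DDL sketch of the dual-store idea, assuming the singlestoredb Python client and recent SingleStore syntax, where a plain CREATE TABLE yields a columnstore ("Universal Storage") and CREATE ROWSTORE TABLE forces the in-memory rowstore; older versions use KEY (...) USING CLUSTERED COLUMNSTORE instead. Connection string, table, and column names are illustrative.

```python
# Sketch of SingleStore's two internal storage types (syntax varies by
# version). Assumes singlestoredb (pip install singlestoredb).

import singlestoredb as s2

conn = s2.connect("user:password@localhost:3306/demo")  # placeholder DSN
cur = conn.cursor()

# In-memory rowstore: fast point reads/writes for the transactional side.
cur.execute("""
    CREATE ROWSTORE TABLE IF NOT EXISTS session_state (
        session_id BIGINT PRIMARY KEY,
        cart JSON
    )
""")

# Disk-based columnstore (the default in recent versions): fast scans
# and aggregates for the analytical side of the same workload.
cur.execute("""
    CREATE TABLE IF NOT EXISTS orders (
        user_id BIGINT,
        ts DATETIME(6),
        amount DECIMAL(18, 2),
        SORT KEY (ts),
        SHARD KEY (user_id)
    )
""")
```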
29. Azure Synapse Analytics
[Architecture diagram, Azure real-time environment: an e-commerce website runs on Azure Kubernetes Service (AKS) with front-end and back-end services (cart, profile, products, stock) backed by Azure Cosmos DB (Core API) as the transactional database; Synapse Link automatically syncs Cosmos DB to its analytical store (HTAP, Parquet) with no ETL; Synapse Pipelines and ADLS Gen2 provide the data lake and historical data, fed by enterprise data sources; Azure Machine Learning handles ML model training and automatic deployment of the recommender to an Azure ML managed online endpoint; Microsoft Purview handles data management and governance, classifying and protecting sensitive data (customer profiles, etc.); Power BI reports and visualizes, alongside Power Apps, M365, and Dataverse via Synapse Link.]
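One concrete knob in this architecture is enabling the Cosmos DB analytical store that Synapse Link syncs to. A minimal sketch, assuming the azure-cosmos Python package and an account with Synapse Link enabled; the URL, key, database, and container names are placeholders.

```python
# Sketch: create a Cosmos DB container with the analytical store
# enabled so Synapse Link can sync it with no ETL.
# Assumes azure-cosmos (pip install azure-cosmos).

from azure.cosmos import CosmosClient, PartitionKey

client = CosmosClient(
    "https://myaccount.documents.azure.com:443/",  # placeholder endpoint
    credential="<key>",                            # placeholder key
)
db = client.create_database_if_not_exists("shop")

orders = db.create_container_if_not_exists(
    id="orders",
    partition_key=PartitionKey(path="/userId"),
    analytical_storage_ttl=-1,  # -1: retain all data in the analytical store
)
```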
30. Amazon Redshift
[Architecture diagram, AWS real-time environment: an e-commerce website runs on Amazon Elastic Kubernetes Service (Amazon EKS) with front-end and back-end services (cart, profile, products, stock) backed by Amazon DynamoDB as the transactional database; AWS Glue handles data loading into the warehouse and the S3 data lake with historical data; Amazon SageMaker handles ML model training and automatic deployment of the recommender to a SageMaker model endpoint; data governance comes from AWS Partner and AWS Marketplace solutions.]
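For the loading path shown above, Redshift can also bulk-copy directly from DynamoDB. A hedged sketch, assuming the redshift-connector package and an IAM role permitted to read the table; every host, table, and ARN below is a placeholder.

```python
# Sketch: bulk path from the transactional store (DynamoDB) into the
# warehouse using Redshift's COPY ... FROM 'dynamodb://...'.
# Assumes redshift-connector (pip install redshift-connector).

import redshift_connector

conn = redshift_connector.connect(
    host="example-cluster.abc123.us-east-1.redshift.amazonaws.com",
    database="dev",
    user="awsuser",
    password="...",
)
cur = conn.cursor()
cur.execute("""
    COPY orders
    FROM 'dynamodb://Orders'
    IAM_ROLE 'arn:aws:iam::123456789012:role/RedshiftCopyRole'
    READRATIO 50
""")
conn.commit()
```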
31. Single Product
[Architecture diagram: the same sources (logs from apps, web, and devices; user tracking; operational metrics; offload data; sensors) feed raw and processed data topics (JSON, Avro); ETL, or EL with the T in Spark, serves batch and low-latency applications and files; a single product plays the OLTP transactional/context role as well as the data warehouse and data lake roles, reached via reach-through or stream processing.]
32. Single Vendor Solutions
• SingleStore
• Oracle
• Snowflake Unistore
• Cassandra
• Azure
• AWS
• Google
33. Tweak on Traditional Architectures
[Architecture diagram: the same flow as the ill-fit architecture on slide 22 (sources into raw and processed data topics; OLTP/ODS transactional/context data; ETL or EL with the T in Spark; batch and low-latency applications; reach-through, ETL/ELT, or stream processing into the data warehouse and data lake), with an analytics layer added.]
35. Benchmark
• We found the single database competitive, and in fact in a winning position, for both transactional and analytical workloads.
– The use of a single database facilitates operational analytics and offers an efficient approach for any organization.
• For the TPC-H-like workload, it achieved a better geometric mean than both of the pure-play data warehouses.
• In the TPC-DS-like workload, an analytic DB was superior, both with and without maintenance: its 4.1 geometric mean outperformed the single DB without maintenance, and its 3.9 with maintenance likewise bested the single DB. (A worked example of the geometric mean follows this slide.)
• Given the vast superiority in transactional processing and the high competitiveness in analytic processing, the efficiencies of one database, the single DB, across the spectrum of enterprise needs should be considered.
• Platform costs favor the single DB by 1.9x over one analytic DB and 2.5x over the other in Year 1.
• Development costs are 2.5x-3x higher, and production costs are 2.1x-2.5x higher, for the analytic DBs.
• We calculated the annual costs of the platform stacks plus the time-effort costs (people, development, and production costs) and concluded that the single DB is 2x cheaper than one analytic stack and 2.5x cheaper than the other over 3 years running enterprise-equivalent workloads.
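For readers unfamiliar with the metric, the geometric mean used in these TPC-style comparisons is computed as below; the numbers are made up for illustration and are not the benchmark's data. Unlike the arithmetic mean, it keeps one extreme query from dominating the summary.

```python
# Geometric mean over per-query results, as used in TPC-style summaries.

import math

def geometric_mean(values):
    return math.exp(sum(math.log(v) for v in values) / len(values))

speedups = [1.2, 0.9, 8.0, 1.1, 2.0]  # per-query speedup ratios (made up)
print(f"arithmetic mean: {sum(speedups) / len(speedups):.2f}")  # 2.64
print(f"geometric mean:  {geometric_mean(speedups):.2f}")       # ~1.80
```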
36. Summary
• Applications are moving translytical as the lines between operational and analytical blur
• Analytics are deeper than simple knowledge; they have depth
• The need for real-time analytics drives the need for a translytical architecture
• There are examples in every industry
• Traditional architectures do not meet the requirements
• There are multiple-vendor, multiple-product/same-vendor, and single-product options
• Single-product solutions combine Rowstore + Columnstore
• Given the vast superiority in transactional processing and the high competitiveness in analytic processing, the efficiencies of one database, the single DB, across the spectrum of enterprise needs should be considered
37. Upcoming Topics
• Assessing New Database Capabilities: Multi-Model
• MLOps: Applying DevOps to Competitive Advantage
• 2023 Trends in Enterprise Analytics
• Showing ROI for your Analytic Project
• Architecture, Products and Total Cost of Ownership of the Leading Machine Learning Stacks
Second Thursday of Every Month, at 2:00 ET
38. Assessing New Databases: Translytical Use Cases
Presented by: William McKnight
“#1 Global Influencer in Big Data” Thinkers360
President, McKnight Consulting Group
A two-time Inc. 5000 company
@williammcknight
www.mcknightcg.com
(214) 514-1444
Second Thursday of Every Month, at 2:00 ET
#AdvAnalytics
Editor's Notes
HTAP, HOAP, Operlytical, Event-driven*
The dw is not dead, but it is dying.
Data lake.
Analytics needed even if they are not traditionally stored together (e.g. real-time customer event data alongside CRM data; network sensor data alongside marketing campaign management data)
Between the pre-calc and the live is design…. On-demand vs. Continuous Real-time Analytics
Projects that are classified as “analytics”. And there’s analytics added to projects. At some point, all projects are becoming analytics projects, which makes it fair to just measure the project ROI.
Red light…. Stop.
Car running low on gas, gas station 20 minutes away, home 30 minutes away; tomorrow is not as busy, so it can get gas in the morning.
Speed to Insight: The primary benefit of real-time analytics is of course speed. It speeds up time to insight and lets businesses work faster to make necessary changes to systems or act on any critical information discovered. This can help organizations not only flag potential problems and mitigate risk, but also seize opportunities when they matter.
Customer Experience: Real-time analytics can help businesses anticipate problems and streamline operations to improve the overall customer experience. These on-the-fly adjustments greatly influence customer interactions and can help improve the end-to-end experience.
Operational Excellence: Real-time analytics allows organizations to gain a clear view of the business and understand what needs to be done to address potential operational issues. It also allows users to understand what resources are available to make those changes.
Deeper Understanding: When there is a need for deeper analytics to make a business decision, real-time analytics can help compare real-time and historical data to inform the decision.
Most real-time architecture is about real-time ingest
Keywords: real-time, real-time analytics, operational excellence, operational analytics, real-time DW. Real-time analytics essentially means that data is provided for analysis almost immediately once it is collected.
Way of the future; no second store of data (DW)
Maybe somebody just became a correlated user
theme
Foreign exchange
Premium data comes from a growing community of curated partners, such as:
Wall Street Horizon
Fraud Factors
Audit Analytics
ValuEngine
Stocktwits
And much more
Recalls
Outbreaks
Latest findings
pandemic footprint
Human beings have roughly 20,500 genes, in DNA, housed in each and every one of the trillions of cells that make you who you are. What causes what action… it’s complicated. Batch analytics needed.
360 including the now
The metaverse is about simulation.
Avatars will be able to act, within tightly defined parameters, as our agents and our companions, and some may even be considered co-workers.
We will be unable to tell the difference between a virtualized real person and an AI-driven avatar.
We will virtually be able to travel the world and experience life on other planets, all from home. The metaverse will give a feeling of actually being there with your family/friends.
A parallel life in the metaverse. It has become absolutely necessary for your existence. It is very difficult to be operational outside of the metaverse. You are connected via multiple devices, wearables, and even brain chips. You live in a mixed reality where physical and digital converge. Many people opt to spend most of their day in virtual worlds where they can become whoever they want and live the way they always dreamed. Unlimited freedoms in their personal virtual worlds, no limits.
NFTs and crypto… take off later. Bitcoin will displace the US dollar as the primary form of global finance by 2050.
Traffic and weather – current and patterns. Constantly changing.
Imagine this: You walk into a furniture showroom virtually, and before you say anything, the store knows your name, employment status, car-buying history, and credit rating. ADD: where you’ve been today, the clothes you’re wearing, etc.
Already, data brokers such as Acxiom and LexisNexis compile reams of information on all of us. Clients can purchase a dossier on your criminal, consumer, and marital past. It’s only a matter of time before data brokers begin drawing from online-dating profiles and social-media posts as well.
Right now, clients have to log in and search for people by name or buy lists of people with certain traits. But as facial-recognition technology becomes more widespread, any device with a camera and the right software could automatically pull up your information.
Eventually, someone might be able to point a phone at you (or look at you through smart contact lenses) and see a bubble over your head marking you as unemployed or recently divorced. We’ll no longer be able to separate our work selves from our weekend selves. Instead our histories will come bundled as a pop-up on strangers’ screens.
With the advent of the Internet of Things, appliances and gadgets will monitor many aspects of our lives, from what we eat to what we flush. Devices we talk to will record and upload our conversations, as Amazon’s Echo already does. Even toys will make us vulnerable. Kids say the darndest things, and the talking Hello Barbie doll sends those things wirelessly to a third-party server, where they are analyzed by speech-recognition software and shared with vendors.
Even our thoughts could become hackable. The technology company Retinad can use the sensors on virtual-reality headsets to track users’ engagement. Future devices might integrate electrodes to measure brain waves. In August, Berkeley engineers announced that they had produced “neural dust,” implantable electrodes just a millimeter wide that can record brain activity for scientific or medical purposes.
Chicago police use an algorithm that analyzes arrest records, social networks, and other data to identify future criminals.
xxx previously had to run advanced analytics offline xxx. “if you looked at the dashboard and wanted to drill through, the waiting times were longer than 2 seconds. If it's not instant or very close to instant, it becomes painful. At that point, people just don’t do the analytics and valuable information is lost. If you don't use it, and if you don't analyze, you can’t find these things, you're not going to improve your business.”
The data sources could be almost anything, from databases to IoT devices.
They can now drill down into things like NPS to get at the root cause of a score, drive those insights back into the business, improve their scores, and most importantly, retain their end customers
In the past, xxx had to provide these analytic insights by moving data into SPSS, which was painfully slow. With a translytical approach, they can now slice and dice data in real time and, in the NPS example, instantly understand the validity of a data correlation.
Armis originally launched its platform using a PostgreSQL database. Over time, the time-based data set got too large for Postgres to handle. At this point the team migrated this data set from 400+ PostgreSQL databases into a huge Elasticsearch cluster (160 nodes). The entire data pipeline, including Elasticsearch, cost more than $1 million annually.
Embedded finance is when non-financial companies offer their customers access to credit through their technology platform. Customers can be individuals or businesses, and the credit can be offered by the company or by a third party.
The replicated data needed to be re-ingested, and the dashboard only refreshed once every 24 hours, leading to a serious and unacceptable lag in data freshness. Ant Money didn’t store a lot of information due to performance constraints on writes to PostgreSQL, and data from other partners was nearly impossible to obtain so Ant Money could enrich its first-party data. Ant Money had to do a bulk load of the data, and it was so time-consuming that certain data was skipped.
The customers are the biggest companies in the eSports industry: game publishers, eSports organizers, and other brands.
Help companies analyze eSports data so they can understand how they’re doing and ways they can optimize time and resources.
Both real-time and historical data were needed to provide the full context of the live streams and eSports events. The data ingestion pipeline included manual metadata input, third-party fact tables, and automated systems.
People on one side.
Most of the time post-op is “learning”
1 db solutions: Doing analytics with operational DBs or tying together multiple databases to power their applications with analytics
Is it more analytics needing operational, or operational needing analytics? Which way is it coming? It’s analytics trying to do operational and failing. Also, MySQL and PostgreSQL moving to analytics fail.
Real-time for access
Single source
Databricks
Lake + DW. All points of integration are points of failure.
Data lakes (cloud stg) emerged to handle raw data in a variety of formats on cheap storage for data science and machine learning, though lacked critical features from the world of data warehouses: they do not support transactions, they do not enforce data quality well, and their lack of consistency/isolation makes it almost impossible to mix appends and reads, and batch and streaming jobs.
There are a few key technology advancements that have enabled the data lakehouse:
metadata layers for data lakes to set up drill through paths
new query engine designs providing high-performance, SQL-like execution on data lakes
access for data science and machine learning tools.
The lake concerns itself with data quality, not offloads
all of the major data platform vendors have converged their messaging around the concept of a lakehouse architecture that takes the best attributes of traditional data warehouses and enables them to run on platforms with data- lake storage architectures.
Column stores, key-value stores, document stores
Data fit for NoSQL
LinkedIn: 80 million messages/sec
Readers don’t need to wait on writers. Each version of the row is stored as a fixed-size struct (variable-length fields are stored as pointers) according to the table schema, along with bookkeeping information such as the timestamp and the commit status of the version.
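A toy Python illustration of that note, not SingleStore's actual internals: writers append timestamped versions, and readers see the latest committed version at their snapshot, so reads never block on writes.

```python
# Toy multi-version rows: each version carries a timestamp and a
# commit flag; readers pick the latest committed version visible at
# their snapshot instead of waiting on in-flight writers.

import itertools

_clock = itertools.count(1)
versions = {}  # row_key -> list of (timestamp, committed, value)

def write(key, value, committed=True):
    versions.setdefault(key, []).append((next(_clock), committed, value))

def read(key, snapshot_ts):
    """Latest committed version visible at snapshot_ts."""
    visible = [v for ts, ok, v in versions.get(key, []) if ok and ts <= snapshot_ts]
    return visible[-1] if visible else None

write("row1", {"qty": 1})           # ts=1, committed
write("row1", {"qty": 2}, False)    # ts=2, uncommitted in-flight write
print(read("row1", snapshot_ts=2))  # -> {'qty': 1}; the reader is not blocked
```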
Oracle (Oracle has a dual store approach rather than a single store like SingleStoreDB's Universal Storage)
Snowflake has announced combining transactional and analytical data with Unistore.
SingleStore: what we encounter is customers trying to do analytics with operational DBs or tying together multiple databases to power their applications with analytics.
We also replace a lot of the 1st gen operational DBs (such as the MySQL, Postgres, RDS) and also augment data warehouses or Hadoop to power real-time analytics.
Microsoft solution
Microsoft Azure and Microsoft Intelligent Data Platform
Azure Kubernetes Service
Azure Cosmos DB
Synapse Link for Cosmos DB
Synapse Analytics, Synapse Pipelines, ADLS Gen2
Azure ML
Power BI
Microsoft Purview
AWS solution
Amazon Elastic Kubernetes Service (Amazon EKS)
Amazon DynamoDB
AWS Glue
Amazon Redshift, S3
Amazon SageMaker
For Data Governance: 3rd party Marketplace/Partner solutions
GCP solution
Google Kubernetes Engine (GKE)
Cloud Firestore
Cloud Data Fusion
BigQuery, Cloud Storage, Cloud Dataprep, Cloud Dataflow
Vertex AI Prediction, Vertex AI
For Data Governance: Dataplex, requires a separate Dataplex lake
Difference: one vendor vs. one product
Oracle has a dual-store approach rather than a single store like SingleStoreDB’s Universal Storage. You need (i) a basic Oracle DB license, (ii) the diagnostics + tuning pack, (iii) the Oracle RAC option, (iv) Exadata (for the columnar compression/performance), (v) the partitioning pack, and (vi) the Active Data Guard option.
Snowflake has announced combining transactional and analytical data with Unistore.
Some people on two sides.
1. Data lakes can be difficult to manage and govern due to their size and complexity. 2. Data lakes can be difficult to extract from regularly due to the variety and volume of data they contain.
Need to add a cache like Redis
The operational DB is often NoSQL
Organizations are often reluctant to attempt analyzing real-time data, fearing the analytical workload will hamper the performance of the operational work that has to be the priority.
2 analytic DBs, 1 single-DB solution
Some of these best practices you’ll see next month in the mature environment.