Mike Ferguson is Managing Director of Intelligent Business Strategies Limited and specializes in business intelligence/analytics and data management. He discusses building the artificially intelligent enterprise and transitioning to a self-learning enterprise. Some key challenges discussed include the siloed and fractured nature of current data and analytics efforts, with many tools and scripts in use without integration. He advocates sorting out the data foundation, implementing DataOps and MLOps, creating a data and analytics marketplace, and integrating analytics into business processes to drive value from AI.
When it comes to creating an enterprise AI strategy: if your company isn’t good at analytics, it’s not ready for AI. Succeeding in AI requires being good at data engineering AND analytics. Unfortunately, management teams often assume they can leapfrog best practices for basic data analytics by directly adopting advanced technologies such as ML/AI – setting themselves up for failure from the get-go. This presentation explains how to get basic data engineering and the right technology in place to create and maintain data pipelines so that you can solve problems with AI successfully.
Thabo Ndlela - Leveraging AI for Enhanced Customer Service and Experience (itnewsafrica)
Thabo Ndlela of Accenture delivered a keynote on Leveraging AI for Enhanced Customer Service and Experience at Digital Finance Africa 2023 on 2 August 2023.
Learn to identify use cases for machine learning (ML), acquire best practices to frame problems in a way that key stakeholders and senior management can understand and support, and help create the right conditions for delivering successful ML-based solutions to your business.
Artificial intelligence is reshaping business, and the time is ripe for companies to capitalise on AI. Organisations can use AI to shift their focus from discrete business problems to significant business challenges.
An organisation should use ML and data science to drive digital transformation: more back-office operational efficiency, better user engagement, smoother onboarding, and better ROI through lower costs and more transparent, data-driven decision-making.
AI will be a valuable, transformational change agent, not only for the way business is done but for the way people live their daily lives, provided it is perceived not as a plug-and-play technology with immediate returns but as a long-term effort to rewire the organisation.
Exploring Opportunities in the Generative AI Value Chain (Dung Hoang)
The article "Exploring Opportunities in the Generative AI Value Chain" by McKinsey & Company's QuantumBlack provides insights into the value created by generative artificial intelligence (AI) and its potential applications.
Introduction to DataOps and AIOps (or MLOps) (Adrien Blind)
This presentation introduces the audience to the DataOps and AIOps practices. It covers organizational and technical aspects and provides hints to start your data journey.
How do OpenAI GPT Models Work - Misconceptions and Tips for Developers (Ivo Andreev)
Have you ever wondered why GPT models work? Do you ask questions like:
◉ How does GPT work? Why does the same problem receive different answers for different users? Is there a way to improve explainability?
◉ Can a GPT model provide its sources? Why does Bing chat work differently? What are my options for better performance and improved completions?
◉ How can I work with data in my enterprise? What practical business cases is a generative AI model fit to solve?
If you are tired of sessions just scratching the surface of OpenAI GPT, this one will go deeper and answer questions like why, why not and how.
Key Terms; ChatGPT Enterprise; Top Questions; Enterprise Data; Azure Search; Functions; Embeddings; Context Encoding; General Intelligence; Emerging Abilities; Chain of Thought; Plugins; Multimodal with DALL-E; Project Florence
GPT and Graph Data Science to power your Knowledge Graph (Neo4j)
In this workshop at Data Innovation Summit 2023, we demonstrated how you could learn from the network structure of a Knowledge Graph and use OpenAI’s GPT engine to populate and enhance your Knowledge Graph.
Key takeaways:
1. How Knowledge Graphs grow organically
2. How to deploy Graph Algorithms to learn from the topology of a graph
3. Integrate a Knowledge Graph with OpenAI’s GPT
4. Use graph node embeddings to feed a machine learning workflow
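The last two takeaways can be illustrated with a minimal, self-contained sketch. It does not use Neo4j GDS or OpenAI; it approximates node embeddings with truncated random walks and co-occurrence counts on a tiny, made-up graph (all node names here are hypothetical):

```python
import random

# Toy knowledge graph as an adjacency list (hypothetical data).
graph = {
    "alice": ["acme", "bob"],
    "bob": ["acme", "alice"],
    "acme": ["alice", "bob", "widgets"],
    "widgets": ["acme"],
}

def random_walks(graph, walks_per_node=50, walk_len=4, seed=42):
    """Generate truncated random walks, the sampling step behind
    node2vec-style embeddings."""
    rng = random.Random(seed)
    walks = []
    for start in graph:
        for _ in range(walks_per_node):
            walk = [start]
            for _ in range(walk_len - 1):
                walk.append(rng.choice(graph[walk[-1]]))
            walks.append(walk)
    return walks

def cooccurrence_embedding(graph, walks):
    """Crude embedding: for each node, count how often every other
    node appears in the same walk (a stand-in for skip-gram training)."""
    nodes = sorted(graph)
    index = {n: i for i, n in enumerate(nodes)}
    emb = {n: [0.0] * len(nodes) for n in nodes}
    for walk in walks:
        for n in walk:
            for m in walk:
                if m != n:
                    emb[n][index[m]] += 1.0
    return emb

walks = random_walks(graph)
emb = cooccurrence_embedding(graph, walks)
# Vectors such as emb["alice"] can now feed any downstream ML model.
```

In practice the walks and embeddings would come from graph algorithms run inside the database (e.g. Neo4j GDS), but the flow, topology in, feature vectors out, is the same.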
🔹How will AI-based content-generating tools change your mission and products?
🔹This complimentary on-demand webinar explores the use cases driving adoption among early-adopter customers, giving product leaders insight into the future of generative-AI-powered businesses and the potential generative AI holds for driving innovation and improving business processes.
AI for an intelligent cloud and intelligent edge: Discover, deploy, and manag... (James Serra)
Discover, manage, deploy, monitor – rinse and repeat. In this session we show how Azure Machine Learning can be used to create the right AI model for your challenge and then easily customize it using your development tools while relying on Azure ML to optimize them to run in hardware accelerated environments for the cloud and the edge using FPGAs and Neural Network accelerators. We then show you how to deploy the model to highly scalable web services and nimble edge applications that Azure can manage and monitor for you. Finally, we illustrate how you can leverage the model telemetry to retrain and improve your content.
leewayhertz.com - The architecture of Generative AI for enterprises (KristiLBurns)
Generative AI is quickly becoming popular among enterprises, with various applications being developed that can change how businesses operate. From code generation to product design and engineering, generative AI impacts a range of enterprise applications.
Dreamforce 23: Where Salesforce Meets AI (Ajeet Singh)
Dive into the future of business transformation at Dreamforce 2023: Where Salesforce Meets AI. Join us as we explore the cutting-edge synergy between two game-changing technologies – Salesforce and Artificial Intelligence. Uncover how businesses are leveraging AI to supercharge their Salesforce platforms, revolutionizing customer engagement, data insights, and decision-making.
Discover real-world success stories, innovative strategies, and hands-on demonstrations that showcase the seamless integration of AI into Salesforce, unlocking unparalleled opportunities for growth and efficiency.
Don't miss this opportunity to be at the forefront of the next evolution in business technology.
Conversational AI and Chatbot Integrations (Cristina Vidu)
Conversational AI and Chatbots (or rather - and more extensively - Virtual Agents) offer great benefits, especially in combination with technologies like RPA or IDP. Corneliu Niculite (Presales Director - EMEA @DRUID AI) and Roman Tobler (CEO @Routinuum & UiPath MVP) are discussing Conversational AI and why Virtual Agents play a significant role in modern ways of working. Moreover, Corneliu will be displaying how to build a Workflow and showcase an Accounts Payable Use Case, integrating DRUID and UiPath Robots.
📙 Agenda:
The focus of our meetup is around the following areas - with a lot of room to discuss and share experiences:
- What is "Conversational AI" and why do we need Chatbots (Virtual Agents);
- Deep-Dive to a DRUID-UiPath Integration via an Accounts Payable Use Case;
- Discussion, Q&A
Speakers:
👨🏻💻 Corneliu Niculite, Presales Director - EMEA DRUID AI
👨🏼💻 Roman Tobler, UiPath MVP, Co-Founder & CEO Routinuum GmbH
This session streamed live on March 8, 2023, at 16:00 CET.
Check out our upcoming events at: community.uipath.com
Contact us at: community@uipath.com
AZConf 2023 - Considerations for LLMOps: Running LLMs in production (Saradindu Sengupta)
With the recent explosion in development of and interest in large language, vision and speech models, it has become apparent that running large models in production will be a key driver of enterprise ML adoption. Traditional MLOps, i.e. running machine learning models in production, already has many variables to address, from data integrity and data drift to model optimization. Running a large model (language or vision) in production while respecting business requirements is different altogether. In this talk, I will explain a general framework for LLMOps and certain considerations when designing a system for inferencing a large model.
The talk is organised into sub-topics:
1. Model Optimization
2. Model fine-tuning
3. Model Editing
4. Model Serving and deployment
5. Model metrics monitoring
6. Embedding and artifact management
For each sub-topic, a brief overview of the current open-source tool sets will also be given so that tool-chain selection is easier.
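As a flavour of the model-optimization sub-topic, here is a minimal sketch of symmetric 8-bit post-training quantization, one standard technique for shrinking large models before serving. The weight values are made up for illustration, and real systems would use a framework's quantization tooling rather than hand-rolled code:

```python
# Minimal sketch of symmetric 8-bit post-training quantization,
# a common model-optimization step before serving large models.

def quantize_int8(weights):
    """Map float weights to int8 values using one symmetric scale."""
    scale = max(abs(w) for w in weights) / 127.0
    q = [round(w / scale) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from the int8 values."""
    return [v * scale for v in q]

weights = [0.12, -0.5, 0.33, 0.99, -0.07]  # hypothetical weights
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)

# Rounding bounds the per-weight error by half the scale.
max_err = max(abs(w - r) for w, r in zip(weights, restored))
```

The trade-off this illustrates, a 4x memory reduction versus a small bounded reconstruction error, is exactly what LLMOps teams weigh when choosing an optimization strategy.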
The path to success with Graph Database and Graph Data Science (Neo4j)
What’s new and what’s next? Product innovation moves rapidly at Neo4j – learn how graph technology can provide you with the tools to get much more from your data!
This describes a conceptual model approach to designing an enterprise data fabric: the set of hardware and software infrastructure, tools and facilities used to implement, administer, manage and operate data operations across the entire span of data within the enterprise. It covers all data activities, including data acquisition, transformation, storage, distribution, integration, replication, availability, security, protection, disaster recovery, presentation, analytics, preservation, retention, backup, retrieval, archival, recall, deletion, monitoring and capacity planning, across all data storage platforms, enabling applications to meet the data needs of the enterprise.
The conceptual data fabric model represents a rich picture of the enterprise’s data context. It embodies an idealised and target data view.
Designing a data fabric enables the enterprise to respond to and take advantage of key related data trends:
• Internal and External Digital Expectations
• Cloud Offerings and Services
• Data Regulations
• Analytics Capabilities
It enables the IT function to demonstrate positive data leadership. It shows the IT function is able and willing to respond to business data needs. It allows the enterprise to meet data challenges such as:
• More and more data of many different types
• Increasingly distributed platform landscape
• Compliance and regulation
• Newer data technologies
• Shadow IT where the IT function cannot deliver IT change and new data facilities quickly
It is concerned with designing an open and flexible data fabric that improves the responsiveness of the IT function and reduces shadow IT.
MLOps – Applying DevOps to Competitive Advantage (DATAVERSITY)
MLOps is a practice for collaboration between Data Science and operations to manage the production machine learning (ML) lifecycles. As an amalgamation of “machine learning” and “operations,” MLOps applies DevOps principles to ML delivery, enabling the delivery of ML-based innovation at scale to result in:
- Faster time to market of ML-based solutions
- More rapid rate of experimentation, driving innovation
- Assurance of quality, trustworthiness, and ethical AI
MLOps is essential for scaling ML. Without it, enterprises risk struggling with costly overhead and stalled progress. Several vendors have emerged with offerings to support MLOps: the major offerings are Microsoft Azure ML and Google Vertex AI. We looked at these offerings from the perspective of enterprise features and time-to-value.
Big Data LDN 2018: DATA MANAGEMENT AUTOMATION AND THE INFORMATION SUPPLY CHAI... (Matt Stubbs)
Date: 14th November 2018
Location: Governance and MDM Theatre
Time: 10:30 - 11:00
Speaker: Mike Ferguson
Organisation: IBS
About: For most organisations today, data complexity has increased rapidly. In the area of operations, we now have cloud and on-premises OLTP systems with customers, partners and suppliers accessing these applications via APIs and mobile apps. In the area of analytics, we now have data warehouse, data marts, big data Hadoop systems, NoSQL databases, streaming data platforms, cloud storage, cloud data warehouses, and IoT-generated data being created at the edge. Also, the number of data sources is exploding as companies ingest more and more external data such as weather and open government data. Silos have also appeared everywhere as business users are buying in self-service data preparation tools without consideration for how these tools integrate with what IT is using to integrate data. Yet new regulations are demanding that we do a better job of governing data, and business executives are demanding more agility to remain competitive in a digital economy. So how can companies remain agile, reduce cost and reduce the time-to-value when data complexity is on the up?
In this session, Mike will discuss how companies can create an information supply chain to manufacture business-ready data and analytics to reduce time to value and improve agility while also getting data under control.
Accelerate Self-Service Analytics with Data Virtualization and Visualization (Denodo)
Watch full webinar here: https://bit.ly/3fpitC3
Enterprise organizations are shifting to self-service analytics as business users need real-time access to holistic and consistent views of data regardless of its location, source or type for arriving at critical decisions.
Data Virtualization and Data Visualization work together through a universal semantic layer. Learn how they enable self-service data discovery and improve performance of your reports and dashboards.
In this session, you will learn:
- Challenges faced by business users
- How data virtualization enables self-service analytics
- Use case and lessons from customer success
- Overview of the highlight features in Tableau
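The universal semantic layer mentioned above can be sketched in a few lines: consumers query one logical view while the layer joins and renames fields from heterogeneous sources at request time, without copying data. The sources, field names and view below are all hypothetical:

```python
# Minimal sketch of a virtual semantic layer: one logical view
# federating two heterogeneous sources at query time.

crm_source = [  # e.g. a cloud CRM API (hypothetical data)
    {"cust_id": 1, "full_name": "Ada Ltd"},
    {"cust_id": 2, "full_name": "Bo GmbH"},
]
billing_source = [  # e.g. an on-premises billing database (hypothetical data)
    {"customer": 1, "revenue_eur": 1200.0},
    {"customer": 2, "revenue_eur": 830.0},
]

def customer_view():
    """Logical 'customer' view: joins and renames fields on the fly,
    so consumers never see the source-specific schemas."""
    revenue = {r["customer"]: r["revenue_eur"] for r in billing_source}
    return [
        {"id": c["cust_id"], "name": c["full_name"],
         "revenue": revenue.get(c["cust_id"], 0.0)}
        for c in crm_source
    ]

rows = customer_view()
```

A data virtualization platform does the same thing declaratively and at scale, pushing queries down to the sources; the point here is only the shape of the abstraction: one consistent view over distributed data.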
Enterprise Data Marketplace: A Centralized Portal for All Your Data Assets (Denodo)
Watch full webinar here: https://bit.ly/3OLv0jY
Organizations continue to collect mounds of data and it is spread over different locations and in different formats. The challenge is navigating the vastness and complexity of the modern data ecosystem to find the right data to suit your specific business purpose. Data is an important corporate asset and it needs to be leveraged but also protected.
By adopting an alternative approach to data management and a logical data architecture, data can be democratized while providing centralized control within a distributed data landscape. The web-based Data Catalog tool provides a single access point for secure enterprise-wide data access and governance. This corporate data marketplace provides visibility into your data ecosystem and allows data to be shared without compromising data security policies.
Catch this on-demand session to understand how this approach can transform how you leverage data across the business:
- Empower the knowledge worker with data and increase productivity
- Promote data accuracy and trust to encourage re-use of important data assets
- Apply consistent security and governance policies across the enterprise data landscape
Empowering your Enterprise with a Self-Service Data Marketplace (EMEA) (Denodo)
Watch full webinar here: https://bit.ly/3aWI8lt
Self-service is a major goal of modern data strategists. A successfully implemented self-service initiative means that business users have access to holistic and consistent views of data regardless of its location, source or type. As data unification and data collaboration become key critical success factors for organisations, data catalogs play a key role as the perfect companion for a virtual layer to fully empower those self-service initiatives and build a self-service data marketplace requiring minimal IT intervention.
Denodo’s Data Catalog is a key piece in Denodo’s portfolio to bridge the gap between the technical data infrastructure and business users. It provides documentation, search, governance and collaboration capabilities, and data exploration wizards. It provides business users with the tool to generate their own insights with proper security, governance, and guardrails.
In this session we will cover:
- The role of a virtual semantic layer in self-service initiatives
- Key ingredients of a successful self-service data marketplace
- Self-service (consumption) vs. inventory catalogs
- Best practices and advanced tips for successful deployment
- A Demonstration: Product Demo
- Examples of customers using Denodo’s Data Catalog to enable self-service initiatives
The Big Data Fabric as an Enabler for Machine Learning & AI (Denodo)
Watch here: https://bit.ly/2Cet17K
First-class big data fabrics deliver reliable insights, guarantee the highest end-to-end security standards and enable consistent real-time data integration, while business users are given agile tools for self-directed data consumption.
Learn in this talk how the big data fabric, as an enabler for ML & AI:
- gives business users and data scientists fast, agile data access via self-service
- makes data governance and security policies centrally and reliably manageable
- delivers relevant insights from current and consistent data
Empowering your Enterprise with a Self-Service Data Marketplace (ASEAN) (Denodo)
Watch full webinar here: https://bit.ly/3uqcAN0
Self-service is a major goal of modern data strategists. A successfully implemented self-service initiative means that business users have access to holistic and consistent views of data regardless of its location, source or type. As data unification and data collaboration become key critical success factors for organizations, data catalogs play a key role as the perfect companion for a virtual layer to fully empower those self-service initiatives and build a self-service data marketplace requiring minimal IT intervention.
Denodo’s Data Catalog is a key piece in Denodo’s portfolio to bridge the gap between the technical data infrastructure and business users. It provides documentation, search, governance and collaboration capabilities, and data exploration wizards. It provides business users with the tool to generate their own insights with proper security, governance, and guardrails.
In this session we will cover:
- The role of a virtual semantic layer in self-service initiatives
- Key ingredients of a successful self-service data marketplace
- Self-service (consumption) vs. inventory catalogs
- Best practices and advanced tips for successful deployment
- A Demonstration: Product Demo
- Examples of customers using Denodo’s Data Catalog to enable self-service initiatives
Architecting Data For The Modern Enterprise - Data Summit 2017, Closing Keynote (Caserta)
The “Big Data era” has ushered in an avalanche of new technologies and approaches for delivering information and insights to business users. What is the role of the cloud in your analytical environment? How can you make your migration as seamless as possible? This closing keynote, delivered by Joe Caserta, a prominent consultant who has helped many global enterprises adopt Big Data, provided the audience with the inside scoop needed to supplement data warehousing environments with data intelligence—the amalgamation of Big Data and business intelligence.
This presentation was given as the closing keynote at DBTA's annual Data Summit in NYC.
Four Key Considerations for your Big Data Analytics Strategy (Arcadia Data)
Learn 4 of the key things to consider as you create your big data analytics strategy from John Meyers (Enterprise Management Associates) and Steve Wooledge (Arcadia Data).
Analyst Webinar: Best Practices In Enabling Data-Driven Decision Making (Denodo)
Watch full webinar here: https://bit.ly/37YkgN4
This presentation looks at the trends that are emerging from companies on their journeys to becoming data-driven enterprises.
These trends are taken from a survey of 500 companies and highlight critical success factors, what companies are doing, their progress so far and their plans going forward. It also looks at the role that data virtualization has within the data-driven enterprise.
During the session we'll address:
- What is a data-driven enterprise?
- What are the critical success factors?
- What are companies doing to create a data-driven enterprise and why?
- What progress are they making?
- What are the plans on people, process and technologies?
- Why is data virtualization central to provisioning and accessing data in a data-driven enterprise?
- How should you get started?
Building a New Platform for Customer Analytics (Caserta)
Caserta Concepts and Databricks partner up to bring you this insightful webinar on how a business can choose from all of the emerging big data technologies to figure out which one best fits their needs.
When and How Data Lakes Fit into a Modern Data Architecture (DATAVERSITY)
Whether to take data ingestion cycles off the ETL tool and the data warehouse or to facilitate competitive data science and algorithm building in the organization, the data lake – a place for unmodeled and vast data – will be provisioned widely in 2020.
Though it doesn’t have to be complicated, the data lake has a few key design points that are critical, and it does need to follow some principles for success. Build the data lake, but avoid building the data swamp! The tool ecosystem is building up around the data lake, and soon many organizations will have a robust lake and data warehouse. We will discuss policy to keep them straight, send data to its best platform, and keep users’ confidence up in their data platforms.
Data lakes will be built in cloud object storage. We’ll discuss the options there as well.
Get this data point for your data lake journey.
Want to know more about the Common Data Model and Common Data Service? Do you need to understand the difference between CDS for Apps and CDS for Analytics? Feel free to use these slides and send me your feedback.
Agile Mumbai 2022
Real-Time Insights and AI for better Products, Customer experience and Resilient Platform
Balvinder Kaur
Principal Consultant, Thoughtworks
Sushant Joshi
Product Manager, Thoughtworks
Modernizing Integration with Data Virtualization (Denodo)
Watch full webinar here: https://bit.ly/3CMqS0E
Today, businesses have more data and data types combined with more complex ecosystems than they have ever had before. Examples include on-premise data marts, data warehouses, data lakes, applications, spreadsheets, IoT data, sensor data, unstructured, etc. combined with cloud data ecosystems like Snowflake, Big Query, Azure Synapse, Amazon S3, Redshift, Databricks, SaaS apps, such as Salesforce, Oracle, Service Now, Workday, and on and on.
Data, Analytics, Data Science, and Architecture teams are struggling to provide business users with the right data as quickly and efficiently as possible to enable Analytics, Dashboards, BI, Reports, etc. Unfortunately, many enterprises seek to meet this pressing need with antiquated, legacy, 40+ year-old approaches. There is a better way, proven by thousands of other companies.
As Forrester so astutely reported in their recent Total Economic Impact Study, companies who employed Data Virtualization reported a “65% decrease in data delivery times over ETL” and an “83% reduction in time to new revenue.”
Join us for this very educational webinar to learn firsthand from Denodo Technologies and Fusion Alliance how:
- Data Virtualization helps your company save time and money by eliminating superfluous ETL pipelines and data replication.
- Data Virtualization can become the cornerstone of your modern data approach to deliver data faster and more efficiently than old legacy approaches at enterprise scale.
- Quickly and easily, Data Virtualization can scale, even in the most complex environments, to create universal abstraction semantic models for all of your cloud, on-premise, structured, unstructured and hybrid data
- Data Mesh and Data Fabric architecture patterns for maximum reuse
- Other customers have used, and are using, Data Virtualization to tackle their toughest data integration and data delivery challenges
- Fusion Alliance can help you define a data strategy tailored to your organization’s needs and requirements, helping you achieve success and enable your business with self-service capabilities
Empowering Business & IT Teams: Modern Data Catalog Requirements (Precisely)
As the demand for data-driven insights continues to grow, the importance of data catalogs will only increase. A modern data catalog addresses new use cases requiring more immediate and intelligent data discovery to drive complete and informed business outcomes.
In this demo, you will hear how the Precisely Data Integrity Suite’s Data Catalog is the connective tissue that empowers business and IT teams to discover, understand, and trust their critical data. Requirements to meet those new use cases include:
· Discovery, lineage, and relationships across silos for more informed insights
· Interoperability with data platforms and tech stacks to increase ROI
· Machine learning to drive more significant insights
· Data observability to alert users to data changes and anomalies
· Business-friendly data governance to advance understanding & accountability
Data Lakehouse Symposium | Day 1 | Part 1 (Databricks)
The world of data architecture began with applications. Next came data warehouses. Then text was organized into a data warehouse.
Then one day the world discovered a whole new kind of data that was being generated by organizations. The world found that machines generated data that could be transformed into valuable insights. This was the origin of what is today called the data lakehouse. The evolution of data architecture continues today.
Come listen to industry experts describe this transformation of ordinary data into a data architecture that is invaluable to business. Simply put, organizations that take data architecture seriously are going to be at the forefront of business tomorrow.
This is an educational event.
Several of the authors of the book Building the Data Lakehouse will be presenting at this symposium.
Data Lakehouse Symposium | Day 1 | Part 2 (Databricks)
The world of data architecture began with applications. Next came data warehouses. Then text was organized into a data warehouse.
Then one day the world discovered a whole new kind of data that was being generated by organizations. The world found that machines generated data that could be transformed into valuable insights. This was the origin of what is today called the data lakehouse. The evolution of data architecture continues today.
Come listen to industry experts describe this transformation of ordinary data into a data architecture that is invaluable to business. Simply put, organizations that take data architecture seriously are going to be at the forefront of business tomorrow.
This is an educational event.
Several of the authors of the book Building the Data Lakehouse will be presenting at this symposium.
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop (Databricks)
In this session, learn how to quickly supplement your on-premises Hadoop environment with a simple, open, and collaborative cloud architecture that enables you to generate greater value with scaled application of analytics and AI on all your data. You will also learn five critical steps for a successful migration to the Databricks Lakehouse Platform along with the resources available to help you begin to re-skill your data teams.
Democratizing Data Quality Through a Centralized Platform (Databricks)
Bad data leads to bad decisions and broken customer experiences. Organizations depend on complete and accurate data to power their business, maintain efficiency, and uphold customer trust. With thousands of datasets and pipelines running, how do we ensure that all data meets quality standards, and that expectations are clear between producers and consumers? Investing in shared, flexible components and practices for monitoring data health is crucial for a complex data organization to rapidly and effectively scale.
At Zillow, we built a centralized platform to meet our data quality needs across stakeholders. The platform is accessible to engineers, scientists, and analysts, and seamlessly integrates with existing data pipelines and data discovery tools. In this presentation, we will provide an overview of our platform’s capabilities, including:
Giving producers and consumers the ability to define and view data quality expectations using a self-service onboarding portal
Performing data quality validations using libraries built to work with spark
Dynamically generating pipelines that can be abstracted away from users
Flagging data that doesn’t meet quality standards at the earliest stage and giving producers the opportunity to resolve issues before use by downstream consumers
Exposing data quality metrics alongside each dataset to provide producers and consumers with a comprehensive picture of health over time
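The producer/consumer expectation model described above can be sketched in a few lines of plain Python. This is an illustrative sketch only, not Zillow's actual platform API; every name below is invented:

```python
# Minimal sketch of dataset-level quality expectations: producers declare
# checks, and failing rows are flagged at the earliest stage, before any
# downstream consumer sees them. All names here are illustrative.

def not_null(column):
    """Expectation: no row may have a null in `column`."""
    return lambda row: row.get(column) is not None

def in_range(column, lo, hi):
    """Expectation: `column` must be present and fall within [lo, hi]."""
    return lambda row: row.get(column) is not None and lo <= row[column] <= hi

def validate(rows, expectations):
    """Return the rows that fail any declared expectation."""
    return [row for row in rows if not all(check(row) for check in expectations)]

expectations = [not_null("zip_code"), in_range("price", 1, 50_000_000)]
rows = [
    {"zip_code": "27514", "price": 350_000},   # passes all checks
    {"zip_code": None, "price": 350_000},      # fails not_null
    {"zip_code": "27514", "price": -5},        # fails in_range
]
bad = validate(rows, expectations)
```

In a real deployment, the expectation definitions would come from a self-service onboarding portal and the checks would run as Spark validations inside the pipeline, with per-dataset metrics exposed over time.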
Learn to Use Databricks for Data Science (Databricks)
Data scientists face numerous challenges throughout the data science workflow that hinder productivity. As organizations continue to become more data-driven, a collaborative environment is more critical than ever — one that provides easier access and visibility into the data, reports and dashboards built against the data, reproducibility, and insights uncovered within the data. Join us to hear how Databricks’ open and collaborative platform simplifies data science by enabling you to run all types of analytics workloads, from data preparation to exploratory analysis and predictive analytics, at scale — all on one unified platform.
Why APM Is Not the Same As ML Monitoring (Databricks)
Application performance monitoring (APM) has become the cornerstone of software engineering allowing engineering teams to quickly identify and remedy production issues. However, as the world moves to intelligent software applications that are built using machine learning, traditional APM quickly becomes insufficient to identify and remedy production issues encountered in these modern software applications.
As a lead software engineer at NewRelic, my team built high-performance monitoring systems including Insights, Mobile, and SixthSense. As I transitioned to building ML monitoring software, I found the architectural principles and design choices underlying APM not to be a good fit for this brand-new world. In fact, blindly following APM designs led us down paths that would have been better left unexplored.
In this talk, I draw upon my (and my team’s) experience building an ML Monitoring system from the ground up and deploying it on customer workloads running large-scale ML training with Spark as well as real-time inference systems. I will highlight how the key principles and architectural choices of APM don’t apply to ML monitoring. You’ll learn why, understand what ML Monitoring can successfully borrow from APM, and hear what is required to build a scalable, robust ML Monitoring architecture.
The Function, the Context, and the Data — Enabling ML Ops at Stitch Fix (Databricks)
Autonomy and ownership are core to working at Stitch Fix, particularly on the Algorithms team. We enable data scientists to deploy and operate their models independently, with minimal need for handoffs or gatekeeping. By writing a simple function and calling out to an intuitive API, data scientists can harness a suite of platform-provided tooling meant to make ML operations easy. In this talk, we will dive into the abstractions the Data Platform team has built to enable this. We will go over the interface data scientists use to specify a model and what that hooks into, including online deployment, batch execution on Spark, and metrics tracking and visualization.
Stage Level Scheduling Improving Big Data and AI Integration (Databricks)
In this talk, I will dive into the stage level scheduling feature added to Apache Spark 3.1. Stage level scheduling extends upon Project Hydrogen by improving big data ETL and AI integration and also enables multiple other use cases. It is beneficial any time the user wants to change container resources between stages in a single Apache Spark application, whether those resources are CPU, Memory, or GPUs. One of the most popular use cases is enabling end-to-end scalable Deep Learning and AI to efficiently use GPU resources. In this type of use case, users read from a distributed file system, do data manipulation and filtering to get the data into a format that the Deep Learning algorithm needs for training or inference, and then send the data into a Deep Learning algorithm. Using stage level scheduling combined with accelerator-aware scheduling enables users to seamlessly go from ETL to Deep Learning running on the GPU by adjusting the container requirements for different stages in Spark within the same application. This makes writing these applications easier and can help with hardware utilization and costs.
There are other ETL use cases where users want to change CPU and memory resources between stages, for instance there is data skew or perhaps the data size is much larger in certain stages of the application. In this talk, I will go over the feature details, cluster requirements, the API and use cases. I will demo how the stage level scheduling API can be used by Horovod to seamlessly go from data preparation to training using the Tensorflow Keras API using GPUs.
The talk will also touch on other new Apache Spark 3.1 functionality, such as pluggable caching, which can be used to enable faster dataframe access when operating from GPUs.
Simplify Data Conversion from Spark to TensorFlow and PyTorch (Databricks)
In this talk, I would like to introduce an open-source tool built by our team that simplifies the data conversion from Apache Spark to deep learning frameworks.
Imagine you have a large dataset, say 20 GBs, and you want to use it to train a TensorFlow model. Before feeding the data to the model, you need to clean and preprocess your data using Spark. Now you have your dataset in a Spark DataFrame. When it comes to the training part, you may have the problem: How can I convert my Spark DataFrame to some format recognized by my TensorFlow model?
The existing data conversion process can be tedious. For example, to convert an Apache Spark DataFrame to a TensorFlow Dataset file format, you need to either save the Apache Spark DataFrame on a distributed filesystem in parquet format and load the converted data with third-party tools such as Petastorm, or save it directly in TFRecord files with spark-tensorflow-connector and load it back using TFRecordDataset. Both approaches take more than 20 lines of code to manage the intermediate data files, rely on different parsing syntax, and require extra attention for handling vector columns in the Spark DataFrames. In short, all these engineering frictions greatly reduced the data scientists’ productivity.
The Databricks Machine Learning team contributed a new Spark Dataset Converter API to Petastorm to simplify this tedious data conversion process. With the new API, it takes a few lines of code to convert a Spark DataFrame to a TensorFlow Dataset or a PyTorch DataLoader with default parameters.
In the talk, I will use an example to show how to use the Spark Dataset Converter to train a Tensorflow model and how simple it is to go from single-node training to distributed training on Databricks.
Scaling your Data Pipelines with Apache Spark on Kubernetes (Databricks)
There is no doubt Kubernetes has emerged as the next generation of cloud native infrastructure to support a wide variety of distributed workloads. Apache Spark has evolved to run both Machine Learning and large scale analytics workloads. There is growing interest in running Apache Spark natively on Kubernetes. By combining the flexibility of Kubernetes and scalable data processing with Apache Spark, you can run any data and machine learning pipelines on this infrastructure while effectively utilizing the resources at your disposal.
In this talk, Rajesh Thallam and Sougata Biswas will share how to effectively run your Apache Spark applications on Google Kubernetes Engine (GKE) and Google Cloud Dataproc, and how to orchestrate the data and machine learning pipelines with managed Apache Airflow on GKE (Google Cloud Composer). The following topics will be covered:
- Understanding key traits of Apache Spark on Kubernetes
- Things to know when running Apache Spark on Kubernetes, such as autoscaling
- A demonstration of running analytics pipelines on Apache Spark orchestrated with Apache Airflow on a Kubernetes cluster
Scaling and Unifying SciKit Learn and Apache Spark Pipelines (Databricks)
Pipelines have become ubiquitous, as the need for stringing multiple functions to compose applications has gained adoption and popularity. Common pipeline abstractions such as “fit” and “transform” are even shared across divergent platforms such as Python Scikit-Learn and Apache Spark.
Scaling pipelines at the level of simple functions is desirable for many AI applications; however, it is not directly supported by Ray’s parallelism primitives. In this talk, Raghu will describe a pipeline abstraction that takes advantage of Ray’s compute model to efficiently scale arbitrarily complex pipeline workflows. He will demonstrate how this abstraction cleanly unifies pipeline workflows across multiple platforms such as Scikit-Learn and Spark, and achieves nearly optimal scale-out parallelism on pipelined computations.
Attendees will learn how pipelined workflows can be mapped to Ray’s compute model and how they can both unify and accelerate their pipelines with Ray.
Sawtooth Windows for Feature Aggregations (Databricks)
In this talk about Zipline, we will introduce a new type of windowing construct called a sawtooth window. We will describe various properties of sawtooth windows that we utilize to achieve online-offline consistency, while still maintaining high throughput, low read latency, and tunable write latency for serving machine learning features. We will also talk about a simple deployment strategy for correcting feature drift due to operations that are not abelian groups operating over change data.
We want to present multiple anti-patterns utilizing Redis in unconventional ways to get the maximum out of Apache Spark. All examples presented are tried and tested in production at scale at Adobe. The most common integration is spark-redis, which interfaces with Redis as a DataFrame backing store or as an upstream for Structured Streaming. We deviate from the common use cases to explore where Redis can plug gaps while scaling out high-throughput applications in Spark.
Niche 1 : Long Running Spark Batch Job – Dispatch New Jobs by polling a Redis Queue
· Why?
o Custom queries on top of a table; we load the data once and query N times
· Why not Structured Streaming
· Working Solution using Redis
Niche 2 : Distributed Counters
· Problems with Spark Accumulators
· Utilize Redis Hashes as distributed counters
· Precautions for retries and speculative execution
· Pipelining to improve performance
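The retry and speculative-execution precaution in Niche 2 boils down to making each task's increment idempotent. A minimal sketch of the idea in plain Python, with a dict standing in for a Redis hash (no real Redis client is assumed; the key scheme is invented for illustration):

```python
# Idempotent distributed counter sketch. A dict stands in for a Redis hash
# written via HSET. Each task writes its partition's count to a fixed field,
# so a retried or speculatively re-executed task simply overwrites the same
# field instead of double-counting -- unlike a Spark accumulator, which adds
# every attempt's contribution.

counter_hash = {}  # stands in for: HSET counters part:<id> <value>

def record_count(partition_id, value):
    """Record a partition's count; safe to call more than once per partition."""
    counter_hash[f"part:{partition_id}"] = value

def grand_total():
    """Sum the per-partition fields (HVALS + sum on the Redis side)."""
    return sum(counter_hash.values())

record_count(0, 100)
record_count(1, 250)
record_count(1, 250)  # speculative re-execution of partition 1: harmless
total = grand_total()
```

With a real Redis client the two writes would go out in one pipeline to cut round trips, which is the pipelining point in the list above.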
Re-imagine Data Monitoring with whylogs and Spark (Databricks)
In the era of microservices, decentralized ML architectures and complex data pipelines, data quality has become a bigger challenge than ever. When data is involved in complex business processes and decisions, bad data can, and will, affect the bottom line. As a result, ensuring data quality across the entire ML pipeline is both costly, and cumbersome while data monitoring is often fragmented and performed ad hoc. To address these challenges, we built whylogs, an open source standard for data logging. It is a lightweight data profiling library that enables end-to-end data profiling across the entire software stack. The library implements a language and platform agnostic approach to data quality and data monitoring. It can work with different modes of data operations, including streaming, batch and IoT data.
In this talk, we will provide an overview of the whylogs architecture, including its lightweight statistical data collection approach and various integrations. We will demonstrate how the whylogs integration with Apache Spark achieves large scale data profiling, and we will show how users can apply this integration into existing data and ML pipelines.
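The lightweight statistical profiling idea can be illustrated in a few lines of plain Python. This is a conceptual sketch only, not the whylogs API: the point is that a profile records compact summary statistics rather than the raw data itself:

```python
# Conceptual sketch of lightweight column profiling in the spirit of data
# logging libraries such as whylogs. NOT the whylogs API -- just an
# illustration of summarizing a column into a small, mergeable profile.

def profile_column(values):
    """Collect summary statistics for one column of data."""
    present = [v for v in values if v is not None]
    return {
        "count": len(values),
        "null_count": len(values) - len(present),
        "min": min(present) if present else None,
        "max": max(present) if present else None,
        "approx_distinct": len(set(present)),  # real libraries use sketches
    }

profile = profile_column([3, 1, None, 4, 1, 5])
```

Because profiles like this are tiny and mergeable, they can be computed per batch or per partition (e.g. inside a Spark job) and combined for monitoring, without ever shipping the underlying rows.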
Raven: End-to-end Optimization of ML Prediction Queries (Databricks)
Machine learning (ML) models are typically part of prediction queries that consist of a data processing part (e.g., for joining, filtering, cleaning, featurization) and an ML part invoking one or more trained models. In this presentation, we identify significant and unexplored opportunities for optimization. To the best of our knowledge, this is the first effort to look at prediction queries holistically, optimizing across both the ML and SQL components.
We will present Raven, an end-to-end optimizer for prediction queries. Raven relies on a unified intermediate representation that captures both data processing and ML operators in a single graph structure.
This allows us to introduce optimization rules that
(i) reduce unnecessary computations by passing information between the data processing and ML operators
(ii) leverage operator transformations (e.g., turning a decision tree to a SQL expression or an equivalent neural network) to map operators to the right execution engine, and
(iii) integrate compiler techniques to take advantage of the most efficient hardware backend (e.g., CPU, GPU) for each operator.
We have implemented Raven as an extension to Spark’s Catalyst optimizer to enable the optimization of SparkSQL prediction queries. Our implementation also allows the optimization of prediction queries in SQL Server. As we will show, Raven is capable of improving prediction query performance on Apache Spark and SQL Server by up to 13.1x and 330x, respectively. For complex models, where GPU acceleration is beneficial, Raven provides up to 8x speedup compared to state-of-the-art systems. As part of the presentation, we will also give a demo showcasing Raven in action.
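Rule (ii), turning a decision tree into an equivalent SQL expression so it can run inside the SQL engine, can be sketched in a few lines. The tree encoding below is a made-up minimal format, not Raven's actual intermediate representation:

```python
# Illustration of operator transformation: compile a trained decision tree
# into a nested SQL CASE expression. The dict-based tree format is invented
# for this sketch; Raven works over a unified IR of data and ML operators.

def tree_to_sql(node):
    """Recursively rewrite a binary decision tree as a SQL CASE expression."""
    if "leaf" in node:                      # terminal node: emit its value
        return str(node["leaf"])
    cond = f"{node['feature']} <= {node['threshold']}"
    left = tree_to_sql(node["left"])        # branch taken when cond is true
    right = tree_to_sql(node["right"])      # branch taken otherwise
    return f"CASE WHEN {cond} THEN {left} ELSE {right} END"

tree = {
    "feature": "age", "threshold": 30,
    "left": {"leaf": 0},
    "right": {
        "feature": "income", "threshold": 50000,
        "left": {"leaf": 0},
        "right": {"leaf": 1},
    },
}
sql = tree_to_sql(tree)
```

Once the model is expressed as SQL, the database optimizer can push predicates through it, prune branches, and choose the execution engine, which is exactly the kind of cross-component optimization the abstract describes.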
Processing Large Datasets for ADAS Applications using Apache Spark (Databricks)
Semantic segmentation is the classification of every pixel in an image/video. The segmentation partitions a digital image into multiple objects to simplify/change the representation of the image into something that is more meaningful and easier to analyze [1][2]. The technique has a wide variety of applications ranging from perception in autonomous driving scenarios to cancer cell segmentation for medical diagnosis.
Exponential growth in the datasets that require such segmentation is driven by improvements in the accuracy and quality of the sensors generating the data extending to 3D point cloud data. This growth is further compounded by exponential advances in cloud technologies enabling the storage and compute available for such applications. The need for semantically segmented datasets is a key requirement to improve the accuracy of inference engines that are built upon them.
Streamlining the accuracy and efficiency of these systems directly affects the value of the business outcome for organizations that are developing such functionalities as a part of their AI strategy.
This presentation details workflows for labeling, preprocessing, modeling, and evaluating performance/accuracy. Scientists and engineers leverage domain-specific features/tools that support the entire workflow from labeling the ground truth, handling data from a wide variety of sources/formats, developing models and finally deploying these models. Users can scale their deployments optimally on GPU-based cloud infrastructure to build accelerated training and inference pipelines while working with big datasets. These environments are optimized for engineers to develop such functionality with ease and then scale against large datasets with Spark-based clusters on the cloud.
Massive Data Processing in Adobe Using Delta Lake (Databricks)
At Adobe Experience Platform, we ingest TBs of data every day and manage PBs of data for our customers as part of the Unified Profile offering. At the heart of this is complex ingestion of a mix of normalized and denormalized data with various linkage scenarios, powered by a central Identity Linking Graph. This helps power various marketing scenarios that are activated in multiple platforms and channels like email, advertisements, etc. We will go over how we built a cost-effective and scalable data pipeline using Apache Spark and Delta Lake and share our experiences.
What are we storing?
Multi Source – Multi Channel Problem
Data Representation and Nested Schema Evolution
Performance Trade Offs with Various formats
Go over anti-patterns used
(String FTW)
Data Manipulation using UDFs
Writer Worries and How to Wipe them Away
Staging Tables FTW
Datalake Replication Lag Tracking
Performance Time!
Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23... (John Andrews)
Title: Chatty Kathy: Enhancing Physical Activity Among Older Adults
Description:
Discover how Chatty Kathy, an innovative project developed at the UNC Bootcamp, aims to tackle the challenge of low physical activity among older adults. Our AI-driven solution uses peer interaction to boost and sustain exercise levels, significantly improving health outcomes. This presentation covers our problem statement, the rationale behind Chatty Kathy, synthetic data and persona creation, model performance metrics, a visual demonstration of the project, and potential future developments. Join us for an insightful Q&A session to explore the potential of this groundbreaking project.
Project Team: Jay Requarth, Jana Avery, John Andrews, Dr. Dick Davis II, Nee Buntoum, Nam Yeongjin & Mat Nicholas
Techniques to optimize the PageRank algorithm usually fall into two categories. One is to try reducing the work per iteration, and the other is to try reducing the number of iterations. These goals are often at odds with one another. Skipping computation on vertices which have already converged has the potential to save iteration time. Skipping in-identical vertices, with the same in-links, helps reduce duplicate computations and thus could help reduce iteration time. Road networks often have chains which can be short-circuited before PageRank computation to improve performance; final ranks of chain nodes can be easily calculated. This could reduce both the iteration time and the number of iterations. If a graph has no dangling nodes, the PageRank of each strongly connected component can be computed in topological order. This could help reduce the iteration time and the number of iterations, and also enable multi-iteration concurrency in PageRank computation. The combination of all of the above methods is the STICD algorithm. [sticd] For dynamic graphs, unchanged components whose ranks are unaffected can be skipped altogether.
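The first technique, skipping computation on already-converged vertices, can be sketched with a plain power-iteration PageRank. This is an illustrative toy in Python, not the STICD implementation:

```python
# Minimal PageRank with per-vertex convergence skipping: once a vertex's rank
# change drops below `tol`, it is never recomputed. This trades a little
# accuracy for less work per iteration, as discussed above. Illustrative only.

def pagerank(graph, damping=0.85, tol=1e-10, max_iter=100):
    # graph: {vertex: [out-neighbours]}; assumes no dangling vertices
    n = len(graph)
    rank = {v: 1.0 / n for v in graph}
    converged = set()
    for _ in range(max_iter):
        incoming = {v: 0.0 for v in graph}
        for u, outs in graph.items():
            share = rank[u] / len(outs)     # each vertex splits its rank
            for v in outs:
                incoming[v] += share
        changed = False
        for v in graph:
            if v in converged:
                continue                    # skip converged vertices
            new = (1 - damping) / n + damping * incoming[v]
            if abs(new - rank[v]) < tol:
                converged.add(v)
            rank[v] = new
            changed = True
        if not changed:
            break                           # every vertex has converged
    return rank

g = {"a": ["b"], "b": ["c"], "c": ["a"]}    # a 3-cycle: ranks stay uniform
ranks = pagerank(g)
```

Handling dangling nodes, chains, and component-wise topological ordering would layer on top of this inner loop, which is where the techniques above start interacting.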
1. Mike Ferguson
Managing Director, Intelligent Business Strategies
Data + AI Summit 2021
May 2021
Building the Artificially Intelligent Enterprise
- A Blueprint for Maximising Business Value from AI
3.
About Mike Ferguson
www.intelligentbusiness.biz
mferguson@intelligentbusiness.biz
@mikeferguson1
(+44) 1625 520700
Mike Ferguson is Managing Director of Intelligent Business Strategies Limited. As an
independent IT industry analyst and consultant he specialises in BI / analytics and
data management. With over 39 years of IT experience, Mike has consulted for
dozens of companies on BI/Analytics, data strategy, technology selection, enterprise
architecture, and data management. Mike is also conference chairman of Big Data
LDN, the fastest growing data and analytics conference in Europe. He has spoken at
events all over the world and written numerous articles. Formerly he was a principal
and co-founder of Codd and Date Europe Limited – the inventors of the Relational
Model, a Chief Architect at Teradata on the Teradata DBMS and European Managing
Director of Database Associates. He teaches popular master classes in Data
Warehouse Modernisation, Big Data, Enterprise Data Governance, Master Data
Management, Building, Managing and Operating an Enterprise Data Lake, Machine
Learning and Advanced Analytics, Real-time Analytics, and Data Virtualisation.
4.
About Intelligent Business Strategies
§ UK-based independent IT analyst and consulting firm founded 1992 specialising in
data management and analytics
§ Three main lines of business
Education
• Data Governance & MDM
• Designing, Managing and Operating an Enterprise
Data Lake – Data lake to Data marketplace
• DW Modernisation
• DW Migration to the Cloud
• Machine Learning and Advanced Analytics
• Integrating AI into the Enterprise
• Public classes (anyone)
• On-site classes (single client)
• Customers, vendors, systems integrators
• On-line (public & on-sites)
Consulting
• Customers
• D&A Strategy, Data Architecture
• D&A Technology selection
• D&A Reviews, Data Governance
• Project advisory
• Vendors
• Product strategy
• Product positioning
• Marketing support
• Speaking at vendor events
• White papers
• Webinars
• Venture Capitalists
• Due-diligence, Asset advisory
Research
• Market research
• 4th Industrial
Revolution Survey
• D&A product research
• Data catalogs
• Data Governance
www.intelligentbusiness.biz
5.
Topics
§ Data and analytics – where are we?
§ Transitioning to a self-learning enterprise
• Sorting out the data foundation
• DataOps and MLOps - Component based pipeline development, automated testing and
deployment
• Data and analytics marketplace
• Integrating analytics into business processes
• Reinforcement learning, multi-level performance management and AI driven dynamic
planning
§ Conclusions
6. 6
Topics – Where Are We?
ØData and analytics – where are we?
§ Transitioning to a self-learning enterprise
• Sorting out the data foundation
• DataOps and MLOps - Component based pipeline development, automated testing and
deployment
• Data and analytics marketplace
• Integrating analytics into business processes
• Reinforcement learning, multi-level performance management and AI driven dynamic
planning
§ Conclusions
7. 7
Data And Analytics Today - Many Companies Have Built Multiple DWs And
Marts In Different Parts Of Their Value Chain
[Diagram: ERP, CAD, CRM, SCADA, manufacturing execution and shipping systems feeding separate data warehouses and marts – a Finance DW, a Manufacturing volumes & inventory DW and a Sales & marketing DW – alongside forecasting, planning, product/materials/supplier master data and financial/regulatory reporting & planning]
This makes management and regulatory reporting more challenging, as data
needs to be integrated to see across the value chain
It may also be the case that data is inconsistent across data warehouses,
e.g. different primary keys, data names and DI/DQ jobs for the same data in each DW
The issue here is project-related data integration
8. 8
Multiple Data Warehouses Have Made Self-Service Data Preparation And
Integration The Norm For Self-Service BI Users Trying To Access Data
[Diagram: business analysts combine personal & office data, transaction systems, predictive models and multiple DWs (Finance DW, Materials & Inventory DW), then publish/share, consume/enhance/re-publish and collaborate in a community]
Information overload?
Self-service data integration supposedly improves agility BUT at what cost?
• Data complexity forced on the user
• Reinvention of the wheel
• No ability to share metadata specifications with other tools
• …
9. 9
Challenges
– Ever-Increasing Types Of Data That Businesses Want To Analyse
Type of data / examples / uses:
• Traditional structured data – master data (customer, product, employee, supplier, site, …); transaction data (orders, shipments, returns, payments, adjustments, …)
• Machine-generated data – clickstream web server logs, IVR logs, app server logs, DBMS logs (on-line behaviour analysis, cyber security); consumer IoT and industrial IoT sensor data such as location, temperature, movement, vibration, pressure (product usage behaviour, product or equipment performance)
• Human-generated data – social network data, inbound email, competitor news feeds, documents, voice interaction data (unstructured text, sentiment analysis)
• External data – open government data, weather data; structured and semi-structured data, e.g. JSON, XML, AVRO (sales impact, distribution impact)
10. 10
The Changing Analytical Landscape – Many Organisations Now Have Different
Platforms Optimised For Different Analytical Workloads
Big Data workloads result in multiple platforms now being needed for analytical processing
[Diagram: streaming data → real-time stream processing & decision management; graph DB → graph analysis; Hadoop data store and cloud storage → investigative analysis, data refinery and machine/deep learning model development on multi-structured data (advanced analytics); NoSQL DBMS; analytical RDBMS EDW with DW & marts → traditional query, reporting & analysis; MDM with CRUD on product, asset and customer master data]
11. 11
The Entire Analytical Ecosystem Is Now Available In The Cloud
Several vendors now offer the entire analytical ecosystem on the cloud
Alternatively it can be a hybrid setup
Cloud storage is separated from compute and can underpin
multiple analytical systems, reducing copies of data
[Diagram: the same workloads – streaming analytics as-a-service, traditional query/reporting on an analytical RDBMS EDW with DW & marts, graph analysis, investigative analysis / data refinery, machine/deep learning model development on multi-structured data, a NoSQL DBMS and MDM – all running over shared cloud storage (data lake)]
12. 12
Data Warehouse Migration Is Happening In Many Enterprises
Existing DWs and data marts migrate to a cloud DW DBMS, cloud storage and cloud data marts, including:
• Schema
• Data
• ETL processing and loading
• Metadata
• Users, roles, access security privileges
• Data warehouse operations jobs / scripts
• Dashboards, reports & analytical models
13. 13
Issues - Siloed Approach To Data And Analytics, With Many Tools, Scripts And
Code In Use To Clean, Transform And Integrate Data That Are Not Integrated
[Diagram: separate silos – an EDW with marts fed from structured CRM/ERP/SCM data; a DW appliance for advanced analytics on structured data; analytical tools/apps over multi-structured data; streaming data with analytical models/tools/apps; a NoSQL DB (e.g. graph DB) over multi-structured & structured data; plus MDM fed from CRM/ERP/SCM – each silo with its own data integration and analytical tools]
How many tools, scripts and programs are in use to clean/integrate data?
It is unlikely that metadata is shared across tools
14. 14
Issues
- A Siloed Approach Means Point-to-Point Data Integration And Re-Invention
[Diagram: the same silos and MDM as the previous slide, with point-to-point data integration from the same CRM/ERP/SCM and multi-structured sources into each one]
How many times is the same data extracted and transformed?
It happens again and again for each analytical system
15. 15
Today’s Digital Enterprise Is Running Applications And Storing Data In A Hybrid
Computing Environment Spanning Edge, Multiple Clouds And The Data Centre
[Diagram: edge devices connect through gateways, with data flowing between the edge, cloud computing and the data centre(s)]
16. 16
Challenges – Data Is Being Ingested Into Multiple Types Of Data Store Both
On-Premises And In The Cloud
[Diagram: data is ingested from sources such as Data.Gov into enterprise cloud storage, a NoSQL DBMS, a DW and MDM (CRUD on product, customer and asset master data)]
17. 17
The Distributed Data Landscape
- Data Is Now Stored At The Edge, In Multiple Clouds And In The Data Centre
[Diagram: sensor data at edge devices flows through an edge gateway, with further data held in multiple clouds and the data centre]
18. 18
Challenges – Finding, Managing, Governing And Integrating Data Is Becoming
Increasingly Complex As Data Sources Grow
[Diagram: data sources now include XML/JSON, digital media, RDBMSs, web content, e-mail, flat files, packaged applications, office documents, cloud storage, DW/BI systems, big data applications, cloud-based applications and ECMS, spread across the distributed data landscape of edge devices, edge gateways and the data centre]
“Where is all the customer data?”
More and more data sources now need to be integrated to provide
information for business use
19. 19
But With 000’s Of Data Sources, IT And Business Need To Work Together,
As IT Will Likely Become A Bottleneck
[Diagram: IT builds DQ/DI jobs from OLTP systems, web logs, open data, IoT machine data and social & web data into big data platforms, data warehousing in the cloud, MDM and data virtualisation, while business analysts use self-service data prep]
Can business analysts & data scientists help?
Should IT be expected to do everything?
20. 20
Self-Service BI, Stand-Alone Data Science And Self-Service Data Preparation
[Diagram: every business function – HR, sales, marketing, service, finance, procurement, operations, distribution – plus partners, customers, suppliers, employees and things doing its own self-service data prep over the distributed data landscape]
21. 21
Customers Now Have Major Data Challenges – How Do You Govern Self-
Service Data Preparation To Avoid Chaos In The Enterprise?
[Diagram: IT-built ETL/DQ pipelines feed DWs and marts from SCM/CRM/ERP, while data scientists in sandboxes on HDFS and cloud storage, and self-service BI tools with data prep, each create new insights from web logs, social and cloud data – every group doing its own data prep]
Governance?
“Everyone is blindly integrating data with no attempt to share what we create!!”
22. 22
Challenges - The Danger Of Self-Service Data Preparation
– An Explosion Of Personal Silos!
[Diagram: a dozen copies of the same pattern – sources → data prep tools → a personal data store → analytical tools – each one a silo]
= Garbage In, Garbage Out
Inconsistent data!!
Multiple versions!!
23. 23
OR
Companies Want Organised, Findable, Trusted, Re-Usable Data Assets!
Image source: https://ebcwblog.wordpress.com/2014/10/02/how-to-decorate-with-books/ Image Source: Maughan Library, London (King's College London Library)
25. 25
BI And AI Usage Is Primarily Happening At The Tactical Level With Growing
Use In Operations, But It Is Not Tied Together To Contribute To Common Goals
[Pyramid diagram, with adoption percentages at each level]
• Executive: planning & scorecards (a lot of companies are still using Excel here)
• Middle & operations managers / business analysts: departmental and cross-domain KPI reports, dashboards & some predictions & alerts
• Operations staff: domain-specific analytics/reports, some ML models, alerts, recommendations & very little RPA
Lack of integration and alignment on common business goals
26. 26
Where Are We On Data And Analytics? – It is Not Just Build, It’s About Usage
§ Focus has been on development which is currently fractured and lacking a trusted data
foundation
§ We need to industrialise and speed up the build of data and analytical assets
• Fix the data foundation
• Create a data and analytics factory and speed up the building of data and analytical assets
• Automatic generation of pipelines using the data catalog and metadata
• Augmented data governance and data preparation, autoML, DataOps and MLOps to speed up
development with CI/CD for automated build, test and deploy
§ 2021 and beyond is the era of usage
• Data and analytics marketplace
• Align data and analytical assets with business strategy
• Mobilise the masses to integrate AI into business processes to drive value via low-code / no-code
• Introduce on-demand and event driven analytics
• Create an enterprise action framework
• Alert, recommend, and automate with reinforcement learning to continuously improve
27. 27
Topics – Where Are We?
§ Data and analytics – where are we?
ØTransitioning to a self-learning enterprise
ØSorting out the data foundation
ØDataOps and MLOps - Component based pipeline development, automated testing and
deployment
• Data and analytics marketplace
• Integrating analytics into business processes
• Reinforcement learning, multi-level performance management and AI driven dynamic
planning
§ Conclusions
29. 29
Enterprise Data Fabric Software
Key Requirements – We Need A Data Catalog To Automatically Discover What
Data Is Available, Its Quality, Sensitivity And Where It Is Across The Landscape
[Diagram: a data catalog crawling data at edge devices, gateways and the data centre]
Automatic data discovery (crawl)
Automatic discovery, automatic mapping to a common vocabulary in a business glossary
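To make the idea concrete, here is a minimal sketch of what automatic discovery might do: crawl source tables, record every column in a catalog, and flag potentially sensitive columns by name pattern. All names here (SOURCES, PII_PATTERNS, discover) are illustrative assumptions, not the API of any particular catalog product.

```python
import re

# Illustrative source metadata a crawler might return (assumed, not a real system)
SOURCES = {
    "crm.customers": ["customer_id", "full_name", "email", "date_of_birth"],
    "erp.orders": ["order_id", "customer_id", "order_date", "amount"],
}

# Naive patterns for flagging potentially sensitive columns
PII_PATTERNS = [r"name", r"email", r"birth", r"ssn", r"phone"]

def discover(sources):
    """Crawl sources and build catalog entries with a simple sensitivity tag."""
    catalog = []
    for table, columns in sources.items():
        for col in columns:
            sensitive = any(re.search(p, col) for p in PII_PATTERNS)
            catalog.append({"table": table, "column": col, "sensitive": sensitive})
    return catalog

catalog = discover(SOURCES)
print([e["column"] for e in catalog if e["sensitive"]])
# → ['full_name', 'email', 'date_of_birth']
```

A real catalog would add profiling statistics (null rates, value distributions) and ML-assisted semantic tagging on top of this skeleton.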
30. 30
Key Technology Requirements – Need Data Fabric Software To Connect To,
Govern & Integrate Data Across Edge, Multiple Clouds And Data Centre
[Diagram: enterprise data fabric software (auto-generated D&A pipelines) and a data catalog spanning edge devices, gateways and the data centre]
Data fabric software helps avoid or reduce the chances of data silos
Capabilities: data discovery, profiling, semantic tagging, data catalog, data governance, data preparation / integration, APIs, MDM
It should be possible to automatically generate data & analytics pipelines from the metadata mappings of sources to the business glossary already in the catalog
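One way to picture pipeline generation from catalog metadata: given mappings from source columns to business glossary terms, emit the select-and-rename steps of a pipeline automatically. The mapping table and generated SQL below are assumptions for illustration, not the output of any specific data fabric product.

```python
# Hypothetical catalog mappings: source column -> business glossary term
MAPPINGS = {
    "crm.customers": {"cust_no": "customer_id", "nm": "customer_name"},
    "erp.orders": {"ord_no": "order_id", "cust": "customer_id"},
}

def generate_pipeline_sql(mappings):
    """Generate one SELECT-with-rename step per source from the metadata mappings."""
    steps = []
    for table, cols in mappings.items():
        select = ", ".join(f"{src} AS {term}" for src, term in cols.items())
        steps.append(f"SELECT {select} FROM {table}")
    return steps

for step in generate_pipeline_sql(MAPPINGS):
    print(step)
# → SELECT cust_no AS customer_id, nm AS customer_name FROM crm.customers
# → SELECT ord_no AS order_id, cust AS customer_id FROM erp.orders
```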
31. 31
Create A Data Lake And Information Supply Chain To Curate ‘Business Ready’
Data And Analytical Assets Published In A Marketplace For Users To Consume
[Diagram: data from IoT, RDBMS, office docs, social, cloud, clickstream / web logs, XML/JSON, web services, NoSQL and files flows through an information supply chain of zones – ingestion zone → curation zone → trusted zone (common vocabulary) – via Data Fabric ELT processing, with curation processes run as CI/CD DataOps pipelines]
Business-ready data assets are published in a data & analytics marketplace (info catalog); information consumers access the marketplace to shop for business-ready data and analytical assets for their projects
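A toy sketch of the supply-chain idea: records land raw in the ingestion zone and are promoted to the trusted zone only when curation checks pass. The record shapes and the curation rule are assumptions for illustration.

```python
# Hypothetical records landing in the ingestion zone
ingestion_zone = [
    {"customer_id": "C1", "email": "a@example.com"},
    {"customer_id": None, "email": "bad-row"},   # fails curation
    {"customer_id": "C2", "email": "b@example.com"},
]

def curate(record):
    """Curation-zone check: require a key and a plausible email."""
    return record["customer_id"] is not None and "@" in record["email"]

# Only curated records are promoted to the trusted zone (common vocabulary)
trusted_zone = [r for r in ingestion_zone if curate(r)]
print(len(trusted_zone))  # → 2
```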
32. 32
Trusted Business Ready Data In An Enterprise Data Marketplace For Users To
Consume And Use
Data available as a Service
Master Data
• Customers
• Products
• Suppliers
• Assets
• Employees
• Materials
Transaction Data
• Orders
• Shipments
• Payments
• Adjustments
• Returns
Business-ready data products are often logical entities
Build once, reuse everywhere
33. 33
What Is DataOps?
- Continuous Collaborative Data Curation, Testing And Deployment
§ DataOps applies the use of DevOps to the
development of data and analytical pipelines to
produce trusted, integrated data and analytical assets
• Data curation pipelines
• BI Reports, dashboards and stories
• Predictive models
• Prescriptive models / decision services
§ The objective is to accelerate the creation of trusted
data and analytical assets via:
• Continuous component based development
– Data ingestion, cleansing, transformation, matching
and integration services
• Increased reuse of component-based services in pipelines
• Deployment automation
[Diagram: DataOps turns raw data into trusted data – a high-value trusted data asset and/or insights available for consumption]
34. 34
DataOps Data And Analytics Pipelines Should Follow A Modular Design To
Enable Component Based Development And Orchestration
§ The pipeline is broken into smaller separately executable components for each distinct unit of work
§ Each component can be invoked as a service
§ Each component may itself be a mini pipeline
[Diagram: a data & analytical pipeline of components, each composed of tasks, under a pipeline execution orchestration layer, turning data from edge devices, gateways and the data centre into data products (assets) and analytical products (assets)]
Pipeline orchestration manages the component execution, while the components do the actual work
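The component-based design above can be sketched as plain functions sequenced by a thin orchestrator: the orchestrator manages execution order while the components do the work. The component names (ingest, cleanse, load) and their behaviour are invented for illustration; in practice each component would be a separately deployable, separately invokable service.

```python
# Each component is a separately executable unit of work; names are illustrative
def ingest(_):
    return [" Alice ", "BOB", " carol"]

def cleanse(rows):
    # Trim whitespace and normalise capitalisation
    return [r.strip().title() for r in rows]

def load(rows):
    return {"loaded": rows}

def orchestrate(pipeline, payload=None):
    """Orchestration sequences the components; the components do the actual work."""
    for component in pipeline:
        payload = component(payload)
    return payload

result = orchestrate([ingest, cleanse, load])
print(result)  # → {'loaded': ['Alice', 'Bob', 'Carol']}
```

Because each component is independently invokable, any one of them can be swapped, tested, or reused in another pipeline without touching the rest.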
35. 35
Types Of Components In A DataOps Data Analytics Pipeline
A component may be a single task or a mini-flow of tasks
Type of component / examples:
• Data ingestion components – file ingestion, database table ingestion service, stream ingestion service
• Data transportation components
• Data governance components – data validation services, data cleansing services (e.g. address cleansing / enrichment), data privacy masking service, logging and auditing services
• Data transformation components
• Data matching and integration components
• Analytical components – voice-to-text conversion, customer segmentation clustering service, customer sentiment scoring model, customer propensity-to-churn scoring model
• Data loading components
• Action components – alerts, recommendations, automation, …
36. 36
DataOps – Component Based Development Needs A Common Version Control
System Irrespective Of Single Data Fabric Or Best-of-Breed Tools Being Used
[Diagram: an orchestrated data & analytical pipeline of components under version control, each component carrying its tests (e.g. row counts, data error checks, comparisons, performance), its container and its run-time configuration]
Each component of the pipeline is a new, independent branch; components are merged into the main branch as they are completed
Branch and merge enables collaborative development, with different people working on different components
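The per-component tests that the version-control picture calls for (row counts, data error checks) can be written as ordinary assertions that CI runs before a branch is merged. The transform component and its checks below are assumed examples, not any specific DataOps tool's API.

```python
def transform(rows):
    """Example component under test: drop rows with a null key."""
    return [r for r in rows if r.get("id") is not None]

def test_row_counts():
    rows_in = [{"id": 1}, {"id": None}, {"id": 3}]
    rows_out = transform(rows_in)
    # Row-count check: exactly the known-bad rows are dropped
    assert len(rows_out) == len(rows_in) - 1
    # Data-error check: no null keys survive the component
    assert all(r["id"] is not None for r in rows_out)

test_row_counts()
print("component tests passed")
```

In a CI/CD setup these assertions run automatically on every commit, so a component cannot be merged to main while its checks fail.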
37. 37
Getting The Foundation Right By Building Trusted Data Assets
- From Data Lake To Data Marketplace
[Diagram: data from IoT, RDBMS, office docs, social, cloud, clickstream / web logs, XML/JSON, web services, NoSQL and files lands as raw data in a landing zone, passes through data ingestion and data curation / enrichment, and arrives in the trusted zone of the data lake as trusted data assets – customer, product, orders, shipments, payments – i.e. ready-made data products]
Trusted assets are published to a data marketplace (catalog) and provisioned to a DW, a graph DB, stream processing and data virtualisation (trusted virtual data assets), for consumption by BI tools, data science and applications
38. 38
Topics – Where Are We?
§ Data and analytics – where are we?
ØTransitioning to a self-learning enterprise
• Sorting out the data foundation
• DataOps and MLOps - Component based pipeline development, automated testing and
deployment
ØData and analytics marketplace
• Integrating analytics into business processes
• Reinforcement learning, multi-level performance management and AI driven dynamic
planning
§ Getting started
39. 39
What Is An Enterprise Data And Analytics Marketplace?
Enterprise Data & Analytics Marketplace: a catalog containing ready-made,
trusted data and analytical assets, available as services, with common data
names documented in a business glossary and full metadata lineage, tagged
and organised to make them easy to find, access, share and reuse across
the enterprise
40. 40
A Data & Analytics Marketplace Should Have Search, Faceted Search and a
Shopping Cart Similar To That In E-Commerce Web-Sites (e.g. Amazon)
Select the products you want and add them to your cart
Product examples: Informatica Axon Data Marketplace, Collibra, Zaloni
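Faceted search over catalog entries can be pictured as filtering on tag facets, much like narrowing results on an e-commerce site. The asset entries and facet names below are invented for illustration; real marketplaces add free-text search and ranking on top.

```python
# Hypothetical marketplace entries with facet tags
ASSETS = [
    {"name": "Customer 360", "type": "data", "domain": "sales", "certified": True},
    {"name": "Churn model", "type": "model", "domain": "sales", "certified": True},
    {"name": "Ad-hoc extract", "type": "data", "domain": "finance", "certified": False},
]

def faceted_search(assets, **facets):
    """Keep only assets matching every selected facet value."""
    return [a for a in assets
            if all(a.get(k) == v for k, v in facets.items())]

hits = faceted_search(ASSETS, domain="sales", certified=True)
print([a["name"] for a in hits])  # → ['Customer 360', 'Churn model']
```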
41. 41
Reducing Time To Value
– Shop For Trusted Ready-Made Data And Deliver Value Rapidly
Information consumers access the data marketplace (info catalog) to shop for ready-to-go data and analytical assets, then rapidly assemble them:
• BI insights pipeline: trusted data service → query service → BI report / dashboard / story
• Predictive insights pipeline (rapid assembly): trusted data service → analytical service
• Prescriptive analytical pipeline (rapid assembly): trusted data service → analytical service → decision service → BI report / dashboard / story
• Data enrichment: trusted data services enriched into a new virtual data service
42. 42
Data Marketplace Operations – Information Consumers Can Enrich Data And
Create New Insights To Also Publish In The Marketplace
[Diagram: the same rapid-assembly pipelines as the previous slide – BI insights, predictive insights, prescriptive analytical and data enrichment – built from trusted data services shopped from the marketplace]
Consumers can publish newly created assets back into the catalog
43. 43
Topics – Where Are We?
§ Data and analytics – where are we?
ØTransitioning to a self-learning enterprise
• Sorting out the data foundation
• DataOps and MLOps - Component based pipeline development, automated testing and
deployment
• Data and analytics marketplace
ØIntegrating analytics into business processes
• Reinforcement learning, multi-level performance management and AI driven dynamic
planning
§ Conclusions
44. 44
Intelligent Business Requires BI, Analytics And AI To Be Integrated Into
Processes To Help Empower Everyone
Business Processes + Integrated BI & AI Services = Self-Learning Artificially Intelligent Business
Integration channels: mobile apps, web apps, office portal / collaboration workspaces (e.g. Teams), office automation, business process management, and process and application integration via REST APIs and iPaaS / enterprise service bus
Built on a common vocabulary: data governance services, data quality / data integration services, a data & analytics asset marketplace, data assets as a service, on-demand & real-time BI & AI services, reinforcement learning services and multi-level corporate performance management
The result is intelligent process behaviour: active contribution-based CPM, real-time analytics, automated alerts and recommendations, on-demand & event-driven analytics & BI, and automated actions (RPA) in a self-learning business
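As one hedged illustration of embedding a decision service in a process activity: a process step consults a churn-scoring service before deciding how to route a customer. In a real deployment the service would sit behind a REST API on the ESB/iPaaS; the scoring rule, field names and thresholds here are placeholders, not a real model.

```python
# Stub decision service; in practice this would sit behind a REST API on the ESB/iPaaS
def churn_decision_service(customer):
    """Return a recommended action from a (stubbed) churn score."""
    score = 0.9 if customer["missed_payments"] > 1 else 0.1  # placeholder model
    return {"score": score, "action": "offer_retention_deal" if score > 0.5 else "none"}

def handle_service_call(customer):
    """A process activity that consults analytics before proceeding."""
    decision = churn_decision_service(customer)
    if decision["action"] == "offer_retention_deal":
        return f"Route {customer['id']} to retention team"
    return f"Continue standard handling for {customer['id']}"

print(handle_service_call({"id": "C42", "missed_payments": 3}))
# → Route C42 to retention team
```

The point of the sketch is the shape of the integration: the process activity stays simple, and the analytical logic lives in a reusable service any process can call.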
45. 45
Decisions Need To Be Made Using Trusted Data And Analytics
[Diagram: business functions – HR, sales, marketing, service, finance, procurement, operations, distribution – and partners, customers, suppliers and employees drawing on trusted data assets via the data & analytics marketplace (catalog)]
• Strategic decisions (tens): set business strategy – objectives, targets & priorities; escalated critical operational decisions
• Tactical decisions (hundreds): escalated operational decisions
• Operational decisions (thousands)
46. 46
Trusted Data Assets
What We Want Is Trusted Data And Analytical Assets Available As A Service For
Reuse Everywhere In A Data-Driven Enterprise
[Diagram: trusted data, analytics & decision services from the D&A marketplace (catalog) serving every business function – HR, sales, marketing, service, finance, procurement, operations, distribution – plus partners, customers, suppliers, employees and things in the intelligent business]
Commonly understood, trusted data and analytical services available across the enterprise
All trusted data is described using a common vocabulary and ontology
47. 47
Related Data And Analytical Services Need To Be Co-Ordinated To Maximise
The Business Impact Of Decisions Towards Common Goals
[Diagram: the same business functions and decision levels – strategic decisions (tens), tactical decisions (hundreds), operational decisions (thousands) – drawing on trusted data assets]
Data assets, BI reports, models, alerts, recommendations and automated actions all need to be classified by business goal, to know:
• What data and analytical assets align with what business goals
• How they work together to contribute towards achieving those goals
• How decision effectiveness and contribution can be measured at all levels, to see whether related decisions are having an impact
• Which decisions have the greatest impact
48. 48
Customers Need To Understand Where and At What Levels Analytics Can Be
Deployed To Guide, And Automate To Enable Mass Contribution To Objectives
Marketplace assets
classified by objective
• Data assets
• BI assets (reports, dashboards)
• On-demand & event driven
predictive assets
• On-demand & event driven
prescriptive assets
• Auto alerting services,
• Recommendation services
• RPA services
[Diagram: business strategy sets strategic objectives, flowing down through strategic decisions (tens), tactical decisions (hundreds) and operational decisions (thousands), supported by CPM/planning and the data & analytics marketplace (catalog)]
AI integration – one approach does NOT fit all: who / what needs which asset, and how should it be integrated to achieve the objective?
49. 49
Need To Integrate Insights, AI And Automation Into Business Process Activities
To Help Achieve Business Objectives During Process Execution
Example: an order entry, fulfilment and tracking process
Which process activities are performed:
• Automatically by applications / software?
• Manually by people?
• By people using operational apps?
• By people using mobile apps?
How can insights, alerts, recommendations and automation – e.g. Robotic Process Automation (RPA) – be leveraged to help improve business performance in specific process activities?
50. 50
Trusted Data Assets
We Need To Mobilise the Masses To Integrate Data And AI Services Into
Processes Using A Low Code / No Code Approach (Citizen Developers)
[Diagram: the same architecture as the earlier intelligent-business slide – trusted data, analytics & decision services from the D&A marketplace (catalog) available to every business function, partner, customer, supplier, employee and thing]
Commonly understood, trusted data and analytical services available across the enterprise
51. 51
Right Time Business Optimisation Means Monitoring The Pulse Of Business
Operations – Looking For Event Patterns (Business Conditions) Needing Action
§ The event-driven enterprise where every transaction and event is monitored
§ Events need to be captured and analysed to automatically detect business conditions
that are acted upon in time to keep the business optimised
§ We must monitor the pulse of business as it happens
Example business conditions: a changed order, a cancelled order from an important customer, a defaulted loan payment, sales vs inventory, a late delivery, a shipment delay, an overdue payment
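Detecting such business conditions amounts to matching patterns over the event stream as it flows past. A minimal sketch, with invented event shapes and rule logic:

```python
# Illustrative event stream; field names are assumptions
EVENTS = [
    {"type": "order_changed", "customer": "C1", "important": False},
    {"type": "order_cancelled", "customer": "C7", "important": True},
    {"type": "payment_overdue", "customer": "C3", "important": False},
]

def detect_conditions(events):
    """Flag business conditions needing action as events stream past."""
    alerts = []
    for e in events:
        if e["type"] == "order_cancelled" and e["important"]:
            alerts.append(f"ALERT: important customer {e['customer']} cancelled an order")
        if e["type"] == "payment_overdue":
            alerts.append(f"ALERT: overdue payment from {e['customer']}")
    return alerts

print(detect_conditions(EVENTS))
```

A production system would run equivalent rules continuously inside a stream-processing engine rather than over a finished list, but the pattern-matching idea is the same.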
52. 52
Layers Of AI Agents Automatically Monitoring The Business At Different Levels
To Ensure Contribution To Common Business Goals For Greatest Reward
[Diagram: layers of monitoring agents consume events at the operations staff, manager and executive levels, with multiple agents aligned to common objectives and reinforcement-learning-based recommendations & actions flowing back]
The next frontier is continuous observability PLUS reinforcement learning to grow the reward
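"Growing the reward" with reinforcement learning can be illustrated with the simplest possible agent: an epsilon-greedy bandit that learns which action (alert, recommendation, or nothing) earns the most reward over time. The actions and simulated rewards are assumptions for illustration; production reinforcement learning would be far richer than this sketch.

```python
import random

# A minimal epsilon-greedy agent choosing between candidate actions; rewards are simulated
random.seed(0)
ACTIONS = ["send_alert", "recommend_discount", "do_nothing"]
TRUE_REWARD = {"send_alert": 0.2, "recommend_discount": 0.8, "do_nothing": 0.0}  # assumed

values = {a: 0.0 for a in ACTIONS}   # estimated value of each action
counts = {a: 0 for a in ACTIONS}     # times each action was tried

def choose(epsilon=0.1):
    if random.random() < epsilon:        # explore occasionally
        return random.choice(ACTIONS)
    return max(values, key=values.get)   # otherwise exploit the best-known action

for _ in range(500):
    action = choose()
    reward = TRUE_REWARD[action] + random.gauss(0, 0.05)  # noisy simulated reward
    counts[action] += 1
    # Incremental mean update of the action's estimated value
    values[action] += (reward - values[action]) / counts[action]

best = max(values, key=values.get)
print(best)  # expected to converge to 'recommend_discount'
```

The monitoring agents in the diagram play the same loop at enterprise scale: act, observe the outcome, and shift future recommendations towards the actions that contribute most to the common objectives.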
53. 53
Topics – Where Are We?
§ Data and analytics – where are we?
ØTransitioning to a self-learning enterprise
• Sorting out the data foundation
• DataOps and MLOps - Component based pipeline development, automated testing and
deployment
• Data and analytics marketplace
• Integrating analytics into business processes
ØMulti-level performance management and AI driven dynamic planning
§ Conclusions
54. 54
It Is Not Just About Analytics - Planning Needs To Span BI/Analytics And Business Processes
For Continuous Monitoring, Dynamic Planning, AI-Driven Resource And Process Optimisation
[Diagram: operational apps and business processes across sales, marketing, service, finance, procurement, operations and distribution emit process events, data feeds and streaming data into analytical systems (DW, graph), which drive continuous reinforcement-learning-based performance management, dynamic planning & automatic resource allocation PLUS dynamic process optimisation, acting back on operations via planning at the executive, manager and operations-staff levels]
The next frontier is continuous monitoring of performance vs objectives, with data-driven, AI-assisted dynamic planning and resource allocation
55. 55
Topics – Where Are We?
§ Data and analytics – where are we?
§ Transitioning to a self-learning enterprise
• Sorting out the data foundation
• DataOps and MLOps - Component based pipeline development, automated testing and
deployment
• Data and analytics marketplace
• Integrating analytics into business processes
• Multi-level performance management and AI driven dynamic planning
ØConclusions
56. 56
Intelligent Business Strategies Architecture For The Artificially Intelligent
Business – From BI To Data-Driven Artificially Intelligent Business
Data, analytical, decision and reinforcement learning services guide everyone in every business process to contribute to meeting common strategic goals
[Architecture diagram, top to bottom:]
• Intelligent operations workspace per person: my objectives, my business activities (process tasks), my reports, my KPIs, my alerts, my recommendations, my actions, my team, my contribution to business goals, my communities
• Artificially intelligent business functions – sales, marketing, procurement, finance, service, HR, risk management, front office, back office, operations & risk – co-ordinated via business process orchestration / RPA and key performance indicators
• Multi-level AI (RL) driven dynamic planning, resource & process optimisation
• Delivery channels: mobile apps, web apps, Teams / SharePoint, with single sign-on for employees, suppliers, partners and customers
• Services: data & BI services; predictive analytics, decision services & RL; event-driven ESB / iPaaS / APIs
• Foundation: common vocabulary, catalog & integration platform; D&A asset marketplace (catalog); trusted data on a data fabric spanning edge, data centre and multiple clouds
57. 57
Conclusions
- Software Requirements For “Always On” Artificially Intelligent Business Optimisation
§ Data catalog and data fabric
• Common shared business vocabulary based on common data names and common data definitions
• Cross referencing and mapping of disparate data definitions to common definitions
• Metadata lineage to prove how metrics are calculated, i.e. TRUSTED metrics
§ Automated generation of scalable dynamic data pipelines
§ Corporate performance management / planning integrated with analytical assets
§ Corporate performance management integrated with business process management
§ Continuous monitoring of events that occur in business process operations including support for:
• Automatic event driven data integration
• Automatic scoring and analysis
• Automatic decision making (prescriptive analytics)
§ Automated enterprise alerting, on-demand recommendations, guided analysis, guided and
automated actions
§ Integration with collaboration tools to share insights, recommendations, and decisions with other
people across the enterprise and beyond