What Happened of Note
in 1H 2020 in Enterprise
Advanced Analytics
Presented by: William McKnight
President, McKnight Consulting Group
williammcknight
www.mcknightcg.com
(214) 514-1444
#AdvAnalytics
William McKnight
President, McKnight Consulting Group
• Frequent keynote speaker and trainer internationally
• Consulted to Pfizer, Scotiabank, Fidelity, TD Ameritrade, Teva
Pharmaceuticals, Verizon, and many other Global 1000
companies
• Hundreds of articles, blogs and white papers in publication
• Focused on delivering business value and solving business
problems utilizing proven, streamlined approaches to
information management
• Former Database Engineer, Fortune 50 Information Technology
executive and Ernst & Young Entrepreneur of the Year Finalist
• Owner/consultant: 2018 and 2017 Inc. 5000 strategy &
implementation consulting firm
• 30 years of information management and DBMS experience
2
McKnight Consulting Group Offerings
Strategy
• Trusted Advisor
• Action Plans
• Roadmaps
• Tool Selections
• Program Management
Training
• Classes
• Workshops
Implementation
• Data/Data Warehousing/Business Intelligence/Analytics
• Master Data Management
• Governance/Quality
• Big Data
3
COVID-19
4
Maslow’s Hierarchy of Needs
5
COVID-19
• Impacted worldwide operations
• Sudden, accelerated disruption
• No discontinuity event like this
• Impacts customer operations
• Impacts health & wellbeing
• Need to step back and plan
6
WFH Pros/Cons
• Lost personal touch
• Some feel more connected; meeting family, pets
• Life has slowed down, less “I don’t have time”
– Doing things there was never time for before: upgrades,
maintenance, processes, documentation, learning
• Working earlier, later
• Virtual hiring
• Remote conferences
7
COVID-19 impacts Data Protection
• Security concerns: people working from home
• High-speed access issues
• Zoom: was great at first, then became a security concern
• Sharing confidential info
• Reconsidering tooling, balancing familiarity with
security
8
Preparing Offices for Return
• Different geographical situations
• Social distancing parameters
• Limited population, desks, stockpile sanitizer,
cleaning
• BUT distance is working. Surprise! Some % will
stay offsite
– Or multiple people to 1 seat arrangements
– Some projects done all remote
9
Keeping Focus
• Keep focus on how customers respond
• Remove pressure from salesforces
• Help customers survive
• Resilient companies will come out ahead
10
Those Who Are Less Impacted
• Cloud-First
• Microservices-Based
• Data is a separate function
• Agile Development
• Master Data
11
Specific Events
12
Consortiums
• The COVID-19 High Performance Computing
Consortium
– Bringing together the Federal government, industry, and academic leaders to
provide access to the world’s most powerful high-performance computing
resources in support of COVID-19 research.
• Open Community
• 30+ Members
• 400+ Petaflops
• 100k+ Nodes
• 50+ Projects
13
Healthcare
• Microsoft + JAX labs for Healthcare AI
– Genomic medicine researchers at the laboratory
have been using artificial intelligence to help
manage the vast amount of research data needed
to power its precision oncology initiatives
• Virtual visits
• Tele-health
14
Cybersecurity
• Companies placed big bets on securing applications and
unmanaged IoT devices as well as risk and compliance in
the first half of 2020
• Amazon Web Services purchased cybersecurity software
company Sqrrl
– Advanced threat hunting capabilities were expected to align
well with Amazon GuardDuty
– Sqrrl analyzes big data to hunt cyberthreats, helping
companies identify and address them faster
– Utilizes linked data, machine learning, user and entity
behavior analysis, risk scoring, and big data technologies to
uncover malicious patterns and anomalies hidden within
security data sets
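As a rough illustration of what behavior analysis with risk scoring can mean in practice, the sketch below scores each entity by how far today's activity deviates from its own historical baseline. This is not Sqrrl's method: the function names, the z-score approach, the sample data, and the threshold of 3 are all assumptions chosen for the example.

```python
from statistics import mean, pstdev
from typing import Dict, List


def risk_scores(history: Dict[str, List[float]],
                today: Dict[str, float]) -> Dict[str, float]:
    """Score each entity by its deviation from its own baseline (a z-score).

    Hypothetical helper: `history` maps an entity (user, host, device) to past
    daily activity counts, `today` maps it to the current count.
    """
    scores: Dict[str, float] = {}
    for entity, baseline in history.items():
        mu = mean(baseline)
        sigma = pstdev(baseline)
        observed = today.get(entity, mu)  # no observation means no deviation
        if sigma == 0:
            # A perfectly flat baseline: any change at all is maximally unusual.
            scores[entity] = 0.0 if observed == mu else float("inf")
        else:
            scores[entity] = abs(observed - mu) / sigma
    return scores


def flag(scores: Dict[str, float], threshold: float = 3.0) -> List[str]:
    """Return entities whose score exceeds the (assumed) threshold, sorted."""
    return sorted(e for e, s in scores.items() if s > threshold)


# Illustrative data only: logins per day for two made-up users.
history = {"alice": [10, 12, 11, 9, 13], "bob": [100, 110, 90, 100, 100]}
today = {"alice": 40, "bob": 105}
flagged = flag(risk_scores(history, today))
```

Scoring against each entity's own baseline is the point of the design: a count of 40 is alarming for one user and routine for another.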
15
Transportation Technology
• Amazon acquires autonomous-vehicle company Zoox
• Experts predict that Amazon will focus more on
integrating the technology into its distribution
network than building a fleet of cars.
16
In the Data Enterprise
17
Trends to Continue 2H20
• Graph Solutions
• Data Visualization
• Stream Processing
• Artificial Intelligence
18
Hot Projects
• Fraud Detection
• Supply Chain Optimization
• Preventive Maintenance
• Customer Churn
19
AI is disruptive
Data is the Foundation
Data’s New Highest Use is Training AI Algorithms
20
Data Lakes
• The Rise of the LakeHouse
• Explosion in Sensor-Based Time-Series Data and
Edge AI
• Leveraging Cloud Storage for Data Lakes
• Data Integration Automation
• Retaining structure in structured data
• Data quality additions
21
Realization that full best-of-breed (BOB) has a price
• Piecemeal architecture with a variety of tools from
a number of vendors
• If an organization wants to build its modern
data ecosystem using “best-of-breed” solutions,
the overwhelming challenges will be
interoperability, cost, and complexity, not to
mention time-to-value
– It is not all bad, as there are some interoperability
beacons of hope
• Understanding, predicting and managing costs is
difficult
• Complexity of the architecture
22
New Technology Stacks
23
Modern Platform Examples
Capability | Single Platform example | Single Cloud – Azure | Single Cloud – AWS | Multi-vendor example
Data Engineering | CDP Data Hub | Azure HDInsight | Amazon Elastic MapReduce (EMR) | Qubole
Data Analytics | CDP Data Warehouse | Azure Synapse | Amazon Redshift | Snowflake
Data Science | Cloudera Machine Learning | Azure Machine Learning | Amazon SageMaker | Databricks
Data Catalog | CDP Data Catalog | Azure Data Catalog | AWS Glue Data Catalog | Alation
Overarching Workload Management | CDP Workload Manager | None | None [2] | Not applicable
Data Movement | CDP Data Replication | Azure Data Factory (ADF) | AWS Glue or Data Pipeline | Talend
Overarching Deployment | CDP Management Console | Azure Portal | AWS Portal | Not applicable [1]
Overarching Security | CDP/SDX | Azure Active Directory | Identity Access Management (IAM) | Not applicable [1]
[1] Not overarching: only individual applications have their own workload management features
[2] A multi-vendor approach lacks single overarching workload management, deployment, and security mechanisms. Each individual vendor
product will likely have its own means to deliver these features.
24
MLOps
• MLOps applies DevOps principles to ML delivery
• The ML process primarily revolves around creating, training and deploying
models
• Once trained and validated, models are deployed into an architecture that
can deal with large quantities of (often streamed) data, to enable insights to
be derived
• Development of such models can benefit from an iterative approach, so the
domain can be better understood, and the models improved
• MLOps also needs a highly automated pipeline of tools, repositories to store
and keep track of models, code, and data lineage, and a target environment
that can be deployed into at speed
• The result is an ML-enabled application: MLOps requires data scientists to
work alongside developers, and can therefore be seen as an extension of
DevOps to encompass the data and models used for ML
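The points above can be sketched as a tiny automated pipeline: train, validate, record the model with its code and data lineage, and deploy only if a quality gate passes. This is a minimal illustration, not any product's API. The names (ModelRegistry, run_pipeline), the in-memory registry, the least-squares model, and the error threshold are all assumptions.

```python
from dataclasses import dataclass, field
from statistics import mean
from typing import Callable, List, Optional, Tuple

Sample = Tuple[float, float]  # (feature, target)


@dataclass
class ModelVersion:
    version: int
    predict: Callable[[float], float]
    data_lineage: str        # which data snapshot trained this model
    code_ref: str            # which revision of the training code was used
    validation_error: float  # mean absolute error on held-out data


@dataclass
class ModelRegistry:
    """In-memory stand-in for a repository tracking models, code, and lineage."""
    versions: List[ModelVersion] = field(default_factory=list)
    deployed: Optional[int] = None  # version currently serving, if any

    def register(self, predict: Callable[[float], float], data_lineage: str,
                 code_ref: str, validation_error: float) -> ModelVersion:
        mv = ModelVersion(len(self.versions) + 1, predict,
                          data_lineage, code_ref, validation_error)
        self.versions.append(mv)
        return mv


def train(data: List[Sample]) -> Callable[[float], float]:
    """Fit y = slope * x + intercept by least squares (needs 2+ distinct x)."""
    x_bar = mean([x for x, _ in data])
    y_bar = mean([y for _, y in data])
    var = sum((x - x_bar) ** 2 for x, _ in data)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in data) / var
    intercept = y_bar - slope * x_bar
    return lambda x: slope * x + intercept


def validate(predict: Callable[[float], float], holdout: List[Sample]) -> float:
    """Mean absolute error on held-out samples."""
    return mean([abs(predict(x) - y) for x, y in holdout])


def run_pipeline(registry: ModelRegistry, train_data: List[Sample],
                 holdout: List[Sample], data_lineage: str, code_ref: str,
                 max_error: float) -> ModelVersion:
    """One iteration: train, validate, register, and deploy if the gate passes."""
    predict = train(train_data)
    error = validate(predict, holdout)
    mv = registry.register(predict, data_lineage, code_ref, error)
    if error <= max_error:  # quality gate: only good models reach deployment
        registry.deployed = mv.version
    return mv


# Two iterations with made-up data: the first passes the gate, the second fails.
registry = ModelRegistry()
run_pipeline(registry, [(0, 1), (1, 3), (2, 5)], [(3, 7)],
             "snapshot-a", "rev-1", max_error=0.5)
run_pipeline(registry, [(0, 0), (1, 0), (2, 9)], [(3, 7)],
             "snapshot-b", "rev-2", max_error=0.5)
```

Every version is registered, including the one that fails the gate, so the history of what was tried stays auditable while the deployed version stays at the last model that passed.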
25
Data Team Dynamics
• Business departments have clearly staked a claim in building their
architectures
– Still need dedicated technology professionals to do the work
– The notion of an "IT professional" is alive and well
– The reporting structure is more complicated than ever
• Acknowledgement of the need for data deployments to be near the
business unit in organization charts
• Strategists and implementors are seeing a reduction in the challenges
posed by internal friction and resistance to change
– Dependence on certain individuals is lessened with the cloud, and
many are declaring their organization unshackled from resistance
to progress
– Acceleration of acceptance and some challenging personnel
moments inside the data apparatus in organizations
26
Second Thursday of Every
Month, at 2:00 ET
