The document discusses trends in enterprise advanced analytics for 2021 and beyond. Key trends include the continuation of remote work; a strong rebound in tech spending led by cloud capabilities; leading organizations deepening their focus on AI/ML, with model deployment taking center stage; more edge AI; the rise of data lakes; new technology stacks built around data fabrics and AI pipelines; increased automation; the growing prevalence of open source; Kubernetes becoming the standard analytics stack; and the first signs of general AI. Winning approaches for 2021 include cloud, AI, data lakes, data warehousing, MDM, agile development, Kubernetes, automation, data quality, and DevOps/MLOps.
Measuring Data Quality Return on Investment – DATAVERSITY
Data Quality is an elusive subject that can defy measurement and yet be critical enough to derail any project, strategic initiative, or even a company. The data layer of an organization is a critical component because it is so easy to ignore the quality of that data or to make overly optimistic assumptions about its efficacy. Having Data Quality as a focus is a business philosophy that aligns strategy, business culture, company information, and technology in order to manage data to the benefit of the enterprise. It is a competitive strategy.
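One hedged way to make Data Quality measurable is to frame it as return on investment: the value of errors avoided against the cost of the quality program. The sketch below is illustrative only; the function name, the rates, and the dollar figures are hypothetical, not from the presentation.

```python
def data_quality_roi(records, error_rate_before, error_rate_after,
                     cost_per_error, program_cost):
    """Illustrative ROI: value of errors avoided versus program cost."""
    errors_avoided = records * (error_rate_before - error_rate_after)
    benefit = errors_avoided * cost_per_error
    return (benefit - program_cost) / program_cost

# Hypothetical figures: 1M records, error rate cut from 5% to 1%,
# $25 average downstream cost per error, $500k program cost.
roi = data_quality_roi(1_000_000, 0.05, 0.01, 25.0, 500_000.0)
print(f"{roi:.0%}")  # 100% return on the program spend
```

Even a rough model like this turns "elusive" quality into a number that can be debated and refined.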
ADV Slides: Organizational Change Management in Becoming an Analytic Organiza... – DATAVERSITY
The disparity between expecting change and managing it – the “change gap” – is growing at an unprecedented pace. This has put many information management shops into traction as they initiate large, complex projects needed to stay competitive.
Information management professionals and business leaders must concern themselves with the organization’s acceptance of these efforts. To be successful in achieving the larger enterprise goals, these initiatives must transform the organization. However, it takes more than wishful thinking to bridge the gap.
The complexities of engaging behavioral and enterprise transformation are too often underestimated at great peril, because the “soft stuff” is truly hard. In this webinar, William McKnight will outline:
• The change readiness activities that focus on identifying and addressing people risks
• The tasks that will mobilize and align leaders to create outstanding business value
• The strategies to manage stakeholders, ensure change readiness, and address the organizational implications
• The methodologies to train the workforce as required to fully embrace and utilize the system
ADV Slides: The Evolution of the Data Platform and What It Means to Enterpris... – DATAVERSITY
Thirty years is a long time for a technology foundation to remain as active as relational databases have. Are their replacements here?
In this webinar, we look at this foundational technology for modern Data Management and show how it evolved to meet the workloads of today, as well as when other platforms make sense for enterprise data.
Webinar: Decoding the Mystery - How to Know if You Need a Data Catalog, a Dat... – DATAVERSITY
There’s a lot of confusion out there about the differences between a data catalog, a data dictionary, and a business glossary, and it’s not always easy to understand who needs which and why. Join Malcolm Chisholm, Ph.D., President of Data Millennium, and Amichai Fenner, Product Lead at Octopai, as they help decode the mystery. Spoiler alert: one of these enables collaboration across BI and IT. Which is it?
Data-Ed Online Webinar: Business Value from MDM – DATAVERSITY
This presentation provides you with an understanding of the goals of reference and master data management (MDM), including establishing and implementing authoritative data sources, establishing and implementing more effective means of delivering data to various business processes, and increasing the quality of information used in organizational analytical functions (such as BI). You will understand the parallel importance of incorporating data quality engineering into the planning of reference and MDM.
Takeaways:
What is reference and MDM?
Why are reference and MDM important?
Reference and MDM Frameworks
Guiding principles & best practices
ADV Slides: Comparing the Enterprise Analytic Solutions – DATAVERSITY
Data is the foundation of any meaningful corporate initiative. Fully master the necessary data, and you’re more than halfway to success. That’s why leverageable (i.e., multiple use) artifacts of the enterprise data environment are so critical to enterprise success.
Build them once (and keep them updated), and use them again many times for many and diverse ends. The data warehouse remains focused strongly on this goal. And that may be why, nearly 40 years after the first database was labeled a “data warehouse,” analytic database products still target the data warehouse.
Slides: Beyond Metadata — Enrich Your Metadata Management with Deep-Level Dat... – DATAVERSITY
Today’s growing complexity of the data ecosystem requires organizations to understand data at the data element level. Challenges in data collection, such as open text boxes and free-form text fields, combined with the velocity of incoming data, increase risk for organizations. This risk is amplified when those organizations rely exclusively on metadata scanning when it comes to discovering and actioning their data. The need to look deeper than basic metadata becomes even more pronounced when dealing with semi-structured or unstructured data commonly found in file shares and email systems. Maintaining compliance and driving business value often requires scanning actual files, interpreting data, flagging risks, and integrating that risk into a data catalog. Going beyond metadata to the actual data element level ensures that your data catalog is a source of truth, which ultimately allows organizations to create agile Data Governance programs.
We’ll walk you through key considerations for going beyond knowing what metadata you have by:
• Underlining the importance of an enhanced, AI-driven data discovery tool to better understand your data and how it is being used
• Discussing components of an effective Metadata Management strategy including data inventories, data dictionaries, and usage requests
• Highlighting how the OneTrust platform embedded with regulatory intelligence helps you to go beyond metadata and address key use cases around unexpected or at-risk unstructured data
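The idea of scanning actual content rather than trusting filenames and metadata can be sketched in a few lines. This is not OneTrust’s implementation; the pattern names and regexes below are hypothetical stand-ins for the much richer detectors a real discovery tool would use.

```python
import re

# Hypothetical patterns; production discovery tools use far richer detectors.
RISK_PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

def scan_content(text):
    """Flag risk categories found in the actual data, not just its metadata."""
    return sorted(name for name, rx in RISK_PATTERNS.items() if rx.search(text))

# A file whose name and metadata look harmless can still carry risk:
print(scan_content("meeting notes: contact jane@example.com, SSN 123-45-6789"))
# ['email', 'us_ssn']
```

Flags like these are what get integrated into the data catalog so risk travels with the data element.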
Implementing the Data Maturity Model (DMM) – DATAVERSITY
The Data Management Maturity (DMM) model is a framework for the evaluation and assessment of an organization’s Data Management capabilities. This model—based on the Capability Maturity Model pioneered by the U.S. Department of Defense for improving software development processes—allows an organization to evaluate its current-state Data Management capabilities, discover gaps to remediate, and identify strengths to leverage. In doing so, this assessment method reveals organizational priorities, business needs, and a clear path for rapid process improvements.
In this webinar, we will:
Describe the DMM model, its purpose and evolution, and how it can be used as a roadmap for assessing and improving organizational Data Management and Data Management Maturity
Discuss how to get the most out of a DMM assessment, including its dependencies and requirements for use
Discuss foundational DMM concepts based on “The DAMA Guide to the Data Management Body of Knowledge” (DAMA DMBOK)
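The assessment pattern described above — score current-state capabilities, discover gaps to remediate, identify strengths to leverage — could be sketched as below. The process areas, the 1–5 scale, and the target level are hypothetical; the real DMM defines its own process areas and level criteria.

```python
# Hypothetical process areas scored on a 1-5 scale; the real DMM defines
# its own process areas and maturity-level criteria.
TARGET_LEVEL = 4

def assess(scores, target=TARGET_LEVEL):
    """Split capabilities into gaps to remediate and strengths to leverage."""
    gaps = {area: target - s for area, s in scores.items() if s < target}
    strengths = [area for area, s in scores.items() if s >= target]
    return gaps, strengths

current = {"governance": 2, "quality": 3, "architecture": 4, "operations": 5}
gaps, strengths = assess(current)
print(gaps)       # {'governance': 2, 'quality': 1}
print(strengths)  # ['architecture', 'operations']
```

The gap sizes give a natural prioritization for the "clear path for rapid process improvements" the model promises.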
DataEd Slides: Data Management + Data Strategy = Interoperability – DATAVERSITY
Few organizations operate without having to exchange data. (Many do it professionally and well!) The larger the data exchange burden (DEB), the greater the organizational overhead incurred. This death by 1,000 cuts must be factored into each organization’s calculations. Unfortunately, most organizations do not know if their organization’s DEB is great or small. A somewhat greater number of organizations have organized Data Management practices. Focusing Data Management efforts on increasing interoperability by decreasing the DEB friction is a good area to “practice.”
Learning Objectives:
• Gaining a good understanding of both important topics
• Understanding that data interoperates only through intricate, context-specific dependencies – and what this means
• Understand state-of-the-practice
• Coordination is key, requiring necessary but insufficient interdependencies and sequencing
• Practice makes perfect
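To make the data exchange burden (DEB) concrete, one could score each exchange interface by its sources of friction. The weighting below is entirely hypothetical, meant only to show how an organization might start to learn whether its DEB is great or small.

```python
# A hypothetical friction score: each interface contributes overhead from
# its manual steps and format conversions. The weights are illustrative.
def exchange_burden(interfaces):
    return sum(i["manual_steps"] * 2 + i["format_conversions"] for i in interfaces)

before = [
    {"name": "partner-feed", "manual_steps": 3, "format_conversions": 2},
    {"name": "finance-export", "manual_steps": 2, "format_conversions": 3},
]
# Standardizing on a shared format removes the conversions:
after = [dict(i, format_conversions=0) for i in before]

print(exchange_burden(before), "->", exchange_burden(after))  # 15 -> 10
```

Tracking such a score over time is one way to see whether Data Management effort is actually reducing DEB friction.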
Tackling data quality problems requires more than a series of tactical, one-off improvement projects. By their nature, many data quality problems extend across and often beyond an organization. Addressing these issues requires a holistic architectural approach combining people, process, and technology. Join Donna Burbank and Nigel Turner as they provide practical ways to control data quality issues in your organization.
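The technology leg of that people-process-technology triad often starts with simple declarative rules. The sketch below is a minimal illustration, not any vendor’s engine; the rule names and thresholds are made up.

```python
# A minimal sketch of rule-based quality checks; a holistic program adds
# the people and process dimensions around checks like these.
def check_record(record, rules):
    """Return the names of the rules this record fails."""
    return [name for name, rule in rules.items() if not rule(record)]

rules = {
    "customer_id present": lambda r: bool(r.get("customer_id")),
    "age in range": lambda r: r.get("age") is not None and 0 <= r["age"] <= 120,
}

print(check_record({"customer_id": "", "age": 150}, rules))
# ['customer_id present', 'age in range']
```

Rules like these only become an architectural approach when their results feed ownership, remediation, and monitoring processes.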
Analytic Platforms Should Be Columnar Orientation – DATAVERSITY
A columnar database is an implementation of the relational theory, but with a twist. The data storage layer does not contain records. It contains a grouping of columns.
Due to the variable column lengths within a row, a small column with low cardinality, or variability of values, may reside completely within one block while another column with high cardinality and longer length may take a thousand blocks. In columnar, all the same data — your data — is there. It’s just organized differently (automatically, by the DBMS).
The main reason to adopt a columnar approach is simply to speed up the native performance of analytic queries.
Learn about the columnar orientation and how it can be effective for your needs. Many databases use this orientation natively, and several others offer optional column-oriented storage layers.
There is also an equivalent in the cloud storage world: the open Parquet format.
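The layout difference can be illustrated in a few lines. This sketch shows only the organizing principle; a real columnar DBMS or Parquet file adds block storage, encoding, and compression on top.

```python
# Row store: one record per entry. Column store: one contiguous list per
# column -- the same data, organized differently.
rows = [
    {"region": "EU", "amount": 10.0},
    {"region": "US", "amount": 20.0},
    {"region": "EU", "amount": 5.0},
]
columns = {
    "region": [r["region"] for r in rows],
    "amount": [r["amount"] for r in rows],
}

# An analytic query over one column scans only that column's values,
# never touching the others:
total = sum(columns["amount"])
print(total)  # 35.0
```

Scanning one contiguous column instead of picking a field out of every record is exactly where the analytic speedup comes from.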
Slides: Migrate BI Dashboards to Run Directly on a Cloud Data Lake in Five Ea... – DATAVERSITY
While BI dashboards are great at democratizing analytics in organizations, the architecture that traditionally powers them has hidden consequences that have serious impacts on the business.
This architecture is based on a 30-year-old paradigm that requires many different systems, ETL jobs, and copies of data in data marts, data warehouses, and BI extracts. One of many downsides is that it takes days if not weeks to answer a new business question with this architecture. The negative consequences are further multiplied by the tens, hundreds, or even thousands of dashboards needed to run a data-driven organization.
Now there’s a straightforward way to overcome these challenges, one that many organizations are already taking advantage of: an open cloud data lake architecture powered by Dremio.
Join Jason Hughes, Technical Director at Dremio, for this webinar to learn how you can migrate BI dashboards to Dremio to quickly provide interactive dashboards to data consumers without the issues of the traditional architecture — and finally deliver the benefits always promised by BI.
What you’ll learn:
• Why BI dashboards’ traditional architecture, implemented at scale, causes many issues that hinder the very insights it promises.
• How a Dremio-powered cloud data lake architecture eliminates or mitigates the negative consequences of the traditional approach.
• Step-by-step instructions for migrating a BI dashboard to run directly on a cloud data lake, both a self-contained example and your own dashboards.
Emerging Trends in Data Architecture – What’s the Next Big Thing – DATAVERSITY
Digital Transformation is a top priority for many organizations, and a successful digital journey requires a strong data foundation. Creating this transformation requires a number of core Data Management capabilities, such as MDM. With technological innovation and change occurring at an ever-increasing rate, it’s hard to keep track of what’s hype and what can provide practical value for your organization. Join this webinar to see the results of a recent DATAVERSITY survey on emerging trends in Data Architecture, along with practical commentary and advice from industry expert Donna Burbank.
Too often I hear the question “Can you help me with our Data Strategy?” Unfortunately, for most, this is the wrong request because it focuses on the least valuable component – the Data Strategy itself. A more useful request is this: “Can you help me apply data strategically?” Yes, at early maturity phases the process of developing strategic thinking about data is more important than the actual product! Trying to write a good (much less perfect) Data Strategy on the first attempt is generally not productive – particularly given the widespread acceptance of Mike Tyson’s truism: “Everybody has a plan until they get punched in the face.” Refocus on learning how to iteratively improve the way data is strategically applied. This will permit data-based strategy components to keep up with agile, evolving organizational strategies. This approach can also contribute to three primary organizational data goals.
In this webinar, you will learn how improving your organization’s data, the way your people use data, and the way your people use data to achieve your organizational strategy will help in ways never imagined. Data are your sole non-depletable, non-degradable, durable strategic assets, and they are pervasively shared across every organizational area. Addressing existing challenges programmatically includes overcoming necessary but insufficient prerequisites and developing a disciplined, repeatable means of improving business objectives. This process (based on the theory of constraints) is where the strategic data work really occurs, as organizations identify prioritized areas where better assets, literacy, and support (Data Strategy components) can help an organization better achieve specific strategic objectives. Then the process becomes lather, rinse, and repeat. Several complementary concepts are also covered, including:
- A cohesive argument for why Data Strategy is necessary for effective Data Governance
- An overview of prerequisites for effective strategic use of Data Strategy, as well as common pitfalls
- A repeatable process for identifying and removing data constraints
- The importance of balancing business operation and innovation
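The constraints-based process described above — identify the prioritized constraint, address it, then "lather, rinse, and repeat" — could be sketched as a loop. Everything below is illustrative: the constraint names, impact scores, and remediation effect are hypothetical, not from the webinar.

```python
# Illustrative only: repeatedly remediate the largest remaining data
# constraint ("lather, rinse, repeat") until all fall below a threshold.
def remove_constraints(impact, threshold, remediation_effect=0.5):
    addressed = []
    while impact and max(impact.values()) > threshold:
        worst = max(impact, key=impact.get)
        impact[worst] *= remediation_effect  # partial remediation
        addressed.append(worst)
    return addressed

constraints = {"data literacy": 8.0, "asset access": 6.0, "tooling": 3.0}
print(remove_constraints(constraints, threshold=4.0))
# ['data literacy', 'asset access']
```

The point of the loop is that the priority order emerges from measured impact on strategic objectives, not from a fixed upfront plan.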
Using Data Platforms That Are Fit-For-Purpose – DATAVERSITY
We must grow the data capabilities of our organization to fully deal with the many and varied forms of data. This cannot be accomplished without an intense focus on the many and growing technical bases that can be used to store, view, and manage data. There are more such platforms than ever, and many have merit in organizations today.
This session sorts out the valuable data stores, how they work, what workloads they are good for, and how to build the data foundation for a modern competitive enterprise.
DAS Slides: Building a Data Strategy — Practical Steps for Aligning with Busi... – DATAVERSITY
Developing a Data Strategy for your organization can seem like a daunting task. The opportunity in getting it right can be significant, however, as data drives many of the key initiatives in today’s marketplace from digital transformation, to marketing, to customer centricity, population health, and more. This webinar will help de-mystify data strategy and data architecture and will provide concrete, practical ways to get started.
AI can give your organization the competitive advantage it needs, but the alarming truth is that only 1 in 10 data science projects ever make it into production. To be successful, organizations must not only correctly design and implement data science, but also raise the data, numerical, and technology literacy across the business.
Attend this webinar to learn what common pitfalls you need to avoid to keep your data science projects from failing. Then Data Scientist Gaby Lio will engage with the audience about project dos and don’ts and leave you with a checklist to ensure your project’s success.
The first step towards understanding data assets’ impact on your organization is understanding what those assets mean for each other. Metadata — literally, data about data — is a practice area required by good systems development, and yet is also perhaps the most mislabeled and misunderstood Data Management practice. Understanding metadata and its associated technologies as more than just straightforward technological tools can provide powerful insight into the efficiency of organizational practices, and enable you to combine practices into sophisticated techniques, supporting larger and more complex business initiatives. Program learning objectives include:
* Understanding how to leverage metadata practices in support of business strategy
* Discuss foundational metadata concepts
* Guiding principles for, and lessons previously learned from, metadata and its practical uses applied to strategy
* Metadata strategies, including:
* Metadata is a gerund, so don’t try to treat it as a noun
* Metadata is the language of Data Governance
* Treat glossaries/repositories as capabilities, not technology
Data Architecture, Solution Architecture, Platform Architecture — What’s the ... – DATAVERSITY
A solid data architecture is critical to the success of any data initiative. But what is meant by “data architecture”? Throughout the industry, there are many different “flavors” of data architecture, each with its own unique value and use cases for describing key aspects of the data landscape. Join this webinar to demystify the various architecture styles and understand how they can add value to your organization.
Because every organization produces and propagates data as part of their day-to-day operations, data trends are becoming more and more important in the mainstream business world’s consciousness. For many organizations in various industries, though, comprehension of this development begins and ends with buzzwords: “big data,” “NoSQL,” “data scientist,” and so on. Few realize that any and all solutions to their business problems, regardless of platform or relevant technology, rely to a critical extent on the data model supporting them. As such, Data Modeling is not an optional task for an organization’s data effort, but rather a vital activity that facilitates the solutions driving your business. Since quality engineering/architecture work products do not happen accidentally, the more your organization depends on automation, the more important the data models driving the engineering and architecture activities of your organization become. This webinar illustrates Data Modeling as a key activity upon which so much technology depends.
RWDG Slides: Governing Your Data Catalog, Business Glossary, and Data Dictionary – DATAVERSITY
Data catalogs, business glossaries, and data dictionaries house the metadata that builds organizational confidence in your data. First and foremost, the people in your organization need to be engaged in leveraging the tools, understanding the data that is available and who is responsible for the data, and knowing how to get their hands on the data they need to perform their job function. This metadata will not govern itself.
Join Bob Seiner for the April RWDG webinar, where he will discuss how to govern the metadata in a data catalog, business glossary, and data dictionary. People must have confidence in the metadata associated with the data that you need them to trust. Therefore, the metadata in your data catalog, business glossary, and data dictionary must be governed. Learn how to govern that metadata in this webinar.
Bob will discuss the following subjects in this webinar:
• Successful Data Governance relies on value from very important tools
• What it means to govern your data catalog, business glossary, and data dictionary
• Why governing the metadata in these tools is so important
• The roles necessary to govern these tools
• Value expected from governing the catalog, glossary, and dictionary
You had a strategy. You were executing it. You were then side-swiped by COVID, spending countless cycles blocking and tackling. It is now time to step back onto your path.
CCG is holding a workshop to help you update your roadmap, get your team back on track, and review how Microsoft Azure Solutions can be leveraged to build a strong foundation for governed data insights.
Do-It-Yourself (DIY) Data Governance Framework – DATAVERSITY
A worthwhile Data Governance framework includes the core components of a successful program as viewed by the different levels of the organization. Each component is addressed at each level, providing insight into key ideas and terminology used to attract participation across the organization. A framework plays a key role in setting up and sustaining a Data Governance program.
In this RWDG webinar, Bob Seiner will share two frameworks. The first is a basic cross-reference of components and levels, while the second can be used to compare and contrast different approaches to implementing Data Governance. When this webinar is finished, you will be able to customize the frameworks to outline the most appropriate manner for you to improve your likelihood of DG success.
In this webinar, Bob will discuss and share:
- Customizing a framework to match organizational requirements
- The core components and levels of an industry framework
- How to complete a Data Governance framework
- Using the framework to enable DG program success
- Measuring value through the DIY DG framework
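The "cross-reference of components and levels" framework could be represented as a simple matrix. The component and level names below are hypothetical placeholders, not Bob Seiner's actual framework; the point is only that unfilled cells make remaining work visible.

```python
# A hypothetical cross-reference of framework components and organizational
# levels; filling each cell customizes the framework to your organization.
components = ["roles", "policies", "metrics"]
levels = ["executive", "management", "operational"]

framework = {(c, lvl): "" for c in components for lvl in levels}
framework[("roles", "executive")] = "Data Governance sponsor"
framework[("metrics", "operational")] = "data quality scorecards"

# Unfilled cells show where the program still needs definition:
todo = [cell for cell, entry in framework.items() if not entry]
print(len(todo), "cells still to define")  # 7 cells still to define
```

Treating the framework as a fillable grid is what makes the DIY customization step concrete.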
Data-Ed: A Framework for NoSQL and Hadoop – Data Blueprint
Big Data and NoSQL continue to make headlines everywhere. However, most of what has been written about these topics focuses on the hardware, services, and scale-out. But what about a Big Data and NoSQL strategy – one that supports your business strategy? Virtually every major organization thinking about these data platforms faces the challenge of figuring out the appropriate approach and requirements. This presentation will provide guidance on how to think about and establish realistic Big Data management plans and expectations. We will introduce a framework for evaluating the various choices when it comes to implementing and succeeding with Big Data/NoSQL, and demonstrate a sample use case.
Data Centric Development: Supercharge your web & mobile application development – Bright North
Many businesses are finding that their web and mobile applications aren’t providing the long-term solution they were hoping for. As consumers provide more and more useful data, these digital platforms don’t allow businesses to take advantage of the huge opportunities that data presents.
Our new whitepaper details the practical steps you can take to supercharge your web and mobile application development and stay ahead of the data revolution.
Accelerate Self-Service Analytics with Data Virtualization and Visualization – Denodo
Watch full webinar here: https://bit.ly/3fpitC3
Enterprise organizations are shifting to self-service analytics, as business users need real-time access to holistic, consistent views of data – regardless of its location, source, or type – to arrive at critical decisions.
Data Virtualization and Data Visualization work together through a universal semantic layer. Learn how they enable self-service data discovery and improve performance of your reports and dashboards.
In this session, you will learn:
- Challenges faced by business users
- How data virtualization enables self-service analytics
- Use case and lessons from customer success
- Overview of the highlight features in Tableau
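The core idea of data virtualization — one logical dataset over rows that stay in their source systems — can be sketched with a lazy union. This is a conceptual illustration only, not Denodo's architecture; the source names and the generator-based design are assumptions for the example.

```python
# A minimal sketch of a virtual (non-materialized) view: the semantic layer
# exposes one logical dataset while the rows stay in their source systems.
def virtual_view(*sources):
    for fetch in sources:
        yield from fetch()  # pulled on demand at query time, never copied

crm = lambda: [{"customer": "acme", "origin": "crm"}]
erp = lambda: [{"customer": "acme", "origin": "erp"}]

rows = list(virtual_view(crm, erp))
print(len(rows))  # 2 rows, one from each source
```

Because nothing is materialized, a dashboard querying the view always sees the sources' current state, which is what enables the real-time, holistic access described above.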
Architecting a Modern Data Warehouse: Enterprise Must-Haves – Yellowbrick Data
The goal of modern data warehousing is not only to deliver insights faster to more users, but also to provide a richer picture of your operations, afforded by a greater volume and variety of data for analysis.
This presentation from a Database Trends and Applications webcast will educate IT decision makers and data warehousing professionals about the must-have capabilities for modern data warehousing today – how they work and how best to use them.
Implementing the Data Maturity Model (DMM)DATAVERSITY
The Data Management Maturity (DMM) model is a framework for the evaluation and assessment of an organization’s Data Management capabilities. This model—based on the Capability Maturity Model pioneered by the U.S. Department of Defense for improving software development processes—allows an organization to evaluate its current-state Data Management capabilities, discover gaps to remediate, and identify strengths to leverage. In doing so, this assessment method reveals organizational priorities, business needs, and a clear path for rapid process improvements.
In this webinar, we will:
Describe the DMM model, its purpose and evolution, and how it can be used as a roadmap for assessing and improving organizational Data Management and Data Management Maturity
Discuss how to get the most out of a DMM assessment, including its dependencies and requirements for use
Discuss foundational DMM concepts based on “The DAMA Guide to the Data Management Body of Knowledge” (DAMA DMBOK)
DataEd Slides: Data Management + Data Strategy = InteroperabilityDATAVERSITY
Few organizations operate without having to exchange data. (Many do it professionally and well!) The larger the data exchange burden (DEB), the greater the organizational overhead incurred. This death by 1,000 cuts must be factored into each organization’s calculations. Unfortunately, most organizations do not know if their organization’s DEB is great or small. A somewhat greater number of organizations have organized Data Management practices. Focusing Data Management efforts on increasing interoperability by decreasing the DEB friction is a good area to “practice.”
Learning Objectives:
• Gaining a good understanding of both important topics
• Understanding that data interoperates only through very intricate, specifically dependent intent, and what this means
• Understanding the state of the practice
• Recognizing that coordination is key, requiring necessary but insufficient interdependencies and sequencing
• Practice makes perfect
Tackling data quality problems requires more than a series of tactical, one off improvement projects. By their nature, many data quality problems extend across and often beyond an organization. Addressing these issues requires a holistic architectural approach combining people, process and technology. Join Donna Burbank and Nigel Turner as they provide practical ways to control data quality issues in your organization.
Analytic Platforms Should Be Columnar OrientationDATAVERSITY
A columnar database is an implementation of the relational theory, but with a twist. The data storage layer does not contain records. It contains a grouping of columns.
Due to the variable column lengths within a row, a small column with low cardinality, or variability of values, may reside completely within one block while another column with high cardinality and longer length may take a thousand blocks. In columnar, all the same data — your data — is there. It’s just organized differently (automatically, by the DBMS).
The main reason why you would want to utilize a columnar approach is simply to speed up the native performance of analytic queries.
Learn about the columnar orientation and how it can be effective for your needs. This is the native orientation of many databases and several others that have optional column-oriented storage layers.
There is also an equivalent in the cloud storage world: the open Parquet format.
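The row-versus-column layout described above can be sketched in a few lines of plain Python. This is an illustrative toy, not any vendor's engine, and the table and values are made up; it shows why a query like SUM(amount) scans far less data in columnar form, and why low cardinality enables cheap dictionary encoding.

```python
rows = [
    {"id": 1, "region": "east", "amount": 100.0},
    {"id": 2, "region": "west", "amount": 250.0},
    {"id": 3, "region": "east", "amount": 75.0},
]

# Row store: whole records stored together (typical OLTP layout).
row_store = rows

# Column store: the same data, regrouped by column (a columnar DBMS does
# this automatically; all your data is still there, just organized differently).
column_store = {col: [r[col] for r in rows] for col in rows[0]}

# An analytic query such as SUM(amount) touches only the "amount" column,
# instead of reading every full record.
total = sum(column_store["amount"])

# Low cardinality ("region" has few distinct values) allows dictionary
# encoding: the column compresses into small integer codes.
dictionary = sorted(set(column_store["region"]))
encoded = [dictionary.index(v) for v in column_store["region"]]

print(total)       # 425.0
print(dictionary)  # ['east', 'west']
print(encoded)     # [0, 1, 0]
```

The same column-grouping and dictionary-encoding ideas underlie the on-disk layout of Parquet files.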
Slides: Migrate BI Dashboards to Run Directly on a Cloud Data Lake in Five Ea...DATAVERSITY
While BI dashboards are great at democratizing analytics in organizations, the architecture that traditionally powers them has hidden consequences that have serious impacts on the business.
This architecture is based on a 30-year-old paradigm that requires many different systems, ETL jobs, and copies of data in data marts, data warehouses, and BI extracts. One downside of many is that it takes many days if not weeks to answer a different business question with this architecture. The negative consequences are further multiplied by the tens, hundreds, or even thousands of dashboards needed to run a data-driven organization.
Now there’s a straightforward way to overcome these challenges, one that many organizations are already taking advantage of: an open cloud data lake architecture and Dremio.
Join Jason Hughes, Technical Director at Dremio, for this webinar to learn how you can migrate BI dashboards to Dremio to quickly provide interactive dashboards to data consumers without the issues of the traditional architecture — and finally deliver the benefits always promised by BI.
What you’ll learn:
• Why BI dashboards’ traditional architecture implemented at scale causes many issues, which hinder the very insights it promises.
• How a Dremio-powered cloud data lake architecture eliminates or mitigates the negative consequences of the traditional approach.
• Step-by-step instructions for migrating a BI dashboard to run directly on a cloud data lake, both a self-contained example and your own dashboards.
Emerging Trends in Data Architecture – What’s the Next Big ThingDATAVERSITY
Digital Transformation is a top priority for many organizations, and a successful digital journey requires a strong data foundation. Achieving this transformation requires a number of core Data Management capabilities, such as MDM. With technological innovation and change occurring at an ever-increasing rate, it’s hard to keep track of what’s hype and what can provide practical value for your organization. Join this webinar to see the results of a recent DATAVERSITY survey on emerging trends in Data Architecture, along with practical commentary and advice from industry expert Donna Burbank.
Too often I hear the question “Can you help me with our Data Strategy?” Unfortunately, for most, this is the wrong request because it focuses on the least valuable component – the Data Strategy itself. A more useful request is this: “Can you help me apply data strategically?” Yes, at early maturity phases the process of developing strategic thinking about data is more important than the actual product! Trying to write a good (much less perfect) Data Strategy on the first attempt is generally not productive – particularly given the widespread acceptance of Mike Tyson’s truism: “Everybody has a plan until they get punched in the face.” Refocus on learning how to iteratively improve the way data is strategically applied. This will permit data-based strategy components to keep up with agile, evolving organizational strategies. This approach can also contribute to three primary organizational data goals.
In this webinar, you will learn how improving your organization’s data, the way your people use data, and the way your people use data to achieve your organizational strategy will help in ways never imagined. Data are your sole non-depletable, non-degradable, durable strategic assets, and they are pervasively shared across every organizational area. Addressing existing challenges programmatically includes overcoming necessary but insufficient prerequisites and developing a disciplined, repeatable means of improving business objectives. This process (based on the theory of constraints) is where the strategic data work really occurs, as organizations identify prioritized areas where better assets, literacy, and support (Data Strategy components) can help an organization better achieve specific strategic objectives. Then the process becomes lather, rinse, and repeat. Several complementary concepts are also covered, including:
- A cohesive argument for why Data Strategy is necessary for effective Data Governance
- An overview of prerequisites for effective strategic use of Data Strategy, as well as common pitfalls
- A repeatable process for identifying and removing data constraints
- The importance of balancing business operation and innovation
Using Data Platforms That Are Fit-For-PurposeDATAVERSITY
We must grow the data capabilities of our organization to fully deal with the many and varied forms of data. This cannot be accomplished without an intense focus on the many and growing technical bases that can be used to store, view, and manage data. There are many, now more than ever, that have merit in organizations today.
This session sorts out the valuable data stores, how they work, what workloads they are good for, and how to build the data foundation for a modern competitive enterprise.
Advanced Analytics: Analytic Platforms Should Be Columnar OrientationDATAVERSITY
DAS Slides: Building a Data Strategy — Practical Steps for Aligning with Busi...DATAVERSITY
Developing a Data Strategy for your organization can seem like a daunting task. The opportunity in getting it right can be significant, however, as data drives many of the key initiatives in today’s marketplace from digital transformation, to marketing, to customer centricity, population health, and more. This webinar will help de-mystify data strategy and data architecture and will provide concrete, practical ways to get started.
AI can give your organization the competitive advantage it needs, but the alarming truth is that only 1 in 10 data science projects ever make it into production. To be successful, organizations must not only correctly design and implement data science, but also raise the data, numerical, and technology literacy across the business.
Attend this webinar to learn what common pitfalls you need to avoid to keep your data science projects from failing. Then Data Scientist Gaby Lio will engage with the audience about project dos and don’ts and leave you with a checklist to ensure your project’s success.
The first step towards understanding data assets’ impact on your organization is understanding what those assets mean for each other. Metadata — literally, data about data — is a practice area required by good systems development, and yet is also perhaps the most mislabeled and misunderstood Data Management practice. Understanding metadata and its associated technologies as more than just straightforward technological tools can provide powerful insight into the efficiency of organizational practices, and enable you to combine practices into sophisticated techniques, supporting larger and more complex business initiatives. Program learning objectives include:
* Understanding how to leverage metadata practices in support of business strategy
* Discuss foundational metadata concepts
* Guiding principles for, and lessons previously learned from, metadata and its practical uses as applied to strategy
* Metadata strategies, including:
* Metadata is a gerund so don’t try to treat it as a noun
* Metadata is the language of Data Governance
* Treat glossaries/repositories as capabilities, not technology
Data Architecture, Solution Architecture, Platform Architecture — What’s the ...DATAVERSITY
A solid data architecture is critical to the success of any data initiative. But what is meant by “data architecture”? Throughout the industry, there are many different “flavors” of data architecture, each with its own unique value and use cases for describing key aspects of the data landscape. Join this webinar to demystify the various architecture styles and understand how they can add value to your organization.
Because every organization produces and propagates data as part of their day-to-day operations, data trends are becoming more and more important in the mainstream business world’s consciousness. For many organizations in various industries, though, comprehension of this development begins and ends with buzzwords: “big data,” “NoSQL,” “data scientist,” and so on. Few realize that any and all solutions to their business problems, regardless of platform or relevant technology, rely to a critical extent on the data model supporting them. As such, Data Modeling is not an optional task for an organization’s data effort, but rather a vital activity that facilitates the solutions driving your business. Since quality engineering/architecture work products do not happen accidentally, the more your organization depends on automation, the more important the data models driving the engineering and architecture activities of your organization become. This webinar illustrates Data Modeling as a key activity upon which so much technology depends.
RWDG Slides: Governing Your Data Catalog, Business Glossary, and Data DictionaryDATAVERSITY
Data catalogs, business glossaries, and data dictionaries house the metadata that builds organizational confidence in your data. First and foremost, the people in your organization need to be engaged in leveraging the tools, understanding the data that is available and who is responsible for the data, and knowing how to get their hands on the data they need to perform their job function. This metadata will not govern itself.
Join Bob Seiner for the April RWDG webinar, where he will discuss how to govern the metadata in a data catalog, business glossary, and data dictionary. People must have confidence in the metadata associated with the data that you need them to trust. Therefore, the metadata in your data catalog, business glossary, and data dictionary must be governed. Learn how to govern that metadata in this webinar.
Bob will discuss the following subjects in this webinar:
• Successful Data Governance relies on value from very important tools
• What it means to govern your data catalog, business glossary, and data dictionary
• Why governing the metadata in these tools is so important
• The roles necessary to govern these tools
• Value expected from governing the catalog, glossary, and dictionary
You had a strategy. You were executing it. You were then side-swiped by COVID, spending countless cycles blocking and tackling. It is now time to step back onto your path.
CCG is holding a workshop to help you update your roadmap, get your team back on track, and review how Microsoft Azure solutions can be leveraged to build a strong foundation for governed data insights.
Do-It-Yourself (DIY) Data Governance FrameworkDATAVERSITY
A worthwhile Data Governance framework includes the core component of a successful program as viewed by the different levels of the organization. Each of the components is addressed at each of the levels, providing insight into key ideas and terminology used to attract participation across the organization. A framework plays a key role in setting up and sustaining a Data Governance program.
In this RWDG webinar, Bob Seiner will share two frameworks. The first is a basic cross-reference of components and levels, while the second can be used to compare and contrast different approaches to implementing Data Governance. When this webinar is finished, you will be able to customize the frameworks to outline the most appropriate manner for you to improve your likelihood of DG success.
In this webinar, Bob will discuss and share:
- Customizing a framework to match organizational requirements
- The core components and levels of an industry framework
- How to complete a Data Governance framework
- Using the framework to enable DG program success
- Measuring value through the DIY DG framework
Data-Ed: A Framework for NoSQL and HadoopData Blueprint
Big Data and NoSQL continue to make headlines everywhere. However, most of what has been written about these topics is focused on the hardware, services, and scale out. But what about a Big Data and NoSQL Strategy, one that supports your business strategy? Virtually every major organization thinking about these data platforms is faced with the challenge of figuring out the appropriate approach and the requirements. This presentation will provide guidance on how to think about and establish realistic Big Data management plans and expectations. We will introduce a framework for evaluating the various choices when it comes to implementing and succeeding with Big Data/NoSQL and show how to demonstrate a sample use case.
Data Centric Development: Supercharge your web & mobile application developmentBright North
Many businesses are finding that their web and mobile applications aren’t providing the long-term solution they were hoping for. As consumers provide more and more useful data, these digital platforms don’t allow businesses to take advantage of the huge opportunities that data presents.
Our new whitepaper details the practical steps you can take to supercharge your web and mobile application development and stay ahead of the data revolution.
Accelerate Self-Service Analytics with Data Virtualization and VisualizationDenodo
Watch full webinar here: https://bit.ly/3fpitC3
Enterprise organizations are shifting to self-service analytics as business users need real-time access to holistic and consistent views of data regardless of its location, source or type for arriving at critical decisions.
Data Virtualization and Data Visualization work together through a universal semantic layer. Learn how they enable self-service data discovery and improve performance of your reports and dashboards.
In this session, you will learn:
- Challenges faced by business users
- How data virtualization enables self-service analytics
- Use case and lessons from customer success
- Overview of the highlight features in Tableau
Architecting a Modern Data Warehouse: Enterprise Must-HavesYellowbrick Data
Watch here: https://bit.ly/3i2iJbu
You will often hear that "data is the new gold". In this context, data management is one of the areas that has received the most attention from the software community in recent years. From Artificial Intelligence and Machine Learning to new ways to store and process data, the landscape for data management is in constant evolution. From the privileged perspective of an enterprise middleware platform, we at Denodo have the advantage of seeing many of these changes happen.
Join us for an exciting session that will cover:
- The most interesting trends in data management.
- Our predictions on how those trends will change the data management world.
- How these trends are shaping the future of data virtualization and our own software.
Enable Better Decision Making with Power BI Visualizations & Modern Data EstateCCG
Self-service BI empowers users to reach analytic outputs through data visualizations and reporting tools. Solution Architect and Cloud Solution Specialist, James McAuliffe, will be taking you through a journey of Azure's Modern Data Estate.
Power to the People: A Stack to Empower Every User to Make Data-Driven DecisionsLooker
Infectious Media runs on data. But, as an ad-tech company that records hundreds of thousands of web events per second, they have to deal with data at a scale not seen by most companies. You cannot make decisions with data when people need to write manual SQL, only for queries to take 10-20 minutes to return. Infectious Media made the switch to Google BigQuery and Looker, and now every member of every team can get the data they need in seconds.
Infectious Media shares:
- Why they chose their current stack
- Why faster data means happier customers
- Advantages and practical implications of storing and processing that much data
Check out the recording at https://info.looker.com/h/i/308848878-power-to-the-people-a-stack-to-empower-every-user-to-make-data-driven-decisions
When and How Data Lakes Fit into a Modern Data ArchitectureDATAVERSITY
Whether to take data ingestion cycles off the ETL tool and the data warehouse or to facilitate competitive Data Science and building algorithms in the organization, the data lake – a place for unmodeled and vast data – will be provisioned widely in 2020.
Though it doesn’t have to be complicated, the data lake has a few key design points that are critical, and it does need to follow some principles for success. Avoid building the data swamp, but not the data lake! The tool ecosystem is building up around the data lake and soon many will have a robust lake and data warehouse. We will discuss policy to keep them straight, send data to its best platform, and keep users’ confidence up in their data platforms.
Data lakes will be built in cloud object storage. We’ll discuss the options there as well.
Get this data point for your data lake journey.
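The "send data to its best platform" policy discussed above can be sketched as a simple routing rule. The function and field names below are hypothetical, purely illustrative assumptions, not from any product: the sketch assumes raw, unmodeled data belongs in the lake and curated, quality-checked data belongs in the warehouse.

```python
def best_platform(dataset: dict) -> str:
    """Hypothetical routing policy: unmodeled or exploratory data lands in
    the data lake; curated, quality-checked, schema-conformed data belongs
    in the data warehouse."""
    if dataset.get("modeled") and dataset.get("quality_checked"):
        return "data_warehouse"
    return "data_lake"

print(best_platform({"name": "clickstream_raw", "modeled": False}))
# data_lake
print(best_platform({"name": "sales_fact", "modeled": True, "quality_checked": True}))
# data_warehouse
```

A written policy of this kind, however simple, is what keeps the lake from becoming a swamp and keeps users' confidence up in both platforms.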
Every second of every day, electronic systems create ever-increasing quantities of data. Systems in markets such as finance, media, healthcare, government, and scientific research feature strongly in the Big Data processing conversation, and extracting business value from Big Data is forecast to bring customer and competitive advantages. In this session, hear Vas Kapsalis, NetApp Big Data Business Development Manager, discuss his views and experience on the wider world of Big Data.
Become More Data-driven by Leveraging Your SAP DataDenodo
Watch full webinar here: https://bit.ly/3K2SaCQ
In today’s world, management of data can be a major challenge. For many systems, including SAP, accessing data in real time and integrating it with other disparate sources has historically been difficult to accomplish. With the traditional data warehouse approach, it can also be quite expensive to keep data fresh and to control access to meet new and future data protection requirements. Denodo and Gateway Architects’ Meister Core™ offers a high-performance data virtualization solution designed to fulfill those needs.
Join Denodo, Gateway Architects and W5 Consulting to learn about the value of a logical Data Fabric and delivery platform and its role in this new solution. The webinar will overview the solution including how it provides support for SAP Migrations and sharing of SAP data across geographic boundaries. In addition, you will see how this solution provides the added value of improved agility for supply chain management, and much more. We will also share a demonstration to showcase the benefits of this solution.
Do not miss this opportunity to learn all this as well as how the Joint Denodo/Meister Core solution can:
- Create an agile, real-time, robust data virtualization solution.
- Work with combinations of SAP and Non-SAP data in “Actual” real time scenarios.
- And deliver a true 360-degree view of analytics from multiple systems, seamlessly tying that to all your SAP FICO documents 10X faster than previously possible.
Is your big data journey stalling? Take the Leap with Capgemini and ClouderaCloudera, Inc.
Transitioning to a Big Data architecture is a big step, and the complexity of moving existing analytical services onto modern platforms like Cloudera can seem overwhelming.
On May 19, 2020, analyst firm IDC; vendors Intel, MemVerge, NetApp, and Penguin Computing; and end-user Credit Suisse introduced a new category called Big Memory. Big Memory hardware and software together transform scarcity and volatility into abundance, persistence, and high availability.
Building Resiliency and Agility with Data Virtualization for the New NormalDenodo
Watch: https://bit.ly/327z8UM
While the impact of COVID-19 is uniform across organisations in the region, how well an organisation can recover from the impact and thrive in the market depends on its resiliency and business agility. An organisation’s data management strategy holds the key as it tackles the challenges of siloed data sources, optimising for operational stability, and ensuring real-time delivery of consistent and reliable information, irrespective of the data source or format.
Join this session to hear why large organisations are implementing Data Virtualization, a modern data integration approach in their data architecture to build resiliency, enhance business agility, and save costs.
In this session, you will learn:
- How to deliver clear strategy for agile data delivery across the enterprise without pains of traditional data integration
- How to provide a robust yet simple architecture for data governance, master data, data trust, data privacy, and data access security implementation - all from a single unified framework
- How to deploy digital transformation initiatives for Agile BI, Big Data, Enterprise Data Services & Data Governance
Take Action: The New Reality of Data-Driven BusinessInside Analysis
The Briefing Room with Dr. Robin Bloor and WebAction
Live Webcast on July 23, 2014
Watch the archive:
https://bloorgroup.webex.com/bloorgroup/lsr.php?RCID=360d371d3a49ad256942f55350aa0a8b
The waiting used to be the hardest part, but not anymore. Today’s cutting-edge enterprises can seize opportunities faster than ever, thanks to an array of technologies that enable real-time responsiveness across the spectrum of business processes. Early adopters are solving critical business challenges by enabling the rapid-fire design, development and production of very specific applications. Functionality can range from improved customer engagement to dynamic machine-to-machine interactions.
Register for this episode of The Briefing Room to learn from veteran Analyst Dr. Robin Bloor, who will tout a new era in data-driven organizations, and why a data flow architecture will soon be critical for industry leaders. He’ll be briefed by Sami Akbay of WebAction, who will showcase his company’s real-time data management platform, which combines all the component parts needed to access, process and leverage data big and small. He’ll explain how this new approach can provide game-changing power to organizations of all types and sizes.
Visit InsideAnalysis.com for more information.
Fueling AI & Machine Learning: Legacy Data as a Competitive AdvantagePrecisely
The data fueling your AI or machine learning initiatives plays a critical role. Different data sources provide different outcomes. The most important thing a business can do to prepare for success with AI and machine learning is to understand and provide access to all of the data that you can possibly get to. In addition to newer data sources, like IoT and Social Media, what will set your results apart – and give your business a competitive advantage – is powering AI and machine learning with your historical and proprietary data: the data sitting in your mainframe, legacy, and other traditional systems.
View this on-demand webcast with Wikibon Analyst James Kobielus as we discuss:
• Using your historical customer data to train predictive AI/ML models for effective target marketing
• Leveraging social, mobile, and IoT data to give your marketing an extra level of personalization
• Making the most of your legacy and proprietary data while protecting customer privacy and ensuring regulatory compliance
Connecta Event: Big Query and Data Analysis with Google Cloud PlatformConnectaDigital
Advanced data analysis and “big data” have climbed the trend lists in recent years and are now among the most prioritized areas in the development of new services and products for leading companies in the digital landscape.
The information that accumulates in these systems as customer interactions are digitized has proven to be worth its weight in gold. It contains everything we need to know to make our business more effective.
Since the summer of 2013, Connecta has had an established partnership with Google to help our customers transition to cloud services for, among other things, advanced data analysis. To make ourselves ready to help our customers, we have spent a number of years building both knowledge and experience with Google’s various cloud products, such as “Big Query.”
Big Query is a cloud-based analytics tool and part of Google Cloud Platform. It makes it possible to run fast queries against enormous datasets in just seconds. Big Query and Google Cloud Platform offer ready-made solutions for setting up and maintaining an infrastructure that, with simple means, makes all of this possible.
At Connecta Digital Consulting’s third event of the spring, we introduced our customers and partners to the concepts of data analysis and Big Query.
The event covered the following points:
- Big Data and Business Intelligence (BI)
- “The Google Big Data tools” – success factors and how to get started
- Google Cloud Platform and how to carry out a successful cloud initiative
We presented cases and shared important lessons learned from our collaboration with Google and our customers.
Hadoop and the Relational Database: The Best of Both WorldsInside Analysis
The Briefing Room with Dr. Robin Bloor and Splice Machine
Live Webcast on August 5, 2014
Watch the archive:
https://bloorgroup.webex.com/bloorgroup/lsr.php?RCID=71551d669454741c8bd56f2349bdf140
As the pressure of Big Data collides with the reality of daily operations, many organizations are trying to solve the challenge of meeting new requirements without disrupting the flow of business. One solution focuses on the data layer itself, by combining the well known functionality of relational database technology with the scale-out capabilities of Hadoop.
Register for this episode of The Briefing Room to hear from veteran Analyst Dr. Robin Bloor as he outlines the critical components of a business-ready data layer. He’ll be briefed by John Leach and Rich Reimer of Splice Machine who will explain how their solution delivers the best of both data worlds: the trusted capabilities of relational with the infinite scalability of Hadoop. They will also discuss how Hadoop has transformed from a batch-oriented workhorse into a scale-out layer capable of supporting real-time applications and operational analytics using traditional SQL.
Visit InsideAnalysis.com for more information.
Architecture, Products, and Total Cost of Ownership of the Leading Machine Le...DATAVERSITY
Organizations today need a broad set of enterprise data cloud services with key data functionality to modernize applications and utilize machine learning. They need a comprehensive platform designed to address multi-faceted needs by offering multi-function data management and analytics to solve the enterprise’s most pressing data and analytic challenges in a streamlined fashion.
In this research-based session, I’ll discuss what the components are in multiple modern enterprise analytics stacks (i.e., dedicated compute, storage, data integration, streaming, etc.) and focus on total cost of ownership.
A complete machine learning infrastructure cost for the first modern use case at a midsize to large enterprise will be anywhere from $3 million to $22 million. Get this data point as you take the next steps on your journey into the highest spend and return item for most companies in the next several years.
Data at the Speed of Business with Data Mastering and GovernanceDATAVERSITY
Do you ever wonder how data-driven organizations fuel analytics, improve customer experience, and accelerate business productivity? They are successful by governing and mastering data effectively so they can get trusted data to those who need it faster. Efficient data discovery, mastering and democratization is critical for swiftly linking accurate data with business consumers. When business teams can quickly and easily locate, interpret, trust, and apply data assets to support sound business judgment, it takes less time to see value.
Join data mastering and data governance experts from Informatica—plus a real-world organization empowering trusted data for analytics—for a lively panel discussion. You’ll hear more about how a single cloud-native approach can help global businesses in any economy create more value—faster, more reliably, and with more confidence—by making data management and governance easier to implement.
What is data literacy? Which organizations, and which workers in those organizations, need to be data-literate? There are seemingly hundreds of definitions of data literacy, along with almost as many opinions about how to achieve it.
In a broader perspective, companies must consider whether data literacy is an isolated goal or one component of a broader learning strategy to address skill deficits. How does data literacy compare to other types of skills or “literacy” such as business acumen?
This session will position data literacy in the context of other worker skills as a framework for understanding how and where it fits and how to advocate for its importance.
Building a Data Strategy – Practical Steps for Aligning with Business GoalsDATAVERSITY
Developing a Data Strategy for your organization can seem like a daunting task – but it’s worth the effort. Getting your Data Strategy right can provide significant value, as data drives many of the key initiatives in today’s marketplace – from digital transformation, to marketing, to customer centricity, to population health, and more. This webinar will help demystify Data Strategy and its relationship to Data Architecture and will provide concrete, practical ways to get started.
Uncover how your business can save money and find new revenue streams.
Driving profitability is a top priority for companies globally, especially in uncertain economic times. It's imperative that companies reimagine growth strategies and improve process efficiencies to help cut costs and drive revenue – but how?
By leveraging data-driven strategies layered with artificial intelligence, companies can unlock untapped potential, save money, and drive profitability.
In this webinar, you'll learn:
- How your company can leverage data and AI to reduce spending and costs
- Ways you can monetize data and AI and uncover new growth strategies
- How different companies have implemented these strategies to achieve cost optimization benefits
Data Catalogs Are the Answer – What is the Question?DATAVERSITY
Organizations with governed metadata made available through their data catalog can answer questions their people have about the organization’s data. These organizations get more value from their data, protect their data better, gain improved ROI from data-centric projects and programs, and have more confidence in their most strategic data.
Join Bob Seiner for this lively webinar where he will talk about the value of a data catalog and how to build the use of the catalog into your stewards’ daily routines. Bob will share how the tool must be positioned for success and viewed as a must-have resource that is a steppingstone and catalyst to governed data across the organization.
In this webinar, Bob will focus on:
-Selecting the appropriate metadata to govern
-The business and technical value of a data catalog
-Building the catalog into people’s routines
-Positioning the data catalog for success
-Questions the data catalog can answer
Because every organization produces and propagates data as part of their day-to-day operations, data trends are becoming more and more important in the mainstream business world’s consciousness. For many organizations in various industries, though, comprehension of this development begins and ends with buzzwords: “Big Data,” “NoSQL,” “Data Scientist,” and so on. Few realize that all solutions to their business problems, regardless of platform or relevant technology, rely to a critical extent on the data model supporting them. As such, data modeling is not an optional task for an organization’s data effort, but rather a vital activity that facilitates the solutions driving your business. Since quality engineering/architecture work products do not happen accidentally, the more your organization depends on automation, the more important the data models driving your organization’s engineering and architecture activities become. This webinar illustrates data modeling as a key activity upon which so much technology and business investment depends.
Specific learning objectives include:
- Understanding what types of challenges require data modeling to be part of the solution
- How automation requires standardization, which data modeling techniques can deliver
- Why only a working partnership between data and the business can produce useful outcomes
Analytics play a critical role in supporting strategic business initiatives. Despite the obvious value to analytic professionals of providing the analytics for these initiatives, many executives question the economic return of analytics as well as data lakes, machine learning, master data management, and the like.
Technology professionals need to calculate and present business value in terms business executives can understand. Unfortunately, most IT professionals lack the knowledge required to develop comprehensive cost-benefit analyses and return on investment (ROI) measurements.
This session provides a framework to help technology professionals research, measure, and present the economic value of a proposed or existing analytics initiative, no matter what form the business benefit takes. The session will provide practical advice about how to calculate ROI, the formulas involved, and how to collect the necessary information.
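As a minimal sketch of the kind of arithmetic such an ROI framework rests on (the benefit and cost figures below are hypothetical placeholders, not data from the session), return on investment is typically computed as net benefit over cost:

```python
def roi(total_benefit: float, total_cost: float) -> float:
    """Simple return on investment: net benefit as a fraction of cost."""
    return (total_benefit - total_cost) / total_cost

# Hypothetical analytics initiative: $1.2M in measured benefit
# (revenue lift plus cost savings) against $800K total cost.
print(f"ROI: {roi(1_200_000, 800_000):.0%}")  # ROI: 50%
```

The hard part in practice is not the formula but defending the benefit number, which is why the session focuses on how to collect that information.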
How a Semantic Layer Makes Data Mesh Work at ScaleDATAVERSITY
Data Mesh is a trending approach to building a decentralized data architecture by leveraging a domain-oriented, self-service design. However, the pure definition of Data Mesh lacks a center of excellence or central data team and doesn’t address the need for a common approach for sharing data products across teams. The semantic layer is emerging as a key component to supporting a Hub and Spoke style of organizing data teams by introducing data model sharing, collaboration, and distributed ownership controls.
This session will explain how data teams can define common models and definitions with a semantic layer to decentralize analytics product creation using a Hub and Spoke architecture.
Attend this session to learn about:
- The role of a Data Mesh in the modern cloud architecture.
- How a semantic layer can serve as the binding agent to support decentralization.
- How to drive self service with consistency and control.
Enterprise data literacy. A worthy objective? Certainly! A realistic goal? That remains to be seen. As companies consider investing in data literacy education, questions arise about its value and purpose. While the destination – having a data-fluent workforce – is attractive, we wonder how (and if) we can get there.
Kicking off this webinar series, we begin with a panel discussion to explore the landscape of literacy, including expert positions and results from focus groups:
- why it matters,
- what it means,
- what gets in the way,
- who needs it (and how much they need),
- what companies believe it will accomplish.
In this engaging discussion about literacy, we will set the stage for future webinars to answer specific questions and feature successful literacy efforts.
The Data Trifecta – Privacy, Security & Governance Race from Reactivity to Re...DATAVERSITY
Change is hard, especially in response to negative stimuli or what is perceived as negative stimuli. So organizations need to reframe how they think about data privacy, security and governance, treating them as value centers to 1) ensure enterprise data can flow where it needs to, 2) prevent – not just react – to internal and external threats, and 3) comply with data privacy and security regulations.
Working together, these roles can accelerate faster access to approved, relevant and higher quality data – and that means more successful use cases, faster speed to insights, and better business outcomes. However, both new information and tools are required to make the shift from defense to offense, reducing data drama while increasing its value.
Join us for this panel discussion with experts in these fields as they discuss:
- Recent research about where data privacy, security and governance stand
- The most valuable enterprise data use cases
- The common obstacles to data value creation
- New approaches to data privacy, security and governance
- Their advice on how to shift from a reactive to resilient mindset/culture/organization
You’ll be educated, entertained and inspired by this panel and their expertise in using the data trifecta to innovate more often, operate more efficiently, and differentiate more strategically.
Emerging Trends in Data Architecture – What’s the Next Big Thing?DATAVERSITY
With technological innovation and change occurring at an ever-increasing rate, it’s hard to keep track of what’s hype and what can provide practical value for your organization. Join this webinar to see the results of a recent DATAVERSITY survey on emerging trends in Data Architecture, along with practical commentary and advice from industry expert Donna Burbank.
Data Governance Trends - A Look Backwards and ForwardsDATAVERSITY
As DATAVERSITY’s RWDG series hurtles into our 12th year, this webinar takes a quick look behind us, evaluates the present, and predicts the future of Data Governance. Based on webinar numbers, hot Data Governance topics have evolved over the years from policies and best practices, roles and tools, and data catalogs and frameworks, to supporting data mesh and fabric, artificial intelligence, virtualization, literacy, and metadata governance.
Join Bob Seiner as he reflects on the past and what has and has not worked, while sharing examples of enterprise successes and struggles. In this webinar, Bob will challenge the audience to stay a step ahead by learning from the past and blazing a new trail into the future of Data Governance.
In this webinar, Bob will focus on:
- Data Governance’s past, present, and future
- How trials and tribulations evolve to success
- Leveraging lessons learned to improve productivity
- The great Data Governance tool explosion
- The future of Data Governance
Data Governance Trends and Best Practices To Implement TodayDATAVERSITY
Would you share your bank account information on social media? How about shouting your social security number on the New York City subway? We didn’t think so either – that’s why data governance is consistently top of mind.
In this webinar, we’ll discuss the common Cloud data governance best practices – and how to apply them today. Join us to uncover Google Cloud’s investment in data governance and learn practical and doable methods around key management and confidential computing. Hear real customer experiences and leave with insights that you can share with your team. Let’s get solving.
Topics that you will hear addressed in this webinar:
- Understanding the basics of Cloud Incident Response (IR) and anticipated data governance trends
- Best practices for key management and applying data governance to your day-to-day
- The next wave of Confidential Computing and how to get started, including a demo
It is a fascinating, explosive time for enterprise analytics.
It is from the position of analytics leadership that the enterprise mission will be executed and company leadership will emerge. The data professional is absolutely sitting on the performance of the company in this information economy and has an obligation to demonstrate the possibilities and originate the architecture, data, and projects that will deliver analytics. After all, no matter what business you’re in, you’re in the business of analytics.
The coming years will be full of big changes in enterprise analytics and data architecture. William will kick off the fifth year of the Advanced Analytics series with a discussion of the trends winning organizations should build into their plans, expectations, vision, and awareness now.
Too often I hear the question “Can you help me with our data strategy?” Unfortunately, for most, this is the wrong request because it focuses on the least valuable component: the data strategy itself. A more useful request is: “Can you help me apply data strategically?” Yes, at early maturity phases the process of developing strategic thinking about data is more important than the actual product! Trying to write a good (much less perfect) data strategy on the first attempt is generally not productive – particularly given the widespread acceptance of Mike Tyson’s truism: “Everybody has a plan until they get punched in the face.” This program refocuses efforts on learning how to iteratively improve the way data is strategically applied. This will permit data-based strategy components to keep up with agile, evolving organizational strategies. It also contributes to three primary organizational data goals. Learn how to improve the following:
- Your organization’s data
- The way your people use data
- The way your people use data to achieve your organizational strategy
This will help in ways never imagined. Data are your sole non-depletable, non-degradable, durable strategic assets, and they are pervasively shared across every organizational area. Addressing existing challenges programmatically includes overcoming necessary but insufficient prerequisites and developing a disciplined, repeatable means of improving business objectives. This process (based on the theory of constraints) is where the strategic data work really occurs as organizations identify prioritized areas where better assets, literacy, and support (data strategy components) can help an organization better achieve specific strategic objectives. Then the process becomes lather, rinse, and repeat. Several complementary concepts are also covered, including:
- A cohesive argument for why data strategy is necessary for effective data governance
- An overview of prerequisites for effective strategic use of data strategy, as well as common pitfalls
- A repeatable process for identifying and removing data constraints
- The importance of balancing business operation and innovation
Who Should Own Data Governance – IT or Business?DATAVERSITY
The question is asked all the time: “What part of the organization should own your Data Governance program?” The typical answers are “the business” and “IT (information technology).” Another answer to that question is “Yes.” The program must be owned and reside somewhere in the organization. You may ask yourself if there is a correct answer to the question.
Join this new RWDG webinar with Bob Seiner where Bob will answer the question that is the title of this webinar. Determining ownership of Data Governance is a vital first step. Figuring out the appropriate part of the organization to manage the program is an important second step. This webinar will help you address these questions and more.
In this session Bob will share:
- What is meant by “the business” when it comes to owning Data Governance
- Why some people say that Data Governance in IT is destined to fail
- Examples of IT positioned Data Governance success
- Considerations for answering the question in your organization
- The final answer to the question of who should own Data Governance
It is clear that Data Management best practices exist, and so does a useful process for improving existing Data Management practices. The question arises: Since we understand the goal, how does one design a process for achieving Data Management goals? This program describes what must be done at the programmatic level to achieve better data use and a way to implement this as part of your data program. The approach combines DMBoK content and CMMI/DMM processes – giving organizations the opportunity to benefit from the best of both. It also permits organizations to understand:
- Their current Data Management practices
- Strengths that should be leveraged
- Remediation opportunities
MLOps – Applying DevOps to Competitive AdvantageDATAVERSITY
MLOps is a practice for collaboration between Data Science and operations to manage the production machine learning (ML) lifecycles. As an amalgamation of “machine learning” and “operations,” MLOps applies DevOps principles to ML delivery, enabling the delivery of ML-based innovation at scale to result in:
Faster time to market of ML-based solutions
More rapid rate of experimentation, driving innovation
Assurance of quality, trustworthiness, and ethical AI
MLOps is essential for scaling ML. Without it, enterprises risk struggling with costly overhead and stalled progress. Several vendors have emerged with offerings to support MLOps: the major offerings are Microsoft Azure ML and Google Vertex AI. We looked at these offerings from the perspective of enterprise features and time-to-value.
Adjusting primitives for graph : SHORT REPORT / NOTESSubhajit Sahu
Graph algorithms, like PageRank, commonly operate on graphs stored in Compressed Sparse Row (CSR) form, an adjacency-list-based graph representation.
Multiply with different modes (map)
1. Performance of sequential execution based vs OpenMP based vector multiply.
2. Comparing various launch configs for CUDA based vector multiply.
Sum with different storage types (reduce)
1. Performance of vector element sum using float vs bfloat16 as the storage type.
Sum with different modes (reduce)
1. Performance of sequential execution based vs OpenMP based vector element sum.
2. Performance of memcpy vs in-place based CUDA based vector element sum.
3. Comparing various launch configs for CUDA based vector element sum (memcpy).
4. Comparing various launch configs for CUDA based vector element sum (in-place).
Sum with in-place strategies of CUDA mode (reduce)
1. Comparing various launch configs for CUDA based vector element sum (in-place).
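The float vs. bfloat16 comparison under “Sum with different storage types” can be sketched in plain Python by keeping only the top 16 bits of a float32 bit pattern, a rough stand-in for bfloat16 with round-toward-zero behavior (real hardware, and the CUDA code these notes describe, may round to nearest instead):

```python
import struct

def to_bfloat16(x: float) -> float:
    # Approximate bfloat16 by keeping only the top 16 bits of the
    # float32 representation (sign, 8 exponent bits, 7 mantissa bits).
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    return struct.unpack("<f", struct.pack("<I", bits & 0xFFFF0000))[0]

def sum_bf16(values):
    # Accumulate with the running sum truncated to bfloat16 each step.
    acc = 0.0
    for v in values:
        acc = to_bfloat16(acc + to_bfloat16(v))
    return acc

values = [0.001] * 10_000
print(sum(values))       # ~10.0 with float64 accumulation
print(sum_bf16(values))  # far smaller: the sum stalls once 0.001 < 1 ulp of acc
```

This is why such comparisons matter: with only 8 bits of mantissa, a bfloat16 accumulator stops growing as soon as each addend falls below the accumulator's unit in the last place.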
As Europe's leading economic powerhouse and the fourth-largest economy globally, Germany stands at the forefront of innovation and industrial might. Renowned for its precision engineering and high-tech sectors, Germany's economic structure is heavily supported by a robust service industry, accounting for approximately 68% of its GDP. This economic clout and strategic geopolitical stance position Germany as a focal point in the global cyber threat landscape.
In the face of escalating global tensions, particularly those emanating from geopolitical disputes with nations like Russia and China, Germany has witnessed a significant uptick in targeted cyber operations. Our analysis indicates a marked increase in cyberattack sophistication aimed at critical infrastructure and key industrial sectors. These attacks range from ransomware campaigns to Advanced Persistent Threats (APTs), threatening national security and business integrity.
🔑 Key findings include:
🔍 Increased frequency and complexity of cyber threats.
🔍 Escalation of state-sponsored and criminally motivated cyber operations.
🔍 Active dark web exchanges of malicious tools and tactics.
Our comprehensive report delves into these challenges, using a blend of open-source and proprietary data collection techniques. By monitoring activity on critical networks and analyzing attack patterns, our team provides a detailed overview of the threats facing German entities.
This report aims to equip stakeholders across public and private sectors with the knowledge to enhance their defensive strategies, reduce exposure to cyber risks, and reinforce Germany's resilience against cyber threats.
Techniques to optimize the pagerank algorithm usually fall in two categories. One is to try reducing the work per iteration, and the other is to try reducing the number of iterations. These goals are often at odds with one another. Skipping computation on vertices which have already converged has the potential to save iteration time. Skipping in-identical vertices, with the same in-links, helps reduce duplicate computations and thus could help reduce iteration time. Road networks often have chains which can be short-circuited before pagerank computation to improve performance. Final ranks of chain nodes can be easily calculated. This could reduce both the iteration time, and the number of iterations. If a graph has no dangling nodes, pagerank of each strongly connected component can be computed in topological order. This could help reduce the iteration time, no. of iterations, and also enable multi-iteration concurrency in pagerank computation. The combination of all of the above methods is the STICD algorithm. [sticd] For dynamic graphs, unchanged components whose ranks are unaffected can be skipped altogether.
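As an illustrative sketch of one of these techniques (skipping computation on converged vertices), not the STICD implementation itself, a plain-Python power-iteration PageRank can simply stop updating vertices whose rank change has fallen below a tolerance. This sketch assumes the graph has no dangling vertices:

```python
def pagerank_skip_converged(out_links, damping=0.85, tol=1e-10, max_iter=100):
    """Power-iteration PageRank that stops updating vertices whose rank
    change has dropped below tol. Assumes every vertex has out-links."""
    n = len(out_links)
    rank = [1.0 / n] * n
    converged = [False] * n
    # Precompute in-links so each vertex can pull rank from its sources.
    in_links = [[] for _ in range(n)]
    for u, outs in enumerate(out_links):
        for v in outs:
            in_links[v].append(u)
    for _ in range(max_iter):
        new_rank = rank[:]
        active = 0
        for v in range(n):
            if converged[v]:
                continue  # skip work for already-converged vertices
            r = sum(rank[u] / len(out_links[u]) for u in in_links[v])
            new_rank[v] = (1 - damping) / n + damping * r
            if abs(new_rank[v] - rank[v]) < tol:
                converged[v] = True
            else:
                active += 1
        rank = new_rank
        if active == 0:
            break
    return rank

# Tiny 3-vertex cycle: by symmetry every vertex should end up with rank 1/3.
ranks = pagerank_skip_converged([[1], [2], [0]])
```

A real implementation has to be more careful: a vertex marked converged may need re-activation if its in-neighbors later change, which is part of what makes combining these optimizations nontrivial.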
Explore our comprehensive data analysis project presentation on predicting product ad campaign performance. Learn how data-driven insights can optimize your marketing strategies and enhance campaign effectiveness. Perfect for professionals and students looking to understand the power of data analysis in advertising. for more details visit: https://bostoninstituteofanalytics.org/data-science-and-artificial-intelligence/
Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23...John Andrews
SlideShare Description for "Chatty Kathy - UNC Bootcamp Final Project Presentation"
Title: Chatty Kathy: Enhancing Physical Activity Among Older Adults
Description:
Discover how Chatty Kathy, an innovative project developed at the UNC Bootcamp, aims to tackle the challenge of low physical activity among older adults. Our AI-driven solution uses peer interaction to boost and sustain exercise levels, significantly improving health outcomes. This presentation covers our problem statement, the rationale behind Chatty Kathy, synthetic data and persona creation, model performance metrics, a visual demonstration of the project, and potential future developments. Join us for an insightful Q&A session to explore the potential of this groundbreaking project.
Project Team: Jay Requarth, Jana Avery, John Andrews, Dr. Dick Davis II, Nee Buntoum, Nam Yeongjin & Mat Nicholas
Opendatabay - Open Data Marketplace.pptxOpendatabay
Opendatabay.com unlocks the power of data for everyone. Open Data Marketplace fosters a collaborative hub for data enthusiasts to explore, share, and contribute to a vast collection of datasets.
First ever open hub for data enthusiasts to collaborate and innovate. A platform to explore, share, and contribute to a vast collection of datasets. Through robust quality control and innovative technologies like blockchain verification, opendatabay ensures the authenticity and reliability of datasets, empowering users to make data-driven decisions with confidence. Leverage cutting-edge AI technologies to enhance the data exploration, analysis, and discovery experience.
From intelligent search and recommendations to automated data productisation and quotation, Opendatabay AI-driven features streamline the data workflow. Finding the data you need shouldn't be a complex. Opendatabay simplifies the data acquisition process with an intuitive interface and robust search tools. Effortlessly explore, discover, and access the data you need, allowing you to focus on extracting valuable insights. Opendatabay breaks new ground with a dedicated, AI-generated, synthetic datasets.
Leverage these privacy-preserving datasets for training and testing AI models without compromising sensitive information. Opendatabay prioritizes transparency by providing detailed metadata, provenance information, and usage guidelines for each dataset, ensuring users have a comprehensive understanding of the data they're working with. By leveraging a powerful combination of distributed ledger technology and rigorous third-party audits Opendatabay ensures the authenticity and reliability of every dataset. Security is at the core of Opendatabay. Marketplace implements stringent security measures, including encryption, access controls, and regular vulnerability assessments, to safeguard your data and protect your privacy.
Empowering the Data Analytics Ecosystem: A Laser Focus on Value
The data analytics ecosystem thrives when every component functions at its peak, unlocking the true potential of data. Here's a laser focus on key areas for an empowered ecosystem:
1. Democratize Access, Not Data:
Granular Access Controls: Provide users with self-service tools tailored to their specific needs, preventing data overload and misuse.
Data Catalogs: Implement robust data catalogs for easy discovery and understanding of available data sources.
2. Foster Collaboration with Clear Roles:
Data Mesh Architecture: Break down data silos by creating a distributed data ownership model with clear ownership and responsibilities.
Collaborative Workspaces: Utilize interactive platforms where data scientists, analysts, and domain experts can work seamlessly together.
3. Leverage Advanced Analytics Strategically:
AI-powered Automation: Automate repetitive tasks like data cleaning and feature engineering, freeing up data talent for higher-level analysis.
Right-Tool Selection: Strategically choose the most effective advanced analytics techniques (e.g., AI, ML) based on specific business problems.
4. Prioritize Data Quality with Automation:
Automated Data Validation: Implement automated data quality checks to identify and rectify errors at the source, minimizing downstream issues.
Data Lineage Tracking: Track the flow of data throughout the ecosystem, ensuring transparency and facilitating root cause analysis for errors.
5. Cultivate a Data-Driven Mindset:
Metrics-Driven Performance Management: Align KPIs and performance metrics with data-driven insights to ensure actionable decision making.
Data Storytelling Workshops: Equip stakeholders with the skills to translate complex data findings into compelling narratives that drive action.
Benefits of a Precise Ecosystem:
Sharpened Focus: Precise access and clear roles ensure everyone works with the most relevant data, maximizing efficiency.
Actionable Insights: Strategic analytics and automated quality checks lead to more reliable and actionable data insights.
Continuous Improvement: Data-driven performance management fosters a culture of learning and continuous improvement.
Sustainable Growth: Empowered by data, organizations can make informed decisions to drive sustainable growth and innovation.
By focusing on these precise actions, organizations can create an empowered data analytics ecosystem that delivers real value by driving data-driven decisions and maximizing the return on their data investment.
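The “Automated Data Validation” point above can be made concrete with a small sketch. The rule names and fields below are hypothetical examples, not a reference to any particular tool:

```python
def validate_rows(rows, rules):
    """Apply named check functions to each row; return (clean, errors)."""
    clean, errors = [], []
    for i, row in enumerate(rows):
        failed = [name for name, check in rules.items() if not check(row)]
        if failed:
            errors.append((i, row, failed))  # record which checks failed
        else:
            clean.append(row)
    return clean, errors

# Hypothetical quality rules for a customer feed.
rules = {
    "has_id": lambda r: bool(r.get("id")),
    "valid_age": lambda r: isinstance(r.get("age"), int) and 0 <= r["age"] <= 130,
}
rows = [{"id": "a1", "age": 34}, {"id": "", "age": 34}, {"id": "a3", "age": -5}]
clean, errors = validate_rows(rows, rules)
print(len(clean), len(errors))  # 1 2
```

Running checks like these at the point of ingestion, rather than downstream, is what lets errors be rectified at the source.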
1. 2021 Trends in Enterprise
Advanced Analytics
Presented by: William McKnight
“#1 Global Influencer in Data Warehousing” Onalytica
President, McKnight Consulting Group
A 2 time Inc. 5000 Company
@williammcknight
www.mcknightcg.com
(214) 514-1444
Second Thursday of Every Month, at 2:00 ET
#AdvAnalytics
2. Dataversity Webcast
Vertica and Pure Storage address your variable workloads
on-premises with a cloud-optimized architecture
Jeff Healey
Sr. Director of Vertica Marketing
E: jeff.a.healey@vertica.com
Miroslav Klivansky
Field Solution Evangelist
mklivansky@purestorage.com
3. What is Vertica?
Vertica is the leading unified analytics warehouse built for the scale and complexity of today’s data-driven world. It combines the power of a high-performance, MPP query engine with advanced analytics and Machine Learning.
SQL Database: Load and store data in a data warehouse designed for blazingly fast analytics
Query Engine: Ask complex analytical questions and get fast answers regardless of where the data resides
Analytics & ML: Create, train, and deploy advanced analytics and machine learning models at massive scale
4. Remove scale, performance, and capacity constraints
Get data quickly enough to act upon it, explore your data interactively, and enable everyone to make their own data-driven decisions
Fear of more users or growing data volumes is a thing of the past
Scale Data Volumes + Scale Users
Vertica Analytics Platform = SQL Database + Query Engine + Analytics & ML
5. Benefits of Vertica in Eon Mode
Deliver Vertica with the cloud economics promise: consume only what you need, when you need it, through separation of compute from storage.
Scale Infrastructure Linearly. Elastically scale your analytics for workload changes, seasonality, or peak load times.
Improved Database Operations. Faster node recovery, superior workload balancing, and more rapid compute provisioning.
Isolate Analytic Workloads. Designate specific nodes as a subcluster to isolate workloads and support multi-tenancy.
Hibernate. Stop and start analytics more efficiently by hibernating compute nodes when they’re not needed.
6. Vertica powers the applications and services that enable our data-driven world.
A Day in the Life with Vertica
7. Learn More - Vertica in Eon Mode for Pure Storage
Visit www.vertica.com/pure today
9. Vertica Eon Mode Requirements
Separation of Compute from Storage = Data Safety + Performance at Scale + Capacity for Data Growth + Linear Scalability + Tuned for Everything + Easy to Manage
10. FLASHBLADE: PURPOSE-BUILT FOR MODERN ANALYTICS
BLADE: Powerful, elastic data processing & storage unit
PURITY: Massively distributed software for limitless scale
SCALE-OUT FABRIC: Software-defined fabric that scales linearly with more data & clients
11. BORN FOR UNSTRUCTURED DATA
FLASHBLADE WAS BUILT TO ADDRESS MODERN DATA CHALLENGES
Timeline of data milestones, 1980–2020: GPS/GIS (1983), NFS (1985), WWW (1989), 1st SSD ships (1991), LDAP, Wikis, Java and IPv6 (1995), MP3 (1996), IoT (1999), AWS (2002), LinkedIn (2003), first human genome sequence completed (2003), Hadoop and the Era of Analytics (2005), S3 (2006), iPhone (2007), Dropbox (2007), bitcoin (2009), Nest Thermostat (2011), Machine Learning recognizes cats (2012), Amazon Echo (2015), Kubernetes (2015), self-driving cars and edge computing widely adopted (2018)
14. INTEGRATED NETWORKING
SOFTWARE-DEFINED NETWORKING: 2x Broadcom Trident-II Ethernet switch ASICs collapse three networks – frontend, backend, and control – into one high-performance fabric
8x 40Gb/s QSFP connections into customer top-of-rack switches
FlashBlade Chassis: up to 15 blades, 4RU height, N+2 redundant, heals in place
Blades: capacity & performance, DirectFlash NAND, embedded NVRAM
15. FLASHBLADE BLADE
INTEL XEON SYSTEM-ON-A-CHIP: compute + networking + chipset; low-power, low-cost design; 8x full XEON cores
DRAM MEMORY
PROGRAMMABLE PROCESSORS: FPGA
NAND FLASH: 17TB or 52TB per blade
INTEGRATED NV-RAM: supercapacitor-backed write buffer
PCIE CONNECTIVITY: CPUs & flash communicate via custom protocol over PCIe
16. SOUL OF FLASHBLADE IS PARALLEL
POWERING 75-BLADE SCALE IN A SINGLE IP WITH PURITY FOR FLASHBLADE
Key-value database store for distributed partitions: billions and billions of objects
Native object, native NFS/SMB
17. MODERNIZE YOUR DATA EXPERIENCE
Expedite & automate troubleshooting: VM Analytics
Plan with ease: AI-driven workload planner
Take the guesswork out of management: cloud-based management
Global information at your fingertips: Pure1 mobile app
18. Learn More - Vertica in Eon Mode for Pure Storage
Visit www.vertica.com/pure today
19. 2021 Trends in Enterprise
Advanced Analytics
Presented by: William McKnight
“#1 Global Influencer in Data Warehousing” Onalytica
President, McKnight Consulting Group
A 2 time Inc. 5000 Company
@williammcknight
www.mcknightcg.com
(214) 514-1444
Second Thursday of Every Month, at 2:00 ET
#AdvAnalytics
20. William McKnight
President, McKnight Consulting Group
• Frequent keynote speaker and trainer internationally
• Consulted to many Global 1000 companies
• Hundreds of articles, blogs and white papers in publication
• Focused on delivering business value and solving business problems utilizing proven, streamlined approaches to information management
• Former Database Engineer, Fortune 50 Information Technology executive and Ernst & Young Entrepreneur of the Year Finalist
• Owner/consultant: 2018 and 2017 Inc. 5000 strategy & implementation consulting firm
• 30 years of information management and DBMS experience
21. McKnight Consulting Group Offerings
Strategy
• Trusted Advisor
• Action Plans
• Roadmaps
• Tool Selections
• Program Management
Training
• Classes
• Workshops
Implementation
• Data/Data Warehousing/Business Intelligence/Analytics
• Master Data Management
• Governance/Quality
• Big Data
22. Why Are Trends Important?
• It is imperative to see trends that affect your
business to know how to respond
• Plan for and deal with change
• Better to be at the beginning of the trend rather
than the end
• The wants, needs, and tastes of your customers change
• Make you a leader, not a follower
• Grow your business ideas
• Give you ideas of what to improve in your business
23. Information Management Leaders
• Information Management leaders of tomorrow
can advance maturity while also solving
business issues
– There’s no budget for “staying on trends”
• Information Management leaders must pick
their winning (i.e., multi-year sustainable)
approaches and get on board
24. The Money Tree Doesn’t Exist
Hitch your Trend Pursuit
Efforts to a Budget Delivering ROI
25. Those Who Were Less Impacted by 2020
• Cloud-First
• Microservices-Based
• Data is a separate function
• Agile Development
• Master Data
27. Last Year’s Trends
• Data Takes Steps to the Balance Sheet
• Explosion in Sensor-Based Time-Series Data
• Business Intelligence Interfaces Upheaval
• ETL will be Nearly Automated
• Cloud Object Storage
• More Edge AI
• Data’s New Highest Use Will Be Training AI Algorithms
• Explainable AI
• Kubernetes and Containers
• Hybrid Databases
28. Factors to Watch in 2021
• Pandemic Footprint
• Vaccine Rollout
• Resiliency of Corporations
• Prioritization of Forward Factors
• Continued Preparedness Awareness
29. Top Trends in Enterprise Analytics for
2021 and Beyond
30. Remote Work Continues
• Some projects done all remote
• Or multiple people to 1 seat arrangements
• Remote Conferences
• Some Offices Prepare for Return
31. Led by Cloud Capabilities, Strong Tech Spending
Rebound in 2021
• CXOs are ready to open the floodgates
• Storage strong growth
– AWS Storage Revenue Approaching $10B
• Artificial Intelligence, Kubernetes Approaches
and Automation are driving corporate tech
budgets
32. Leading Organizations Are Increasing Their Focus on AI/ML
• Budgets for AI/ML increasing significantly
• Beyond Initial Use Cases
• Model Expansion in Production
33. Leading Organizations Are Increasing Their Focus on AI/ML (cont.)
• Collaborative AI
• Human/AI Hybrid Solutions
34. Model Deployment Takes Center Stage
• Model Deployment Will Rise to the Top Activity
of Data Professionals
• Data Scientists Will Continue to Wrangle Data
Since Most Data Environments are Mid/Low
Maturity
• Models Getting More Sophisticated
– Data Wrangling Increasing
– Continued Challenges to Data Maturity
• Organizations will struggle without MLOps
35. More Edge AI
• Embedded databases at the edge
• AI baked into the chips
• Decision-making at the edge
37. Explainable AI
• Combatting bias
• Responsible AI
• Regulations
• This trend will evaporate in time
38. Data Lakes
• The Rise of the LakeHouse
• Explosion in Sensor-Based Time-Series Data and
Edge AI
• Leveraging Cloud Storage for Data Lakes
• Data Integration Automation
39. New Technology Stacks: Shift from only data warehouses, lakes,
and ETL to data fabrics, AI, and pipelines
41. MLOps
• MLOps applies DevOps principles to ML delivery
• The ML process primarily revolves around creating, training and deploying
models
• Once trained and validated, models are deployed into an architecture that
can deal with large quantities of (often streamed) data, to enable insights to
be derived
• Development of such models can benefit from an iterative approach, so the
domain can be better understood, and the models improved
• It also needs a highly automated pipeline of tools; repositories to store and track models, code, and data lineage; and a target environment that can be deployed into at speed
• The result is an ML-enabled application: MLOps requires data scientists to
work alongside developers, and can therefore be seen as an extension of
DevOps to encompass the data and models used for ML
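The loop the MLOps slide describes (train, validate against a quality gate, register with a traceable version, then deploy) can be sketched in miniature. This is a toy illustration, not a real MLOps toolchain; the `train`, `validate`, and `register` helpers are invented for the example:

```python
import hashlib
import json
import statistics

def train(xs, ys):
    """Fit y = a*x + b by ordinary least squares in one dimension."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    return {"a": a, "b": my - a * mx}

def validate(model, xs, ys, max_mae=1.0):
    """Gate deployment on mean absolute error over a holdout set."""
    mae = statistics.fmean(abs(model["a"] * x + model["b"] - y) for x, y in zip(xs, ys))
    return mae <= max_mae

registry = {}  # version -> model; stands in for a real model repository

def register(model):
    """Version the model by content hash so every deployment is traceable."""
    version = hashlib.sha1(json.dumps(model, sort_keys=True).encode()).hexdigest()[:8]
    registry[version] = model
    return version

model = train([1, 2, 3, 4], [2, 4, 6, 8])
if validate(model, [5, 6], [10, 12]):
    print("deployed", register(model))
```

Real MLOps platforms add data lineage, automated retraining triggers, and rollback, but the same validate-before-register-before-deploy ordering applies.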
42. Automation
• Automated data discovery
• Auto-generated pipelines based on global experiences
• Joins by data
• Key variables updated with each new data point
• These, in turn, automatically execute the proper next best action
• Next best action determined by AI
• Enterprises will automate data cataloguing and profiling
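Automated profiling of the kind the slide predicts boils down to collecting per-column metadata without human input. A minimal sketch, with a hypothetical `profile_column` helper invented for illustration:

```python
def profile_column(values):
    """Collect the kind of per-column metadata an automated catalog records:
    an inferred type, a null rate, and a distinct count."""
    non_null = [v for v in values if v not in (None, "")]
    null_rate = round(1 - len(non_null) / len(values), 2)

    def is_numeric(v):
        try:
            float(v)
            return True
        except (TypeError, ValueError):
            return False

    inferred = "numeric" if non_null and all(is_numeric(v) for v in non_null) else "string"
    return {"type": inferred, "null_rate": null_rate, "distinct": len(set(non_null))}

print(profile_column(["1", "2", "", "3"]))  # {'type': 'numeric', 'null_rate': 0.25, 'distinct': 3}
```

Commercial catalogs run this kind of scan continuously across sources, which is what makes discovery and quality monitoring automatic rather than a manual audit.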
43. Open Source
• Data engineers are scarce and expensive
• More vendors rearchitecting to open source
• Vendors to compete on customer satisfaction and execution
44. Kubernetes
• Data analytics stack goes Kubernetes for both open source and commercial
• Winners go from thought to POC quickly
• Serverless architectures
45. We are at the start of General AI
• GPT-3 has opened a new chapter in machine learning.
– Its most striking feature is its generality.
– Only a few years ago, neural networks were built with functions
tuned to a specific task, such as translation or question answering.
Datasets were curated to reflect that task.
– GPT-3 has no task-specific functions, and it needs no special
dataset. It simply ingests as much text as possible and plays
its output forward.
• Somehow, in the calculation of the conditional probability
distribution across all those gigabytes of text, a function
emerges that can produce answers that are competitive on
any number of tasks.
• It is a breathtaking triumph of simplicity that probably has
many years of achievement ahead of it.
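The conditional-probability idea above scales down to a toy bigram model: count which word follows which, then normalize the counts. GPT-3 does this over learned representations at vastly larger scale, but the principle reads the same:

```python
from collections import Counter, defaultdict

def bigram_model(text):
    """Count next-word frequencies: the raw material for P(next | current)."""
    words = text.split()
    counts = defaultdict(Counter)
    for cur, nxt in zip(words, words[1:]):
        counts[cur][nxt] += 1
    return counts

def next_word_probs(counts, word):
    """Normalize counts into a conditional distribution over next words."""
    total = sum(counts[word].values())
    return {w: c / total for w, c in counts[word].items()}

model = bigram_model("the cat sat on the mat the cat ran")
print(next_word_probs(model, "the"))  # "cat" is twice as likely as "mat"
```

Generating text is then just repeated sampling from these conditional distributions, one token at a time.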
46. There’s more maturity in moving imperfectly than in merely defining the shortcomings perfectly
• Build credibility
• Don’t be afraid to fail
• Don’t talk yourself out of having a new beginning
• Have an open mind
• No plateaus are comfortable for long
• That resistance is not about making progress; it’s the journey
47. Winning Approaches in 2021
• Cloud Computing
• Artificial Intelligence
• Data Lakes
• Data Warehousing
• Master Data Management
• Agile Development
• Kubernetes
• Automation
• Data Quality
• Graph Data
• Organizational Change Management
• DevOps and MLOps
• Data Catalogs
• Data Governance