Platforming Your Data for
Success
Presented by: William McKnight
President, McKnight Consulting Group
williammcknight
www.mcknightcg.com
(214) 514-1444
William McKnight
President, McKnight Consulting Group
• Frequent keynote speaker and trainer internationally
• Consulted to many Global 1000 companies
• Hundreds of articles, blogs, white papers, field tests, etc.
in publication
• Focused on delivering business value and solving business
problems utilizing proven, streamlined approaches to
information management
• Former Database Engineer, Fortune 50 Information
Technology executive and Ernst & Young Entrepreneur of
the Year Finalist
• Owner/consultant: Data strategy and implementation
consulting firm
• 25+ years of information management and data
experience
McKnight Consulting Group Offerings
Strategy
• Trusted Advisor
• Action Plans
• Roadmaps
• Tool Selections
• Program Management
Training
• Classes
• Workshops
Implementation
• Data/Data Warehousing/Business Intelligence/Analytics
• Master Data Management
• Governance/Quality
• Big Data
This guy has nothing on us
• 1990’s: Just Give Me Some Data and Fast!
• 2000’s: Give Me Good Data But Do It Efficiently!
• 2010’s+: Give Me All Data Fast & Effectively!
AI Data
• Call center recordings and chat logs
• Streaming sensor data, historical maintenance records and
search logs
• Customer account data and purchase history
• Email response metrics
• Product catalogs and data sheets
• Public references
• YouTube video content audio tracks
• User website behaviors
• Sentiment analysis, user-generated content, social graph data,
and other external data sources
Priorities
Increasing Probability that Platform Selection Leads to Success
• Best Category and Top Tool Picked: 80%
• Best Category Picked: 70%
• Top 2 Category Picked: 60%
• Same Ol’ Platform: 50%
What is it?
• Operational Database
– Operational Real-Time
– Operational Big Data
• Operational Data Hub
• Master Data Management
• A Data Warehouse
• A Data Mart
– Dependent
– Independent
• A Data Lake
• Analytic Application
– Analytic Big Data Application
• Archive Storage
• A Staging Area
4 Major Decisions
• Decision #1: The Data Store Type
– The largest factor for distinguishing between databases and file-based scale-out system utilization is the data profile. The latter is best for
data that fits the loose label of 'unstructured' (or semi-structured) data, while more traditional data -- and smaller volumes of all data -- still
belong in a relational database.
• Decision #2: Data Store Placement
– You must also decide where to place your data store -- on-premises or in the cloud (and which cloud). In the past, the only clear choice for
most organizations was on-premises data. However, the costs of scale are gnawing away at the notion that this remains the best approach
for a data platform. For more on why databases are moving to the cloud, please read this article.
• Decision #3: The Workload Architecture
– You must keep in mind the distinction between operational and analytical workloads. Short transactional requests and more complex (often
longer) analytics requests demand different architectures. Analytics databases, though quite diverse, are the preferred platforms for the
analytics workload.
• Decision #4: The Node Architecture
– Node options range from general purpose to premium and from HDD to flash to all-memory storage. Consider volume types, read/write and
cache options, the balance of CPU and storage on the nodes, levels of IOPS, and levels of management.
Data Warehouses, Data Marts,
Data Lakes, Big Data
Data Warehousing
• Data Warehouses (still) have a lower
total cost of ownership than data
marts
• A data warehouse is a SHARED
platform
– Build once, use many
– Access at Data Warehouse
– Access by creating a mart off the DW
• Still A LOT cheaper than building from scratch
“… a subject-oriented, integrated, non-volatile, time-variant collection of
data, organized to support management needs.”
- Bill Inmon
On Relational
• Consistency
• Transactions
• Partitioning
• Arrays
• Inheritance
• UNION
• Columnar
• Storage Fluidity
• Custom data types
• Built in graph capabilities
• Caching
The Analytic Data Ecosystem
[Diagram: Data Lake, DW, DM, DM]
Data Warehouses Have Flavors
● The Customer Experience Transformation Data Warehouse focuses on
customer attributes and touchpoints to improve the value of
customers.
● The Asset Maximization with IoT Data Warehouse deals with the high
volume of edge data tracking the physical assets of the organization.
● The Operational Extension Data Warehouse supports company
operations directly with real-time analytics.
● The Risk Management Data Warehouse supports the ever-growing
compliance and reporting requirements and corporate risk.
● The Finance Modernization Data Warehouse handles the voluminous
financial reporting and ensures the bottom line is considered in every
aspect of the business.
● The Product Innovation Data Warehouse delivers all product-related
information into the decisions of the product life cycle.
Required for Modern Analytics
• In-database analytics
• In-memory capabilities
• Columnar orientation
• Modern programming languages
• New data types
Columnar Orientation
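The slide for this topic is a diagram, so the following is only a minimal sketch of the general idea: a row-oriented layout keeps each record together, while a column-oriented layout keeps each attribute together, so an aggregate over one attribute reads only that column. The table, field names, and values are hypothetical and are not taken from the presentation.

```python
# Hypothetical three-row table (illustrative names and values only).
rows = [
    {"order_id": 1, "region": "east", "amount": 120.0},
    {"order_id": 2, "region": "west", "amount": 80.0},
    {"order_id": 3, "region": "east", "amount": 50.0},
]


def to_columns(records):
    """Pivot a list of row dicts into a dict of column lists."""
    columns = {}
    for record in records:
        for name, value in record.items():
            columns.setdefault(name, []).append(value)
    return columns


columns = to_columns(rows)

# Row orientation: every full record is visited to sum one field.
total_from_rows = sum(record["amount"] for record in rows)

# Column orientation: only the "amount" column is read.
total_from_columns = sum(columns["amount"])

print(total_from_rows, total_from_columns)
```

Both totals are the same; the difference is how much data has to be touched to compute them, which is why analytic scans over a few attributes tend to favor a columnar layout.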
Object Storage Instances
• Object Storage instances/clusters have local
storage, i.e., on the physical drives mounted to the
instances themselves, that is HDFS and Hive
• Object Storage technologies access their cloud
vendor’s respective cloud storage—viz.:
– Amazon EMR accesses S3
– Dataproc accesses Google Cloud Storage
– HDI accesses Azure Data Lake Storage Gen2
• Local storage is used by the Object Storage
platform for housekeeping
Data Lakes with Analytic Access Pricing
• Pair a lake with an analytical engine that charges
only for what you use
• If you have a ton of data that can sit in cold storage
and only needs to be accessed or analyzed
occasionally, store it in Amazon S3/Azure Blob
Storage/Google Cloud Storage
– Use a database (on-premises or in the cloud) that can
create external tables that point at the storage
– Analysts can query directly against it, or draw down a
subset for some deeper/intensive analysis
– The GB/month storage fee plus data transfer/egress
fees will be much cheaper than leaving it in a data
warehouse
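The cost comparison in the last bullet can be sketched as simple arithmetic. This is an illustration only: the function names are hypothetical, and every rate and volume below is a placeholder, not a real vendor price. Whether the lake comes out cheaper depends on the actual rates and on how much data is drawn down each month.

```python
def monthly_lake_cost(stored_gb, storage_rate, egress_gb, egress_rate):
    """GB/month storage fee for all data, plus egress for the subset drawn down."""
    return stored_gb * storage_rate + egress_gb * egress_rate


def monthly_warehouse_cost(stored_gb, warehouse_rate):
    """Cost of leaving the same data in a data warehouse."""
    return stored_gb * warehouse_rate


# Placeholder inputs (assumptions, NOT real prices).
stored_gb = 50_000
lake = monthly_lake_cost(
    stored_gb, storage_rate=0.02, egress_gb=1_000, egress_rate=0.10
)
warehouse = monthly_warehouse_cost(stored_gb, warehouse_rate=0.25)

print(f"lake: {lake:.2f}  warehouse: {warehouse:.2f}")
```

Substituting the rates from a current price sheet, and a realistic estimate of monthly draw-down, turns the sketch into a quick sanity check before committing to either placement.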
Analytics Reference Architecture
[Diagram labels: Logs (Apps, Web, Devices); User tracking; Operational Metrics; Sensors; Transactional/Context Data (OLTP/ODS); Offload data; Raw Data Topics (JSON, AVRO); Processed Data Topics; Stream Processing; ETL, or EL with T in Spark; Batch; Low Latency; Applications; Files; Reach through or ETL/ELT; In-database analytics; Data Warehouse]
Notes on the Data Warehouse of the Future
• A separate compute and storage architecture is more achievable
• Compute resources (Map/Reduce, Hive, Spark, etc.) can be taken down,
scaled up or out, or interchanged without data movement
• Storage can be centralized, but compute can be distributed
• Major players have mechanisms to ensure consistency and achieve ACID-like
compliance
• Remote data replication to ensure redundancy and recovery
• Most of the query execution is processing time, and not data transport, so if
cloud compute and storage are in the same cloud vendor region,
performance is hardly impacted
Cloud Analytic Databases
Disruption Vectors
• Robustness of SQL
• Built-in optimization
• On-the-fly elasticity
• Dynamic Environment Adaptation
• Separation of compute from storage
• Support for diverse data
Cloud Analytic Databases in the Enterprise
• Can be used for test/dev or prod; disaster recovery; bursting
• CAPEX accounting
• The cloud now offers attractive options with better
economics, such as pay-as-you-go which is easier to justify
and budget, better logistics (streamlined administration and
management), and better scale (elasticity and the ability to
expand a cluster within minutes).
• While on-premises-first development brings a robust
database to the table, not all functions are always part of the
cloud solution and not all of the organizations behind them
have made the transition to cloud.
• Data gravity in the cloud.
Performance
• Managed cloud databases are the winner for
performance
• Querying cloud storage directly is inefficient, and
bringing subsets of data down for on-premises
processing takes time and incurs egress fees
• Performance testing on Hadoop engines like Hive,
Spark, and Impala has shown improvements in
performance, but they still lag significantly behind
the performance and power of a solid relational
cloud database/data warehouse
Administration
• Managed cloud databases win this category too.
• Many of the latest and greatest fully-managed cloud
database platforms are streamlining and subsuming
much of the DBA work these days. Things like indexes,
constraints, partitioning, and other DBA-level
performance tuning are fading away.
• Second is cloud storage, because of its very simple
architecture.
• Last place in Administration is Hadoop. You will still need
expertise to help diagnose why Spark executors fail or
Hive throws an exception or why troublesome queries
never finish.
However… Why Big Data Technologies for Big
Data
• New Data Types
• Schemaless
• Relaxed ACID
• Faster, Less Expensive Provisioning
• Programmer Freedoms
• Fault-Tolerant Redundancy
• Scale Out (to Webscale)
• Automatic Sharding
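As a minimal sketch of what automatic sharding means in practice, the snippet below assigns each record key to a shard by hashing it. This is one simple scheme chosen for illustration; real platforms may use range partitioning, consistent hashing, or other strategies. The function names, key format, and shard count are hypothetical.

```python
import hashlib


def shard_for(key: str, shard_count: int) -> int:
    """Map a key to a shard index using a stable hash of the key."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


def distribute(keys, shard_count):
    """Group keys by the shard they hash to."""
    shards = {index: [] for index in range(shard_count)}
    for key in keys:
        shards[shard_for(key, shard_count)].append(key)
    return shards


# Hypothetical keys spread over four shards.
keys = [f"customer-{i}" for i in range(100)]
shards = distribute(keys, shard_count=4)
for index, members in shards.items():
    print(index, len(members))
```

Because the hash is stable, the same key always lands on the same shard, so reads can be routed without a lookup table. The trade-off of plain modulo hashing is that changing the shard count moves most keys.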
Data Lake
Data Scientist Workbench and Data Warehouse Staging
[Diagram labels: OLTP Systems (ERP, CRM, Supply Chain, MDM, …); Stream or Batch Updates; DI; Data Lake; Data Scientists; Data Warehouse; Data Mart; Real-Time, Event-Driven Apps]
HDFS vs Cloud Storage
• Cloud Storage is more scalable and persistent
• Cloud Storage is backed up and supports
compression, lowering the cost of big data
• HDFS has better query performance
• Cloud Storage has object size and single PUT
limits that need workarounds
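One common workaround for a single-request size limit is to split a large object into parts and upload them separately. The sketch below only plans the byte ranges and does not call any provider API. The function name is hypothetical, the sizes in the example are illustrative, and actual limits vary by provider and should be taken from its documentation.

```python
def plan_parts(object_size: int, part_size: int):
    """Return (part_number, start, end_exclusive) byte ranges covering the object."""
    if object_size < 0 or part_size <= 0:
        raise ValueError("object_size must be >= 0 and part_size must be > 0")
    parts = []
    start = 0
    number = 1
    while start < object_size:
        end = min(start + part_size, object_size)
        parts.append((number, start, end))
        start = end
        number += 1
    return parts


# Illustrative sizes only: a 10-byte object split into 4-byte parts.
for part in plan_parts(object_size=10, part_size=4):
    print(part)
```

Each planned range would then be sent as its own upload request and reassembled on the storage side, using whatever multipart mechanism the chosen service provides.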
Leveraging Cloud Storage for Data Lakes
• A separate compute and storage architecture is more achievable
• Compute resources (Map/Reduce, Hive, Spark, etc.) can be taken
down, scaled up or out, or interchanged without data movement
• Storage can be centralized, but compute can be distributed
• Major players have mechanisms to ensure consistency and achieve
ACID-like compliance for remote data changes
• Some vendors also have remote data replication to ensure
redundancy and recovery
• Most of the query execution is processing time, and not data
transport, so if cloud compute and storage are in the same cloud
vendor region, performance is hardly impacted
Graph Databases
How to Identify a Graph Workload
• Workload is identified by “network, hierarchy,
tree, ancestry, structure” words
• You are planning to use the relational
performance tricks
• Your queries will be about pathing
• You are limiting queries by their complexity
• A quick POC with a graph database impresses
• You are looking for “non-obvious” patterns in
the data
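To make "queries about pathing" concrete, here is a minimal sketch of a shortest-path query over a small graph using breadth-first search. It illustrates the kind of question a graph workload asks, not how any particular graph database implements it. The node names and edges are hypothetical.

```python
from collections import deque


def shortest_path(graph, start, goal):
    """Return the shortest list of nodes from start to goal, or None if unreachable."""
    if start == goal:
        return [start]
    previous = {start: None}  # node -> node it was first reached from
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in graph.get(node, ()):
            if neighbor not in previous:
                previous[neighbor] = node
                if neighbor == goal:
                    # Walk back from goal to start, then reverse.
                    path = [goal]
                    while previous[path[-1]] is not None:
                        path.append(previous[path[-1]])
                    return path[::-1]
                queue.append(neighbor)
    return None


# Hypothetical directed graph stored as an adjacency list.
connections = {
    "ana": ["ben", "cy"],
    "ben": ["dee"],
    "cy": ["dee"],
    "dee": ["eli"],
    "eli": [],
}

print(shortest_path(connections, "ana", "eli"))
```

In a relational model the same question usually needs recursive joins whose depth is not known in advance, which is one reason pathing queries are a signal that a graph platform may fit.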
Graph Databases
[Diagram: graph with two bridge vertices labeled]
Future
GPU Databases
• A GPU Database performs at least some
operations using the GPU
• Uses SQL
• Uses each GPU’s local memory store
as a data cache that
operates many times faster than the CPU
cache or main memory itself
Operlytical Databases
• Combination row-based for transactions and
column-based for analytics
• Can process both orders and machine learning
models simultaneously with fast performance
and reduced complexity
Decentralized, decoupled, distributed
architectures
• Data Infrastructure as a Platform with complete domain
mastery as nodes
• Enterprise master data management
• Solving the federation challenge; nobody has done it yet,
someone will and it could be really big
• Moving away from conventional integration and its
technical debt and effort
• Containerization, microservice databases, and
embedded databases as part of the analytics
environment
• Integration speed uptake and maturity eliminating
redundant data stores
• Unification of batch and streaming and tools
MLOps – Applying DevOps to Competitive AdvantageDATAVERSITY
 

More from DATAVERSITY (20)

Architecture, Products, and Total Cost of Ownership of the Leading Machine Le...
Architecture, Products, and Total Cost of Ownership of the Leading Machine Le...Architecture, Products, and Total Cost of Ownership of the Leading Machine Le...
Architecture, Products, and Total Cost of Ownership of the Leading Machine Le...
 
Data at the Speed of Business with Data Mastering and Governance
Data at the Speed of Business with Data Mastering and GovernanceData at the Speed of Business with Data Mastering and Governance
Data at the Speed of Business with Data Mastering and Governance
 
Exploring Levels of Data Literacy
Exploring Levels of Data LiteracyExploring Levels of Data Literacy
Exploring Levels of Data Literacy
 
Building a Data Strategy – Practical Steps for Aligning with Business Goals
Building a Data Strategy – Practical Steps for Aligning with Business GoalsBuilding a Data Strategy – Practical Steps for Aligning with Business Goals
Building a Data Strategy – Practical Steps for Aligning with Business Goals
 
Make Data Work for You
Make Data Work for YouMake Data Work for You
Make Data Work for You
 
Data Catalogs Are the Answer – What is the Question?
Data Catalogs Are the Answer – What is the Question?Data Catalogs Are the Answer – What is the Question?
Data Catalogs Are the Answer – What is the Question?
 
Data Catalogs Are the Answer – What Is the Question?
Data Catalogs Are the Answer – What Is the Question?Data Catalogs Are the Answer – What Is the Question?
Data Catalogs Are the Answer – What Is the Question?
 
Data Modeling Fundamentals
Data Modeling FundamentalsData Modeling Fundamentals
Data Modeling Fundamentals
 
Showing ROI for Your Analytic Project
Showing ROI for Your Analytic ProjectShowing ROI for Your Analytic Project
Showing ROI for Your Analytic Project
 
How a Semantic Layer Makes Data Mesh Work at Scale
How a Semantic Layer Makes  Data Mesh Work at ScaleHow a Semantic Layer Makes  Data Mesh Work at Scale
How a Semantic Layer Makes Data Mesh Work at Scale
 
Is Enterprise Data Literacy Possible?
Is Enterprise Data Literacy Possible?Is Enterprise Data Literacy Possible?
Is Enterprise Data Literacy Possible?
 
The Data Trifecta – Privacy, Security & Governance Race from Reactivity to Re...
The Data Trifecta – Privacy, Security & Governance Race from Reactivity to Re...The Data Trifecta – Privacy, Security & Governance Race from Reactivity to Re...
The Data Trifecta – Privacy, Security & Governance Race from Reactivity to Re...
 
Emerging Trends in Data Architecture – What’s the Next Big Thing?
Emerging Trends in Data Architecture – What’s the Next Big Thing?Emerging Trends in Data Architecture – What’s the Next Big Thing?
Emerging Trends in Data Architecture – What’s the Next Big Thing?
 
Data Governance Trends - A Look Backwards and Forwards
Data Governance Trends - A Look Backwards and ForwardsData Governance Trends - A Look Backwards and Forwards
Data Governance Trends - A Look Backwards and Forwards
 
Data Governance Trends and Best Practices To Implement Today
Data Governance Trends and Best Practices To Implement TodayData Governance Trends and Best Practices To Implement Today
Data Governance Trends and Best Practices To Implement Today
 
2023 Trends in Enterprise Analytics
2023 Trends in Enterprise Analytics2023 Trends in Enterprise Analytics
2023 Trends in Enterprise Analytics
 
Data Strategy Best Practices
Data Strategy Best PracticesData Strategy Best Practices
Data Strategy Best Practices
 
Who Should Own Data Governance – IT or Business?
Who Should Own Data Governance – IT or Business?Who Should Own Data Governance – IT or Business?
Who Should Own Data Governance – IT or Business?
 
Data Management Best Practices
Data Management Best PracticesData Management Best Practices
Data Management Best Practices
 
MLOps – Applying DevOps to Competitive Advantage
MLOps – Applying DevOps to Competitive AdvantageMLOps – Applying DevOps to Competitive Advantage
MLOps – Applying DevOps to Competitive Advantage
 

Recently uploaded

NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...Boston Institute of Analytics
 
IMA MSN - Medical Students Network (2).pptx
IMA MSN - Medical Students Network (2).pptxIMA MSN - Medical Students Network (2).pptx
IMA MSN - Medical Students Network (2).pptxdolaknnilon
 
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一F sss
 
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...dajasot375
 
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...soniya singh
 
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM TRACKING WITH GOOGLE ANALYTICS.pptx
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM  TRACKING WITH GOOGLE ANALYTICS.pptxEMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM  TRACKING WITH GOOGLE ANALYTICS.pptx
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM TRACKING WITH GOOGLE ANALYTICS.pptxthyngster
 
MK KOMUNIKASI DATA (TI)komdat komdat.docx
MK KOMUNIKASI DATA (TI)komdat komdat.docxMK KOMUNIKASI DATA (TI)komdat komdat.docx
MK KOMUNIKASI DATA (TI)komdat komdat.docxUnduhUnggah1
 
RadioAdProWritingCinderellabyButleri.pdf
RadioAdProWritingCinderellabyButleri.pdfRadioAdProWritingCinderellabyButleri.pdf
RadioAdProWritingCinderellabyButleri.pdfgstagge
 
INTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTDINTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTDRafezzaman
 
20240419 - Measurecamp Amsterdam - SAM.pdf
20240419 - Measurecamp Amsterdam - SAM.pdf20240419 - Measurecamp Amsterdam - SAM.pdf
20240419 - Measurecamp Amsterdam - SAM.pdfHuman37
 
Top 5 Best Data Analytics Courses In Queens
Top 5 Best Data Analytics Courses In QueensTop 5 Best Data Analytics Courses In Queens
Top 5 Best Data Analytics Courses In Queensdataanalyticsqueen03
 
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一办理学位证纽约大学毕业证(NYU毕业证书)原版一比一
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一fhwihughh
 
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...Jack DiGiovanna
 
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...Sapana Sha
 
Predicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdfPredicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdfBoston Institute of Analytics
 
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024thyngster
 
PKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptxPKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptxPramod Kumar Srivastava
 
Easter Eggs From Star Wars and in cars 1 and 2
Easter Eggs From Star Wars and in cars 1 and 2Easter Eggs From Star Wars and in cars 1 and 2
Easter Eggs From Star Wars and in cars 1 and 217djon017
 
9654467111 Call Girls In Munirka Hotel And Home Service
9654467111 Call Girls In Munirka Hotel And Home Service9654467111 Call Girls In Munirka Hotel And Home Service
9654467111 Call Girls In Munirka Hotel And Home ServiceSapana Sha
 

Recently uploaded (20)

NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
 
IMA MSN - Medical Students Network (2).pptx
IMA MSN - Medical Students Network (2).pptxIMA MSN - Medical Students Network (2).pptx
IMA MSN - Medical Students Network (2).pptx
 
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
 
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
 
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
 
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM TRACKING WITH GOOGLE ANALYTICS.pptx
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM  TRACKING WITH GOOGLE ANALYTICS.pptxEMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM  TRACKING WITH GOOGLE ANALYTICS.pptx
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM TRACKING WITH GOOGLE ANALYTICS.pptx
 
MK KOMUNIKASI DATA (TI)komdat komdat.docx
MK KOMUNIKASI DATA (TI)komdat komdat.docxMK KOMUNIKASI DATA (TI)komdat komdat.docx
MK KOMUNIKASI DATA (TI)komdat komdat.docx
 
RadioAdProWritingCinderellabyButleri.pdf
RadioAdProWritingCinderellabyButleri.pdfRadioAdProWritingCinderellabyButleri.pdf
RadioAdProWritingCinderellabyButleri.pdf
 
INTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTDINTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTD
 
20240419 - Measurecamp Amsterdam - SAM.pdf
20240419 - Measurecamp Amsterdam - SAM.pdf20240419 - Measurecamp Amsterdam - SAM.pdf
20240419 - Measurecamp Amsterdam - SAM.pdf
 
Top 5 Best Data Analytics Courses In Queens
Top 5 Best Data Analytics Courses In QueensTop 5 Best Data Analytics Courses In Queens
Top 5 Best Data Analytics Courses In Queens
 
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一办理学位证纽约大学毕业证(NYU毕业证书)原版一比一
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一
 
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
 
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...
 
Predicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdfPredicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdf
 
E-Commerce Order PredictionShraddha Kamble.pptx
E-Commerce Order PredictionShraddha Kamble.pptxE-Commerce Order PredictionShraddha Kamble.pptx
E-Commerce Order PredictionShraddha Kamble.pptx
 
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
 
PKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptxPKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptx
 
Easter Eggs From Star Wars and in cars 1 and 2
Easter Eggs From Star Wars and in cars 1 and 2Easter Eggs From Star Wars and in cars 1 and 2
Easter Eggs From Star Wars and in cars 1 and 2
 
9654467111 Call Girls In Munirka Hotel And Home Service
9654467111 Call Girls In Munirka Hotel And Home Service9654467111 Call Girls In Munirka Hotel And Home Service
9654467111 Call Girls In Munirka Hotel And Home Service
 

ADV Slides: Platforming Your Data for Success – Databases, Hadoop, Managed Hadoop, Cloud Storage, Excel?

  • 1. Platforming Your Data for Success Presented by: William McKnight President, McKnight Consulting Group williammcknight www.mcknightcg.com (214) 514-1444
  • 2. William McKnight President, McKnight Consulting Group • Frequent keynote speaker and trainer internationally • Consulted to many Global 1000 companies • Hundreds of articles, blogs, white papers, field tests, etc. in publication • Focused on delivering business value and solving business problems utilizing proven, streamlined approaches to information management • Former Database Engineer, Fortune 50 Information Technology executive and Ernst & Young Entrepreneur of the Year Finalist • Owner/consultant: Data strategy and implementation consulting firm • 25+ years of information management and data experience
  • 3. McKnight Consulting Group Offerings • Strategy: Trusted Advisor, Action Plans, Roadmaps, Tool Selections, Program Management • Training: Classes, Workshops • Implementation: Data/Data Warehousing/Business Intelligence/Analytics, Master Data Management, Governance/Quality, Big Data
  • 4. This guy has nothing on us
  • 5. 1990’s: Just Give Me Some Data and Fast! • 2000’s: Give Me Good Data But Do It Efficiently! • 2010’s+: Give Me All Data Fast & Effectively! All Data!
  • 6. AI Data • Call center recordings and chat logs • Streaming sensor data, historical maintenance records and search logs • Customer account data and purchase history • Email response metrics • Product catalogs and data sheets • Public references • YouTube video content audio tracks • User website behaviors • Sentiment analysis, user-generated content, social graph data, and other external data sources
  • 8. Increasing Probability that Platform Selection Leads to Success (chart) • Same Ol’ Platform: 50% • Top 2 Category Picked: 60% • Best Category Picked: 70% • Best Category and Top Tool Picked: 80%
  • 9. What is it? • Operational Database – Operational Real-Time – Operational Big Data • Operational Data Hub • Master Data Management • A Data Warehouse • A Data Mart – Dependent – Independent • A Data Lake • Analytic Application – Analytic Big Data Application • Archive Storage • A Staging Area
  • 10. 4 Major Decisions • Decision #1: The Data Store Type – The largest factor for distinguishing between databases and file-based scale-out system utilization is the data profile. The latter is best for data that fits the loose label of 'unstructured' (or semi-structured) data, while more traditional data -- and smaller volumes of all data -- still belong in a relational database. • Decision #2: Data Store Placement – You must also decide where to place your data store -- on-premises or in the cloud (and which cloud). In the past, the only clear choice for most organizations was on-premises data. However, the costs of scale are gnawing away at the notion that this remains the best approach for a data platform. For more on why databases are moving to the cloud, please read this article. • Decision #3: The Workload Architecture – You must keep in mind the distinction between operational and analytical workloads. Short transactional requests and more complex (often longer) analytics requests demand different architectures. Analytics databases, though quite diverse, are the preferred platforms for the analytics workload. • Decision #4: The Node Architecture – General purpose to premium, and HDD to flash to all-memory storage. Volume types, read/write, cache options. Balance CPU and storage on the nodes. Levels of IOPS. Levels of management.
  • 11. Data Warehouses, Data Marts, Data Lakes, Big Data
  • 12. Data Warehousing • Data Warehouses (still) have a lower total cost of ownership than data marts • A data warehouse is a SHARED platform – Build once, use many – Access at Data Warehouse – Access by creating a mart off the DW • Still A LOT cheaper than building from scratch • “… a subject-oriented, integrated, non-volatile, time-variant collection of data, organized to support management needs.” (Bill Inmon)
  • 13. On Relational • Consistency • Transactions • Partitioning • Arrays • Inheritance • UNION • Columnar • Storage Fluidity • Custom data types • Built in graph capabilities • Caching
  • 14. The Analytic Data Ecosystem (diagram labels: Data Lake, DW, DM, DM)
  • 15. Data Warehouses Have Flavors ● The Customer Experience Transformation Data Warehouse focuses on customer attributes and touchpoints to improve the value of customers. ● The Asset Maximization with IoT Data Warehouse deals with the high volume of edge data tracking the physical assets of the organization. ● The Operational Extension Data Warehouse supports company operations directly with real-time analytics. ● The Risk Management Data Warehouse supports the ever-growing compliance and reporting requirements and corporate risk. ● The Finance Modernization Data Warehouse handles the voluminous financial reporting and ensures the bottom line is considered in every aspect of the business. ● The Product Innovation Data Warehouse delivers all product-related information into the decisions of the product life cycle.
  • 16. Required for Modern Analytics • In-database analytics • In-memory capabilities • Columnar orientation • Modern programming languages • New data types
  • 18. Object Storage Instances • Object Storage instances/clusters have local storage, i.e., on the physical drives mounted to the instances themselves, that is HDFS and Hive • Object Storage technologies access their cloud vendor’s respective cloud storage: – Amazon EMR accesses S3 – Dataproc accesses Google Cloud Storage – HDI accesses Azure Data Lake Storage Gen2 • Local storage is used by the Object Storage platform for housekeeping
  • 19. Data Lakes with Analytic Access Pricing • Pair a lake with an analytical engine that charges only for what you use • If you have a ton of data that can sit in cold storage and only needs to be accessed or analyzed occasionally, store it in Amazon S3/Azure Blob Storage/Google Cloud Storage – Use a database (on-premises or in the cloud) that can create external tables that point at the storage – Analysts can query directly against it, or draw down a subset for some deeper/intensive analysis – The GB/month storage fee plus data transfer/egress fees will be much cheaper than leaving it in a data warehouse
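The cost argument on this slide (storage fee plus egress fees versus keeping the data in a warehouse) can be sketched as simple arithmetic. This is only an illustration: the function names are hypothetical and every rate below is a made-up placeholder, not a real vendor price, so substitute your own contract figures.

```python
def monthly_lake_cost(stored_gb, egress_gb, storage_rate, egress_rate):
    """Cold data in object storage: GB/month storage fee plus egress fees."""
    return stored_gb * storage_rate + egress_gb * egress_rate


def monthly_warehouse_cost(stored_gb, warehouse_rate):
    """Same data kept in a data warehouse, priced per GB per month."""
    return stored_gb * warehouse_rate


# Placeholder inputs (assumptions for illustration only).
STORED_GB = 50_000          # data volume sitting in cold storage
EGRESS_GB = 500             # subset drawn down for deeper analysis each month
STORAGE_RATE = 0.02         # hypothetical $/GB/month for object storage
EGRESS_RATE = 0.10          # hypothetical $/GB transferred out
WAREHOUSE_RATE = 0.25       # hypothetical $/GB/month inside a warehouse

lake = monthly_lake_cost(STORED_GB, EGRESS_GB, STORAGE_RATE, EGRESS_RATE)
warehouse = monthly_warehouse_cost(STORED_GB, WAREHOUSE_RATE)
print(f"lake: {lake:.2f}  warehouse: {warehouse:.2f}")
```

The comparison flips if the data is read heavily, because egress grows with every draw-down while the warehouse fee in this sketch does not; that is why the slide limits the advice to data that is accessed only occasionally.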
  • 20. Analytics Reference Architecture (diagram labels: Logs (Apps, Web, Devices); User tracking; Operational Metrics; Offload data; Raw Data Topics (JSON, AVRO); Processed Data Topics; Sensors and/or Transactional/Context Data; OLTP/ODS; ETL or EL with T in Spark; Batch; Low Latency Applications; Files; In-database analytics; Reach through or ETL/ELT or Stream Processing; Data Warehouse)
  • 21. Notes on the Data Warehouse of the Future • A more achievable separation of compute and storage • Compute resources (Map/Reduce, Hive, Spark, etc.) can be taken down, scaled up or out, or interchanged without data movement • Storage can be centralized, but compute can be distributed • Major players have mechanisms to ensure consistency and achieve ACID-like compliance • Remote data replication to ensure redundancy and recovery • Most of the query execution is processing time, not data transport, so if cloud compute and storage are in the same cloud vendor region, performance is hardly impacted
  • 23. Disruption Vectors • Robustness of SQL • Built-in optimization • On-the-fly elasticity • Dynamic Environment Adaptation • Separation of compute from storage • Support for diverse data
  • 24. Cloud Analytic Databases in the Enterprise • Can be used for test/dev or prod; disaster recovery; bursting • CAPEX accounting • The cloud now offers attractive options with better economics, such as pay-as-you-go which is easier to justify and budget, better logistics (streamlined administration and management), and better scale (elasticity and the ability to expand a cluster within minutes). • While on-premises-first development brings a robust database to the table, not all functions are always part of the cloud solution and not all of the organizations behind them have made the transition to cloud. • Data gravity in the cloud.
  • 25. Performance • Managed cloud databases are the winner for performance • Querying cloud storage directly is inefficient, and bringing subsets of data down for on-premises processing takes time and costs egress fees • Performance testing on Hadoop engines like Hive, Spark, and Impala has shown improvements in performance, but they still lag significantly behind the performance and power of a solid relational cloud database/data warehouse
  • 26. Administration • Managed cloud databases win this category too. • Many of the latest and greatest fully-managed cloud database platforms are streamlining and subsuming much of the DBA work these days. Things like indexes, constraints, partitioning, and other DBA-level performance tuning are fading away. • Second is cloud storage, because of its very simple architecture. • Last place in Administration is Hadoop. You will still need expertise to help diagnose why Spark executors fail or Hive throws an exception or why troublesome queries never finish.
  • 27. However… Why Big Data Technologies for Big Data • New Data Types • Schemaless • Relaxed ACID • Faster, Less Expensive Provisioning • Programmer Freedoms • Fault-Tolerant Redundancy • Scale Out (to Webscale) • Automatic Sharding
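The "Automatic Sharding" and "Scale Out" points above can be illustrated with a minimal hash-sharding sketch. It shows one common idea (a stable hash of the key picks the shard); the function name and the shard count are hypothetical, and real platforms may instead use range partitioning or consistent hashing, which this sketch does not implement.

```python
import hashlib
from collections import Counter


def shard_for(key: str, num_shards: int) -> int:
    """Map a record key to a shard index using a stable hash."""
    if num_shards <= 0:
        raise ValueError("num_shards must be positive")
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    # Use the first 8 bytes as an integer, then wrap into the shard range.
    return int.from_bytes(digest[:8], "big") % num_shards


# Hypothetical customer keys spread over 4 shards.
keys = [f"customer-{i}" for i in range(1000)]
placement = Counter(shard_for(k, 4) for k in keys)
print(dict(placement))
```

A plain modulo scheme like this remaps most keys when the shard count changes, which is one reason production systems tend to layer something more elaborate on top.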
  • 28. Data Lake Data Scientist Workbench and Data Warehouse Staging (diagram labels: OLTP Systems; ERP, CRM, Supply Chain, MDM, …; Data Lake; Data Scientists; Data Warehouse; Data Mart; Stream or Batch Updates; DI; Real-Time, Event-Driven Apps)
  • 29. HDFS vs Cloud Storage • Cloud Storage is more scalable and persistent • Cloud Storage is backed up and supports compression, making the cost of big data less • HDFS has better query performance • Cloud Storage has object size and single PUT limits that need workarounds
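One usual workaround for a single-PUT size limit is to upload a large object in parts. The sketch below only computes the byte ranges for the parts; the function name is hypothetical, the part size is an arbitrary placeholder, and the actual upload calls (which differ by vendor SDK) are deliberately left out.

```python
def part_ranges(total_bytes: int, part_bytes: int) -> list[tuple[int, int]]:
    """Split an object of total_bytes into half-open (start, end) byte ranges."""
    if total_bytes < 0:
        raise ValueError("total_bytes must not be negative")
    if part_bytes <= 0:
        raise ValueError("part_bytes must be positive")
    return [
        (start, min(start + part_bytes, total_bytes))
        for start in range(0, total_bytes, part_bytes)
    ]


# Placeholder sizes: a 10-byte object split into 4-byte parts.
for number, (start, end) in enumerate(part_ranges(10, 4), start=1):
    print(f"part {number}: bytes {start}..{end - 1}")
```

Each range would then be sent as its own request and the parts assembled by the storage service; check your vendor's documented minimum and maximum part sizes before choosing `part_bytes`.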
  • 30. Leveraging Cloud Storage for Data Lakes • A more achievable separation of compute and storage • Compute resources (Map/Reduce, Hive, Spark, etc.) can be taken down, scaled up or out, or interchanged without data movement • Storage can be centralized, but compute can be distributed • Major players have mechanisms to ensure consistency and achieve ACID-like compliance for remote data changes • Some vendors also have remote data replication to ensure redundancy and recovery • Most of the query execution is processing time, not data transport, so if cloud compute and storage are in the same cloud vendor region, performance is hardly impacted
  • 32. How to Identify a Graph Workload • Workload is identified by “network, hierarchy, tree, ancestry, structure” words • You are planning to use the relational performance tricks • Your queries will be about pathing • You are limiting queries by their complexity • A quick POC with a graph database impresses • You are looking for “non-obvious” patterns in the data
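A "pathing" query of the kind listed above can be sketched as a breadth-first search over an adjacency list. This is a plain-Python illustration of the query shape, not how any particular graph database executes it; the `reports` hierarchy and the function name are hypothetical.

```python
from collections import deque


def shortest_path(graph, start, goal):
    """Return the shortest list of nodes from start to goal, or None."""
    if start == goal:
        return [start]
    previous = {start: None}          # node -> node we reached it from
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in graph.get(node, ()):
            if neighbor in previous:
                continue              # already visited
            previous[neighbor] = node
            if neighbor == goal:
                # Walk the back-pointers to rebuild the path.
                path = [goal]
                while previous[path[-1]] is not None:
                    path.append(previous[path[-1]])
                return path[::-1]
            queue.append(neighbor)
    return None


# Hypothetical reporting hierarchy (directed edges: manager -> report).
reports = {
    "ceo": ["vp_a", "vp_b"],
    "vp_a": ["mgr_1"],
    "vp_b": ["mgr_2"],
    "mgr_1": ["analyst"],
}
print(shortest_path(reports, "ceo", "analyst"))
```

In a relational model the same question typically needs one self-join per hop or a recursive query, which is the kind of "performance trick" the slide treats as a sign that the workload is really a graph workload.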
  • 35. GPU Databases • A GPU Database performs at least some operations using the GPU • Uses SQL • Uses each GPU's local memory store, which is used as a data cache that operates many times faster than the CPU cache or main memory itself
  • 36. Operlytical Databases • Combination row-based for transactions and column-based for analytics • Can process both orders and machine learning models simultaneously with fast performance and reduced complexity
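The row-based versus column-based distinction can be shown with a toy in-memory example: a row layout makes it cheap to fetch or update one whole order, while a column layout makes it cheap to scan one attribute across all orders. The data and function names are hypothetical, and the sketch assumes every row has the same fields; it says nothing about how a real engine keeps the two layouts in sync.

```python
def to_columns(rows):
    """Pivot a list of row dicts into a dict of column lists."""
    columns = {}
    for row in rows:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    return columns


def find_order(rows, order_id):
    """Transactional access: fetch one complete row by key."""
    for row in rows:
        if row["order_id"] == order_id:
            return row
    return None


# Hypothetical orders in row layout (one dict per transaction).
order_rows = [
    {"order_id": 1, "customer": "a", "amount": 20.0},
    {"order_id": 2, "customer": "b", "amount": 35.5},
    {"order_id": 3, "customer": "a", "amount": 12.5},
]

order_columns = to_columns(order_rows)

# Analytic access: aggregate a single column without touching the others.
total_amount = sum(order_columns["amount"])
print(find_order(order_rows, 2), total_amount)
```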
  • 37. Decentralized, decoupled, distributed architectures • Data Infrastructure as a Platform with complete domain mastery as nodes • Enterprise master data management • Solving the federation challenge; nobody has done it yet, someone will and it could be really big • Moving away from conventional integration and its technical debt and effort • Containerization, microservice databases, and embedded databases as part of the analytics environment • Integration speed uptake and maturity eliminating redundant data stores • Unification of batch and streaming and tools