Data Regions: Modernizing your company's data ecosystem
DataWorks Summit/Hadoop Summit
1.
Copyright © 2015, SAS Institute Inc. All rights reserved.
Data Regions: Modernizing Your Company’s Data Ecosystem
Evan Levy
Vice President, Data Management Programs, SAS
EvanJayLevy
2.
Copyright © 2016, SAS Institute Inc. All rights reserved.
A 20 Year Old Paradigm
The Change Data Perspective

Traditional Assumption
• All data originates from internal systems
• The company runs on OLTP systems
• Users have the BI/DW to address their reporting and analysis needs
• The Data Warehouse is the data source

Today’s Reality
• Most data is internal; >35% is external
• Business Operations rely on OLTP, Data, and Analytics
• We have multiple analytical systems: data mining, exploration, sandboxes, etc.
• Users require data from many sources (and the quantity is growing)
3.
Data Challenges…
• “Why is all the data put into the warehouse? Only 3 people need to use the data.”
• “Can you tell me what data we purchased from outside vendors?”
• “Why will it take you 30 days to load data? I can cut and paste it into my server in 4 minutes.”
• “We have to standardize business terminology. We’ve learned that data governance is critical.”
• “Why do I have to work around the ‘infrastructure’? Shouldn’t it be built for my needs?”
• “You send me a file from SalesForce every month, and the layout changes every month. And you don’t tell me.”
• “We have data all over (systems, the cloud, external apps, etc.). Why don’t we have a catalog of the sources?”
• “Finance wants all data reconciled. I can’t wait. Why do I have to suffer from their requirements?”
4.
Data Characteristics
• Access
• Domain
• Structure
• Audience
• Integrity
5.
Data Characteristics: Audience
The individual user (and their skills and data needs)
• Report users: Reviewing data about known situations
• DW Developers: Using ETL tools to retrieve and load data
• Analytic Developers: Building analytical models to manipulate known data
• Data Scientists: Analyzing any available data to identify new trends
• BI Developers: Building reports using structured data
• Business Analysts: Analyzing data to form a new hypothesis
• Application Developers: Developing code to navigate any available data source
6.
Data Characteristics: Access
The methods, interfaces, and tools used to access the data
• A business analyst running a report on DBMS tables
• Custom code navigating a flat file (to retrieve specific values)
• Code calling platform-specific APIs for data access
• A cloud application sending transactions
• SQL
• An application listening for / receiving event streams
• A data scientist playing with data in a sandbox
7.
Data Characteristics: Structure
The structure and organization of the data content
• Structured Data
• Semi Structured Data
• Unstructured Data
8.
Data Characteristics: Domain
The business context for data usage
• Enterprise
• Business Unit
• Organization
• Project
• Individual
9.
Data Characteristics: Integrity
The format, typing, and accuracy of the data

Example: a raw web server log entry
203.93.245.97 - oracleuser [28/Sep/2000:23:59:07 -0700] "GET /files/search/search.jsp?s=driver&a=10 HTTP/1.0" 200 2374 "http://datawarehouse.oracle.co/contents.htm" "Mozilla/4.7 [en] (WinNT; I)"

The same entry as a formatted record
Client        John Smith
Username      Oracleuser
Request Date  9/28/2000
Request Time  23:59:07
Status Code   OK
Browser       Netscape

Example: a raw spreadsheet file
P;ECalibri;M220;SB;L10 P;ECalibri;M220;L11 P;ECalibri;M220;SI;L24 P;ECalibri;M220;SB;L9
P;ECalibri;M220;L10 P;ESegoe UI;M200;L9 P;ESegoe UI;M200;SB;L9 P;ECalibri;M180;L9
F;P0;DG0G8;M300 B;Y12;X5;D0 0 11 4 O;L;D;V0;K47;G100 0.001 F;M495;R1
F;SM24;Y1;X1 C;K"name" F;SM24;X2 C;K"Shares" F;SM24;X3 C;K"Quote/ Price"
F;SM24;X4 C;K"cost/ share" F;SM24;X5 C;K"total cost"
F;SM24;Y2;X1 C;K"aapl" F;P4;FF2G;SM24;X2 C;K1454.4024 F;SM24;X3 C;K126.85
F;SM24;X4 C;K79.006952 F;P4;FF2G;SM24;X5 C;K114907.9
F;SM24;Y3;X1 C;K"axp" F;P4;FF2G;SM24;X2 C;K1454.4108 F;SM24;X3 C;K79.27 F;SM24;X4 …

The same file as a formatted table
name    Shares     Quote/Price   cost/share   total cost
aapl    1,454.40   126.85        79.006952    114,907.90
axp     1,454.41   79.27         84.671889    123,147.71
bmy     3,666.51   63.95         43.25259     158,586.21
brk.b   1,000      143.46        119.3527     119,352.70
celg    1,000      116.44        102.47094    102,470.94
chl     500        71.4          71.4179      35,708.95
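The step from a raw log entry to a typed, formatted record can be sketched in Python. This is only an illustration of the "format and typing" part of integrity, not something taken from the deck: it assumes the entry follows the common Combined Log Format, and the function name `parse_log_line` and the output field names are hypothetical. It does not try to derive the client name or browser name shown on the slide, since those need lookups the log line does not contain.

```python
import re
from datetime import datetime
from http import HTTPStatus

# Assumed layout (Combined Log Format):
# host ident user [time] "request" status size "referer" "agent"
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def parse_log_line(line: str) -> dict:
    """Turn one raw log line into a record with typed fields (hypothetical helper)."""
    match = LOG_PATTERN.match(line)
    if match is None:
        raise ValueError("line is not in the expected log format")
    # Parse the bracketed timestamp, including its UTC offset.
    when = datetime.strptime(match["time"], "%d/%b/%Y:%H:%M:%S %z")
    status = int(match["status"])
    return {
        "username": match["user"],
        "request_date": when.date(),
        "request_time": when.time(),
        "status_code": status,
        "status_text": HTTPStatus(status).phrase,  # e.g. 200 -> "OK"
        "user_agent": match["agent"],
    }


raw = (
    '203.93.245.97 - oracleuser [28/Sep/2000:23:59:07 -0700] '
    '"GET /files/search/search.jsp?s=driver&a=10 HTTP/1.0" 200 2374 '
    '"http://datawarehouse.oracle.co/contents.htm" "Mozilla/4.7 [en] (WinNT; I)"'
)
record = parse_log_line(raw)
```

The raw line and the record carry the same facts; the difference is that the record's fields have names and types a report or query tool can rely on.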
10.
The 5 Characteristics of Data
• Access
• Domain
• Structure
• Audience
• Integrity
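One way to make the five characteristics concrete is to record them as a small data structure. The sketch below is an assumption about how that might look, not a model from the deck: the class names `DataProfile`, `Structure`, and `Domain`, the `describe` helper, and the example values are all hypothetical. The enum members reuse the structure and domain levels listed on the earlier slides.

```python
from dataclasses import dataclass
from enum import Enum


class Structure(Enum):
    STRUCTURED = "structured"
    SEMI_STRUCTURED = "semi-structured"
    UNSTRUCTURED = "unstructured"


class Domain(Enum):
    ENTERPRISE = "enterprise"
    BUSINESS_UNIT = "business unit"
    ORGANIZATION = "organization"
    PROJECT = "project"
    INDIVIDUAL = "individual"


@dataclass(frozen=True)
class DataProfile:
    """Hypothetical record of the five characteristics for one data need."""
    audience: str         # who uses the data
    access: str           # how they reach it
    structure: Structure  # how the content is organized
    domain: Domain        # business context for the usage
    integrity: str        # expected format, typing, and accuracy


def describe(profile: DataProfile) -> str:
    """Render a profile as a one-line summary."""
    return (
        f"{profile.audience} via {profile.access}: "
        f"{profile.structure.value} data, {profile.domain.value} domain, "
        f"integrity: {profile.integrity}"
    )


# Example values are illustrative only.
analyst_need = DataProfile(
    audience="Business Analyst",
    access="SQL",
    structure=Structure.STRUCTURED,
    domain=Domain.BUSINESS_UNIT,
    integrity="standardized and corrected",
)
summary = describe(analyst_need)
```

Writing a profile down this way makes it easy to compare two data needs field by field.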
11.
Challenging the Existing Data Paradigm
• Support numerous new data sources
• Establish a shared source staging area
• Allow “trial & error” analysis for all users
• Support Self Service Data (ETL, report, analysis, etc.)
• Support different levels of data acceptance
12.
Data Regions
[Diagram: inbound data flowing into the data regions]

Inbound Data
• Internal Applications
• Cloud Applications
• Data Streams
• Files
• Services
• Messages

Regions
• Source Onboarding
• Source Data Repository
• Sandbox
• Reporting & BI
• Enterprise View
• Data Exploration
• Advanced Analytics & Modeling
13.
Data Regions: Addressing an Enterprise Data Need
[Diagram: the Data Regions layout from slide 12]
• Create an environment that fits user needs (not IT convenience)
• Support data onboarding and distribution as a production need
• Support a diverse set of data usage needs
• Address the complexities of data movement
• Reduce resource/skill overlap across the company
14.
Data Regions: Source Onboarding
• Manages the delivery of data from internal & external sources
• Holds data until acceptance is complete; data is then moved to the Source Data Repository
• Centralized support for sophisticated data capture methods (ESP, 3rd party data delivery, API/messaging, etc.)
• Productionalizes source data capture, identification and sharing

Audience: Source Onboarding developers only; receives data for the Source Data Repository
Access: Supports multiple delivery methods: transactions, messages, bulk formats
Structure: Data layout based on source system; likely dynamic & volatile
Domain: N/A. This detail is implicit with the data source and the supplier.
Integrity: N/A. Data details are defined by the data supplier.
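The "hold until accepted, then move" behavior described above can be sketched as a tiny staging area. This is a minimal illustration under stated assumptions, not the deck's design: the classes `Delivery` and `SourceOnboarding`, the list standing in for the Source Data Repository, and the `non_empty` acceptance check are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Delivery:
    """One delivery from a source (hypothetical shape)."""
    source: str
    payload: bytes
    accepted: bool = False


@dataclass
class SourceOnboarding:
    """Holds deliveries until accepted, then hands them to the repository."""
    staged: list = field(default_factory=list)
    repository: list = field(default_factory=list)  # stand-in for the Source Data Repository

    def receive(self, delivery: Delivery) -> None:
        # New data waits in staging; nothing downstream can see it yet.
        self.staged.append(delivery)

    def accept(self, delivery: Delivery, check: Callable[[Delivery], bool]) -> bool:
        # Move the delivery onward only if the acceptance check passes.
        if not check(delivery):
            return False
        delivery.accepted = True
        self.staged.remove(delivery)
        self.repository.append(delivery)
        return True


def non_empty(delivery: Delivery) -> bool:
    """Illustrative acceptance rule: the payload must contain data."""
    return len(delivery.payload) > 0


onboarding = SourceOnboarding()
good = Delivery(source="vendor_feed", payload=b"id,amount\n1,9.99\n")
empty = Delivery(source="vendor_feed", payload=b"")
onboarding.receive(good)
onboarding.receive(empty)
good_ok = onboarding.accept(good, non_empty)
empty_ok = onboarding.accept(empty, non_empty)
```

After these calls the accepted delivery sits in the repository stand-in, while the rejected one remains staged for follow-up with the supplier.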
15.
Data Regions: Source Data Repository
• Stores and retains all source data content; reduces enterprise storage requirements
• Establishes a centralized registry of available data sources
• Reflects a defined data layout (independent of source changes)
• Alleviates developers’ need to learn data navigation, layout, and naming conventions on dozens of source systems

Audience: Data integration developers (DW, Application, Data Scientists, etc.)
Access: Usually file oriented (transaction and other access based on situation)
Structure: Company-centric, documented layout; includes structured & unstructured
Domain: N/A. Data reflects source.
Integrity: Company-centric format; data quality and accuracy not addressed
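A registry plus a stable, company-centric layout can be approximated with a per-source column mapping. The sketch below is an assumption about one way to do this, not the deck's implementation: `SOURCE_REGISTRY`, `to_company_layout`, and every source and column name are hypothetical. Values pass through unchanged, in line with the note that data quality and accuracy are not addressed in this region.

```python
# Hypothetical registry: one entry per known source, mapping the supplier's
# column names to the company-centric layout.
SOURCE_REGISTRY = {
    "crm_export": {"Acct Name": "customer_name", "Amt": "amount"},
    "vendor_feed": {"client": "customer_name", "total": "amount"},
}


def to_company_layout(source: str, row: dict) -> dict:
    """Rename a source row's columns to the documented company layout."""
    mapping = SOURCE_REGISTRY[source]
    unknown = set(row) - set(mapping)
    if unknown:
        # A layout change at the source surfaces here instead of downstream.
        raise ValueError(f"unmapped columns from {source}: {sorted(unknown)}")
    # Values are copied as-is; only the names change.
    return {mapping[column]: value for column, value in row.items()}


crm_row = to_company_layout("crm_export", {"Acct Name": "Acme", "Amt": "12.50"})
vendor_row = to_company_layout("vendor_feed", {"client": "Acme", "total": "12.50"})
available_sources = sorted(SOURCE_REGISTRY)
```

When a supplier renames a column, only that source's mapping needs updating; consumers keep reading `customer_name` and `amount`.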
16.
Data Regions: Data Exploration
• Supports one-off, in-depth business analysis using any data
  ─ Environment is permanent but resource usage is very transient
  ─ Does not support production application access or deployment
• Often a general purpose platform that can support numerous technologies (Big Data, files, RDBMS, advanced analytics, etc.)
• A walled-off, protected data scientist-centric environment

Audience: Data Scientists & Analytics Developers (unable to be supported by sandbox)
Access: All access methods due to the “from scratch” nature of environment
Structure: All data layouts (unstructured likely due to focus on new concept development)
Domain: Typically enterprise or line of business level
Integrity: Data transformed/standardized to streamline exploration efforts (often ignored for new or unknown data sources)
17.
Data Regions: Enterprise View
• Contains multiple integrated subject areas (w/ long-term history)
• Content reflects enterprise trusted (and corrected) data
• Includes metadata (terms, definitions, lineage, etc.)
• Supports query processing and data provisioning
─ Online end-user queries and reporting
─ Data provisioning to analytical and transactional systems
─ Content continually updated (where possible)
Audience: All users. Most access will occur via query tools or data manipulation/ETL tools
Access: Usually query-based access (w/ existing tools). Unstructured requires APIs
Structure: Data is usually structured (unstructured requires special tools/extensions)
Domain: Enterprise level. Other domains may use content for provisioning purposes
Integrity: Reflective of enterprise terminology and value standards
18.
Data Regions: Sandbox
• Allows users to extend their analysis with custom data
─ Supports structured data and queries using existing tools/technologies
─ Focused on supporting additional (external) data
• Environment is temporary; does not support production
─ Walled-off environment; reports or data not distributable
• Allows for business-level data discovery and exploration
─ Supports one-off user data needs
Audience: Advanced business users. Requires DBMS query and data integration skills
Access: Data is accessible via SQL/table environment
Structure: Data content is structured and RDBMS oriented (goal is data variety)
Domain: Any/all domains (enterprise to individual)
Integrity: Enterprise data is standardized/corrected. Other data must be addressed by user
19.
Data Regions: Reporting and Business Intelligence
• Supports defined reporting and ad hoc analysis (departmental data marts)
• Supports an application- or tool-centric view of data
─ Simplifies tool access and data manipulation, or
─ Reflects unique business (organization) view of data details
• Requires additional technical staff resources
─ ETL processing for additional sources, aggregates, hierarchies, etc.
─ Query and usage support for non-enterprise data
Audience: Business users focused on using standard reports and content
Access: Usually SQL-based access. Some data may be tool-centric (e.g. OLAP cubes)
Structure: Usually structured data, reflecting rows and columns
Domain: Likely to use enterprise data. Additional data may reflect different structure or domain as needed
Integrity: Enterprise data is standardized/corrected. Other data must be addressed by user
20.
Data Regions: Advanced Analytics & Modeling
• A processing environment that can support advanced analytics
─ Typically general purpose processing platforms with inexpensive directly attached storage
─ Data is structured and often stored in highly denormalized structures
─ Usually driven by a specialized tool or language
• Typically small, high-value user audience
• Production-supported environment. Data & results are distributed
Audience: Highly skilled technical staff (data scientists, developers with advanced analysis skills)
Access: Data accessed via specialized tools using standard and custom access methods
Structure: Data is usually structured; may process unstructured data into structured content
Domain: Typically enterprise-level data. Business drivers are often specific to organization
Integrity: Data is often cleansed and standardized
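Each region slide uses the same five attributes (Audience, Access, Structure, Domain, Integrity), so a region can be captured as a small record and compared with another. The sketch below is one possible encoding, with values abbreviated from the Sandbox and Enterprise View slides; the `RegionProfile` class and `differing_attributes` helper are illustrative names, not part of the presentation.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class RegionProfile:
    """The five attributes each region slide uses, plus the region name."""
    name: str
    audience: str
    access: str
    structure: str
    domain: str
    integrity: str


# Values abbreviated from the Sandbox and Enterprise View slides.
SANDBOX = RegionProfile(
    name="Sandbox",
    audience="Advanced business users",
    access="SQL/table environment",
    structure="Structured, RDBMS oriented",
    domain="Any/all domains",
    integrity="Enterprise data standardized; other data addressed by user",
)
ENTERPRISE_VIEW = RegionProfile(
    name="Enterprise View",
    audience="All users",
    access="Query-based access",
    structure="Usually structured",
    domain="Enterprise level",
    integrity="Enterprise terminology and value standards",
)


def differing_attributes(a: RegionProfile, b: RegionProfile) -> list:
    """Return the attribute names (excluding name) on which two regions differ."""
    return [f.name for f in fields(a)
            if f.name != "name" and getattr(a, f.name) != getattr(b, f.name)]
```

Comparing profiles this way makes it explicit why two regions cannot simply be merged: they serve different audiences with different access and integrity expectations.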
21.
Data Services
[Diagram] Regions: Source Data Repository, Source Onboarding, Sandbox, Reporting & BI, Enterprise View, Data Exploration, Advanced Analytics & Modeling
[Diagram] Services: Data Transformation, Data Quality, Data Governance, Metadata
22.
Getting Started, Moving Forward…
• Evaluate the diversity of audiences and domains
− Understand the unique combinations – those dictate the complexity of your environment
− Review the external data that is already in use
• Extend your environment one region at a time
− Focus on adding (or remediating) regions based on business need
• Sharing data is not a courtesy – it’s a production need
− Data provisioning and integration is a costly activity; it should be addressed with “economies-of-scale” methods
− Establishing repositories (with card catalogs) to provide “raw” and “approved” data is a necessity
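One rough way to evaluate the diversity of audiences and domains is to list current data consumers and count the unique audience/domain pairs. The sketch below assumes a hand-built inventory; the entries and field names are hypothetical and should be replaced with your own survey.

```python
from collections import Counter


def audience_domain_combinations(consumers):
    """Count consumers per unique (audience, domain) pair."""
    return Counter((c["audience"], c["domain"]) for c in consumers)


# Hypothetical inventory of data consumers.
inventory = [
    {"name": "finance dashboards", "audience": "business users", "domain": "department"},
    {"name": "churn model", "audience": "data scientists", "domain": "enterprise"},
    {"name": "sales reports", "audience": "business users", "domain": "department"},
]

combos = audience_domain_combinations(inventory)
print(len(combos), "unique audience/domain combinations")
```

The number of distinct pairs, rather than the number of consumers, is the figure that indicates how many regions the environment may need.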
23.
THANKS!
www.EvanJLevy.com
@EvanJayLevy
Evan.Levy@SAS.com