Copyright © 2015, SAS Institute Inc. All rights reserved.
DATA MANAGEMENT FOR
HIGH-PERFORMANCE ANALYTICS
DAN SOCEANU
SENIOR SOLUTIONS ARCHITECT
DATA MANAGEMENT
BEFORE WE BEGIN SAS ACKNOWLEDGEMENTS
Ron Agresta, Product Director, Data Management
Lisa Dodson, Global Technology Practice Manager, Data Management
David Pope, Pre-Sales Manager, Energy & Manufacturing
DATA MANAGEMENT WHY ARE WE HERE?
• Data is rarely fit for analytic
purposes
• End-users are overwhelmed
o What data do I use?
o How do I load data?
o How can I find only the data I
need?
• Real-time needs
• The rise of “self-service
analytics”
CAN YOU LEVERAGE OPEN SOURCE
ANALYTICS?
CAN YOU
SCALE YOUR
DATA AND YOUR
ANALYTICS?
DO YOU GROW
A CULTURE OF
INNOVATION?
CAN YOU ANALYZE ALL
OF YOUR DATA?
CAN YOU MODERNIZE
YOUR LEGACY BI
STRATEGY?
Data Management for
High Performance
Analytics
IoT
Operational
Unstructured
Web
Text
Optimization
Forecasting
Mining
High Performance
Analytics
Data Sources
DATA MANAGEMENT BRIDGING THE GAP
Data Access Tier
Analytical Tier
Visualization Tier
Data Preparation Tier
Visualization
Analytics
Preparation
Access
DATA MANAGEMENT
CONVERGENCE OF DATA PREP, ANALYTICAL
PROCESSING AND PROVISIONING
DATA MANAGEMENT DATA FLOW FOR HIGH PERFORMANCE ANALYTICS
Data Management
Data
Warehouse
Dynamic
Reporting
Read
ETL
Dynamic
Visualization
ACCESS
Data Management
Analytical
Data
Warehouse
Data Monitoring
Exploration
Quality
Integration
MDM
Data
Marts
Model
Development
Operational
MQ
XML
Cloud
SOURCES
Repository
High
Performance
Analytics
ANALYTICS HISTORICAL VS. ADVANCED
Descriptive
 What happened?
 When?
 Why?
• Frequency
Distributions
• Correlation Measures
• Event Study
• Association Rules
Predictive
 What will happen?
 When?
 Why?
 How does that affect us?
 What actions should I
take?
• Estimation & Forecasting
• Segmentation
• Optimization
ANALYTICS
HIGH-PERFORMANCE
ANALYTICS
SAS SOLUTIONS
SAS High-Performance Data Mining
Predictive models using thousands of variables to produce more accurate and timely insights
SAS High-Performance Econometrics
Analytical models using complete data, not just a subset
SAS High-Performance Optimization
Model and solve optimization problems that are very large or cumbersome to solve
SAS High-Performance Statistics
Statistical models using big data to produce more accurate and timely insights
SAS High-Performance Text Mining
Better understand communications and create new value from big text data
HIGH-PERFORMANCE
ANALYTICS
SAS ANALYTIC PROCESSING APPROACHES
Traditional
Move data from source to the SAS server, process it and write back results (single server or SAS
Grid Manager)
In-Database
Move SAS processing to the data source and allow SAS processing to occur under the control of
the source environment (e.g. relational database or Hadoop). The analytic code executes in the
database process.
In-memory “Alongside” the Database
Move SAS processing to the data source but allow a SAS process to run “alongside”. The analytic
processes and the database processes are co-located and share resources.
In-memory “Next to” the Database
Move data from source to a dedicated SAS environment for processing. Does not require making
a physical copy of the data before processing and, once the processing is complete, the data is
not required to be kept in the dedicated SAS environment. This separates the resources
associated with data storage & processing and the SAS advanced analytical processing.
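The difference between the traditional and in-database approaches can be sketched with a small stand-in. This is only an illustration: SQLite plays the role of the source database, and the `sales` table and both function names are hypothetical. It does not show how SAS itself pushes processing into a database.

```python
import sqlite3

# Stand-in source database (assumption: SQLite, in memory).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("East", 100.0), ("East", 50.0), ("West", 75.0)],
)

def traditional_totals(conn):
    """Traditional: move every row to the client, then aggregate there."""
    totals = {}
    for region, amount in conn.execute("SELECT region, amount FROM sales"):
        totals[region] = totals.get(region, 0.0) + amount
    return totals

def in_database_totals(conn):
    """In-database: push the aggregation to the source; only results move."""
    query = "SELECT region, SUM(amount) FROM sales GROUP BY region"
    return dict(conn.execute(query).fetchall())

print(traditional_totals(conn))
print(in_database_totals(conn))
```

Both functions return the same totals. What changes is how much data leaves the source system, which is the trade-off the four approaches above are weighing.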
DATA MANAGEMENT VS. DATA PREPARATION?
Business Need
• Support analytical methods for decision
making, use cases and required actions
Data Governance
• Gap assessment; people, process and
technology
• Auditability, traceability, automated rules,
monitoring, collaboration
Productivity
• Data preparation, provisioning, reporting
DATA MANAGEMENT
DATA MANAGEMENT DATA PREPARATION
Identify
• Profile
• Data types
• Numeric
• Character
• Contextual
• Cardinality
Access
• ETL
• Batch
• Real-time
• Latency
• Data Movement
• Connectivity
• Data Sources
Data Quality
• De-duplicate
• Standardize
• Missing values
• Imputation
• Enrich
• Binning
• Matching
• Identify
anomalies
Reshape
• Wide & flat
• Long & lean
• Transformation
logic
• Transpositions
• Frequency
analysis
• Appending data
• Partitioning
data
• Summarization
Metadata
• Lineage
• Semantic
glossary
• Data
relationships
• Impact analysis
• Hierarchy
management
• Collaboration
• Repeatability
• Entity
management
DATA MANAGEMENT THE ROLE OF DATA GOVERNANCE
Data Lifecycle
Reference and
Master Data
Data Security
Data
Architecture
Metadata Data Quality
Data
Administration
Data Warehousing
& BI/Analytics
DATA MANAGEMENT
Data Stewardship
Roles & Tasks
Decision-making Bodies
Guiding Principles
Program Objectives
Decision Rights
DATA GOVERNANCE
DG without DM = only an academic exercise
DM without DG = the continued culture of “I know a guy”
DATA MANAGEMENT THE IMPORTANCE OF DATA GOVERNANCE
POSITIONS ENTERPRISE DATA ISSUES AS CROSS-FUNCTIONAL
• Establishes guiding principles for data sharing
• Eliminates data ownership issues and “turf wars”
• Ensures appropriate stakeholders have a say in decision making
ESTABLISHES BUSINESS STAKEHOLDERS AS INFORMATION OWNERS
• Aligns data policy with business strategies and priorities
• Aligns data quality with business measures and acceptance
• Helps to identify ROI for data-related activity
FORMALIZES DATA STEWARDSHIP
• Clarifies accountability for data definitions, rules, and quality
• Ensures data is managed separately from applications
• Formalizes monitoring and measurement of critical data
FOSTERS IMPROVED ALIGNMENT BETWEEN BUSINESS AND IT
• Links IT-driven data management activities with business unit activity
PARADIGM SHIFT
DATA PREPARATION IS ABOUT THE
BUSINESS NEED & USE CASE
80% 20%
Identify Access Data Quality Reshape Metadata Business Use
DATA PREPARATION FIVE KEY FOCUS AREAS
DATA PREPARATION
Identify
•Profile
•Data types
•Numeric
•Character
•Contextual
•Cardinality
Access
•ETL
•Batch
•Real-time
•Latency
•Data Movement
•Connectivity
•Data Sources
Data Quality
•De-duplicate
•Standardize
•Missing values
•Imputation
•Enrich
•Binning
•Matching
•Identify
anomalies
Reshape
•Wide & flat
•Long & lean
•Transformation
logic
•Transpositions
•Frequency
analysis
•Appending data
•Partitioning data
•Summarization
Metadata
•Lineage
•Semantic
glossary
•Data
relationships
•Impact analysis
•Hierarchy
management
•Collaboration
•Repeatability
•Entity
management
IDENTIFY WHAT DO I HAVE AND HOW USEFUL IS IT?
Is my data
consistent?
Is my data
complete?
Is my data
highly
unique?
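The three profiling questions above can be sketched as simple column metrics. This is a minimal sketch: the `profile_column` helper, the pattern notation (9 for a digit, A for a letter) and the sample postal codes are assumptions made for illustration, not part of the original deck.

```python
from collections import Counter

def profile_column(values):
    """Return basic profile metrics for one column of raw values."""
    total = len(values)
    # Complete: values that are neither None nor blank.
    present = [v for v in values if v is not None and str(v).strip() != ""]
    # Unique: share of distinct values among the present ones.
    distinct = set(present)
    # Consistent: count value "shapes" (digits -> 9, letters -> A).
    patterns = Counter(
        "".join("9" if c.isdigit() else "A" if c.isalpha() else c for c in str(v))
        for v in present
    )
    return {
        "completeness": len(present) / total if total else 0.0,
        "uniqueness": len(distinct) / len(present) if present else 0.0,
        "patterns": dict(patterns),
    }

# Hypothetical sample: postal codes with gaps and mixed formats.
zip_codes = ["27513", "27513", None, "2751", "27601-1234", ""]
report = profile_column(zip_codes)
print(report)
```

A column with several competing patterns, as here, is a hint that standardization is needed before the data is used for analytics.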
IDENTIFY WHAT DO I HAVE AND HOW USEFUL IS IT?
Is my data
normal?
Is my data
linear?
What are the
associations in
the data?
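One rough way to approach the "normal?" and "linear?" questions is with skewness and Pearson correlation. The sketch below uses only the Python standard library; the function names and the sample figures are hypothetical, and a real assessment would usually add plots and formal tests.

```python
import math
import statistics

def skewness(xs):
    """Population skewness; values near 0 suggest a symmetric distribution."""
    mean = statistics.fmean(xs)
    sd = statistics.pstdev(xs)
    return sum(((x - mean) / sd) ** 3 for x in xs) / len(xs)

def pearson_r(xs, ys):
    """Pearson correlation; values near +1 or -1 suggest a linear relationship."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical samples: one skewed column, one linearly related pair.
spend = [10, 12, 11, 13, 12, 95]
visits = [1, 2, 3, 4, 5, 6]
revenue = [12, 22, 32, 42, 52, 62]

print(skewness(spend))             # strongly positive: long right tail
print(pearson_r(visits, revenue))  # close to 1: linear association
```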
ACCESS SO MANY DATA TYPES AND SOURCES
Access Excel SQLServer Oracle MySQL
Boolean Yes/No Bit Byte N/A Boolean
integer Number Int Number Int Int
float Number (single) Float Number Float Numeric
currency Currency Money NA NA Money
string NA Char Char Char Char
string Text VarChar VarChar VarChar VarChar
binary OLE Obj/Memo Binary/Varbinary/Image Long Raw Blob/Text Binary/Varbinary
DATA QUALITY THE FOUNDATION
• Standardization
• Parsing
• Casing
• Identification
• De-duplication
• “Fuzzy” matching
• Clustering
• Entity resolution
• Survivorship
• Gender Analysis
• Locale Guessing
• Address Verification
• Address Enrichment (geocoding)
Business
Logic &
Rules
DATA QUALITY FILLING IN THE GAPS AND STANDARDIZING
Standardizing
Text
De-duplication
Standardizing Numeric
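The three operations named on this slide can be sketched in a few lines. This is a simplified illustration: the helper names, the title-casing rule and the customer records are hypothetical, and z-scores are just one common way to standardize numeric values.

```python
import re
import statistics

def standardize_text(value):
    """Trim, collapse whitespace, and apply consistent casing."""
    return re.sub(r"\s+", " ", value).strip().title()

def deduplicate(records, key):
    """Keep the first record seen for each standardized key value."""
    seen, unique = set(), []
    for rec in records:
        k = standardize_text(rec[key])
        if k not in seen:
            seen.add(k)
            unique.append({**rec, key: k})
    return unique

def z_scores(xs):
    """Rescale numbers to mean 0 and (population) standard deviation 1."""
    mean, sd = statistics.fmean(xs), statistics.pstdev(xs)
    return [(x - mean) / sd for x in xs]

# Hypothetical records: two spellings of the same customer.
customers = [
    {"name": "  jane   DOE ", "balance": 100.0},
    {"name": "Jane Doe", "balance": 100.0},
    {"name": "john smith", "balance": 300.0},
]
clean = deduplicate(customers, "name")
scaled = z_scores([c["balance"] for c in clean])
print(clean, scaled)
```

Standardizing before de-duplicating matters here: the two "Jane Doe" records only collapse into one because their names are normalized first.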
FILLING IN THE GAPS AND STANDARDIZING
Dropping outliers
Grouping or binning data
DATA QUALITY
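Dropping outliers and binning can be sketched as follows. The 1.5 × IQR rule, the bin edges and the sample ages are assumptions chosen for illustration; the deck does not say which outlier rule or bin boundaries were used.

```python
import statistics

def drop_outliers_iqr(xs, k=1.5):
    """Keep values within k * IQR of the first and third quartiles."""
    q1, _, q3 = statistics.quantiles(xs, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [x for x in xs if low <= x <= high]

def bin_value(x, edges, labels):
    """Return the label of the first bin whose upper edge exceeds x."""
    for edge, label in zip(edges, labels):
        if x < edge:
            return label
    return labels[-1]  # values at or above the last edge

# Hypothetical ages; 240 is an obvious data-entry error.
ages = [22, 25, 31, 38, 44, 52, 67, 240]
kept = drop_outliers_iqr(ages)

edges = [30, 50]
labels = ["under 30", "30-49", "50+"]
binned = [bin_value(a, edges, labels) for a in kept]
print(kept, binned)
```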
RESHAPE FIT FOR PURPOSE?
Schema/view
Or
Flat Table?
Format of data
Data quality
dimensions?
RESHAPE FLATTENING THE DATA
• Efficient storage
• Fast retrieval
• Defined
schema
• WIDE tables /Time series data
• Iteration (build, test, repeat)
• Schema-less
RESHAPE SUMMARIZATION
Each product category will become its own row, with each
product purchased as its own distinct column.
RESHAPE TRANSPOSITION FOR DATA MINING
Add up the quantities for
each product purchased,
in each product category.
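The two reshape steps described on these slides, adding up quantities and then turning products into columns, can be sketched together. The purchase records and variable names below are hypothetical.

```python
from collections import defaultdict

# Hypothetical line items: (customer, category, product, quantity)
purchases = [
    ("C1", "Dairy", "Milk", 2),
    ("C1", "Dairy", "Milk", 1),
    ("C1", "Dairy", "Cheese", 1),
    ("C2", "Bakery", "Bread", 3),
    ("C2", "Dairy", "Milk", 4),
]

# Summarization: total quantity per (category, product), long and lean.
totals = defaultdict(int)
for _customer, category, product, qty in purchases:
    totals[(category, product)] += qty

# Transposition: one row per category, one column per product, wide and flat.
products = sorted({product for _, product in totals})
wide = {}
for (category, product), qty in totals.items():
    row = wide.setdefault(category, {p: 0 for p in products})
    row[product] = qty

for category, row in wide.items():
    print(category, row)
```

The wide layout puts everything about a category on one row, which is the shape many data mining procedures expect as input.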
METADATA MANAGE DATA HIERARCHIES AND RELATIONSHIPS
Customer
Types
Hierarchy
Coverage
Products
Financial
Accounts
Address
Inquiries
Product Party
Accounts
Transactions
Authorizations
Individual Organization
Inquiries
Loans
Terms
Collaterals
Ratings
External
Assets
METADATA ENTITY RESOLUTION
EMPLOYER_NAME_GRPID EMPLOYER_NAME = Name of the client employer (SOL0003n_Employer_Name) cnt
28296 ČESKOSLOVENSKÁ OBCHODNÍ BANKA, A. S 6
ČESKOSLOVENSKÁ OBCHODNÍ BANKA A.S. 182
ČSKOSLOVENSKÁ OBCHODNÍ BANKA A.S. 1
ČESKOSLOVENSKÁ OBCHODNÍ BANKA. A.S. 2
ČESKOSLOVENSKÁ OBCHODNÍ BANKA,A.S. 78
ČESKOSLOVENSKÁ OBCHODNÍ BANKA A. S. 9
ČESKOSLOVENSKÁ OBCHODNÍ BANKA ,A.S. 2
ČESKOSLOVENSKÁ OBCHODNÍ BANKA, A.S 6
ČESKOSLOVENSKÁ OBCHODNÍ BANKA, A.S . 1
ČESKOSLOVENSKÁ OBCHODNÍ BANKA, A.S. 717
ČESKOSLOVENSKUÁ OBCHODNÍ BANKA A.S. 1
ČESKOSLOVENSKÁ OBCHODNÍ BANKA A.S 1
ČESKOSLOVENSKÁ OBCHODNÍ BANKA, S.R.O. 3
ČESKOSLOVENSKÁ OBCHODNÍ BAŃKA, A.S. 1
ČESKOSLOVENSKÁ OBCHODNÍ BANKA, A. S. 587
ČESKOSLOVENSKÁOBCHODNÍBANKA, A.S. 1
ČESKOSLOVENSKÁ OBCHODNÍBANKA A.S. 1
ČESKOSLOVENSKÁ OBCHODNÍ BANLA 1
ČESKOSLOVENSKÁ OBCHODNÍ BÁNKA, A.S. 1
ČESKOSLOVENSKÁ OBCHODNÍ BANKA,A.S 2
ČESKOSLOVENSKÁ OBCHODNÍ BANKA 27
ČESKOSLOVENSKÁOBCHODNÍBANKA,A.S. 1
Example:
Entity Resolution
Employer Name
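A minimal sketch of how variants like those listed above might be grouped under one entity. The match-key rules (strip accents, punctuation, spaces and a trailing legal form), the 0.9 similarity threshold and the use of `difflib` are assumptions made for illustration. They are not the matching logic behind the slide, which is not described.

```python
import difflib
import re
import unicodedata

def match_key(name):
    """Build a comparison key: strip accents, punctuation, spaces, legal form."""
    text = unicodedata.normalize("NFKD", name)
    text = "".join(c for c in text if not unicodedata.combining(c)).upper()
    text = re.sub(r"[^A-Z]", "", text)      # drop spaces and punctuation
    return re.sub(r"(AS|SRO)$", "", text)   # drop trailing legal form (assumed rule)

def resolve(names, canonical, threshold=0.9):
    """Map each raw name to the canonical name when the keys are similar enough."""
    target = match_key(canonical)
    resolved = {}
    for name in names:
        score = difflib.SequenceMatcher(None, match_key(name), target).ratio()
        resolved[name] = canonical if score >= threshold else None
    return resolved

canonical = "ČESKOSLOVENSKÁ OBCHODNÍ BANKA, A.S."
variants = [
    "ČESKOSLOVENSKÁ OBCHODNÍ BANKA, A. S",
    "ČSKOSLOVENSKÁ OBCHODNÍ BANKA A.S.",
    "ČESKOSLOVENSKÁ OBCHODNÍ BANLA",
    "ČESKOSLOVENSKÁOBCHODNÍBANKA,A.S.",
    "KOMERČNÍ BANKA, A.S.",   # hypothetical different employer
]
result = resolve(variants, canonical)
for raw, entity in result.items():
    print(repr(raw), "->", entity)
```

Exact key matches absorb the spacing and punctuation variants, while the similarity score catches typos such as the missing letter or "BANLA". Treating different legal forms as the same entity is a business decision and should be confirmed before it is applied.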
METADATA SEMANTIC RECONCILIATION AND BUSINESS GLOSSARY
Business Glossary and
Terms
Technical Architecture Diagram
METADATA LINEAGE & TRACEABILITY
A view into existing
data sources/targets,
jobs and the
associated ‘owners’
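Lineage of this kind can be pictured as a list of source-to-target edges, each tagged with the job and its owner. The sketch below is illustrative only: the table names, job names and owners are hypothetical, and real lineage tools store far richer metadata.

```python
from collections import deque

# Hypothetical lineage edges: (source, target, job, owner)
lineage = [
    ("crm.customers", "stg.customers", "load_customers", "data_eng"),
    ("stg.customers", "dw.dim_customer", "build_dim_customer", "data_eng"),
    ("dw.dim_customer", "mart.churn_inputs", "prep_churn", "analytics"),
    ("erp.orders", "mart.churn_inputs", "prep_churn", "analytics"),
]

def downstream(table, edges=lineage):
    """Impact analysis: every target reachable from `table`, breadth-first."""
    found, queue = [], deque([table])
    while queue:
        current = queue.popleft()
        for source, target, _job, _owner in edges:
            if source == current and target not in found:
                found.append(target)
                queue.append(target)
    return found

def owners_to_notify(table, edges=lineage):
    """Owners of every job that reads from `table` or anything downstream."""
    affected = {table, *downstream(table, edges)}
    return sorted({owner for source, _t, _j, owner in edges if source in affected})

print(downstream("crm.customers"))
print(owners_to_notify("crm.customers"))
```

Walking the edges forward answers "what breaks if this source changes?"; reading the owner tags answers "who needs to know?".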
METADATA COLLABORATION AND REPEATABILITY
Collaboration
& Role-based
Dashboarding
Workflow & Data
Remediation
Process Orchestration
Unified Lineage
Job Monitoring
Decision Making
Customer Focus
Compliance
Mandates
Mergers &
Acquisitions
At-Risk Projects
Operational
Efficiencies
CORPORATE DRIVERS
Data Quality
Data
Integration
Reference Data
Management
Master Data
Management
Data
Visualization
Data
Monitoring
Metadata
Management
Business
Glossary
SOLUTIONS
Data Lifecycle
Reference and
Master Data
Data Security
Data
Architecture
Metadata Data Quality
Data
Administration
Data Warehousing
& BI/Analytics
DATA MANAGEMENT
Data Stewardship
Roles & Tasks
Decision-making Bodies
Guiding Principles
Program Objectives
Decision Rights
DATA GOVERNANCE
People
Process
Technology
METHODS
SAS DATA
MANAGEMENT
FRAMEWORK FOR SUCCESS
Data
Virtualization
Data Profiling
& Exploration
QUESTIONS & ANSWERS THANK YOU!
DAN.SOCEANU@SAS.COM

Connecting the Dots for Information Discovery.pdfConnecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfNeo4j
 
A Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersA Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersNicole Novielli
 
Varsha Sewlal- Cyber Attacks on Critical Critical Infrastructure
Varsha Sewlal- Cyber Attacks on Critical Critical InfrastructureVarsha Sewlal- Cyber Attacks on Critical Critical Infrastructure
Varsha Sewlal- Cyber Attacks on Critical Critical Infrastructureitnewsafrica
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfLoriGlavin3
 
2024 April Patch Tuesday
2024 April Patch Tuesday2024 April Patch Tuesday
2024 April Patch TuesdayIvanti
 
Testing tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesTesting tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesKari Kakkonen
 
Tampa BSides - The No BS SOC (slides from April 6, 2024 talk)
Tampa BSides - The No BS SOC (slides from April 6, 2024 talk)Tampa BSides - The No BS SOC (slides from April 6, 2024 talk)
Tampa BSides - The No BS SOC (slides from April 6, 2024 talk)Mark Simos
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxLoriGlavin3
 
[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality AssuranceInflectra
 
Infrared simulation and processing on Nvidia platforms
Infrared simulation and processing on Nvidia platformsInfrared simulation and processing on Nvidia platforms
Infrared simulation and processing on Nvidia platformsYoss Cohen
 
Design pattern talk by Kaya Weers - 2024 (v2)
Design pattern talk by Kaya Weers - 2024 (v2)Design pattern talk by Kaya Weers - 2024 (v2)
Design pattern talk by Kaya Weers - 2024 (v2)Kaya Weers
 
React JS; all concepts. Contains React Features, JSX, functional & Class comp...
React JS; all concepts. Contains React Features, JSX, functional & Class comp...React JS; all concepts. Contains React Features, JSX, functional & Class comp...
React JS; all concepts. Contains React Features, JSX, functional & Class comp...Karmanjay Verma
 
Generative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfGenerative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfIngrid Airi González
 
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24Mark Goldstein
 
Bridging Between CAD & GIS: 6 Ways to Automate Your Data Integration
Bridging Between CAD & GIS:  6 Ways to Automate Your Data IntegrationBridging Between CAD & GIS:  6 Ways to Automate Your Data Integration
Bridging Between CAD & GIS: 6 Ways to Automate Your Data Integrationmarketing932765
 
Zeshan Sattar- Assessing the skill requirements and industry expectations for...
Zeshan Sattar- Assessing the skill requirements and industry expectations for...Zeshan Sattar- Assessing the skill requirements and industry expectations for...
Zeshan Sattar- Assessing the skill requirements and industry expectations for...itnewsafrica
 

Recently uploaded (20)

Transcript: New from BookNet Canada for 2024: BNC SalesData and LibraryData -...
Transcript: New from BookNet Canada for 2024: BNC SalesData and LibraryData -...Transcript: New from BookNet Canada for 2024: BNC SalesData and LibraryData -...
Transcript: New from BookNet Canada for 2024: BNC SalesData and LibraryData -...
 
Decarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a realityDecarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a reality
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
 
Data governance with Unity Catalog Presentation
Data governance with Unity Catalog PresentationData governance with Unity Catalog Presentation
Data governance with Unity Catalog Presentation
 
Connecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfConnecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdf
 
A Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersA Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software Developers
 
Varsha Sewlal- Cyber Attacks on Critical Critical Infrastructure
Varsha Sewlal- Cyber Attacks on Critical Critical InfrastructureVarsha Sewlal- Cyber Attacks on Critical Critical Infrastructure
Varsha Sewlal- Cyber Attacks on Critical Critical Infrastructure
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdf
 
2024 April Patch Tuesday
2024 April Patch Tuesday2024 April Patch Tuesday
2024 April Patch Tuesday
 
Testing tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesTesting tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examples
 
Tampa BSides - The No BS SOC (slides from April 6, 2024 talk)
Tampa BSides - The No BS SOC (slides from April 6, 2024 talk)Tampa BSides - The No BS SOC (slides from April 6, 2024 talk)
Tampa BSides - The No BS SOC (slides from April 6, 2024 talk)
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
 
[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance
 
Infrared simulation and processing on Nvidia platforms
Infrared simulation and processing on Nvidia platformsInfrared simulation and processing on Nvidia platforms
Infrared simulation and processing on Nvidia platforms
 
Design pattern talk by Kaya Weers - 2024 (v2)
Design pattern talk by Kaya Weers - 2024 (v2)Design pattern talk by Kaya Weers - 2024 (v2)
Design pattern talk by Kaya Weers - 2024 (v2)
 
React JS; all concepts. Contains React Features, JSX, functional & Class comp...
React JS; all concepts. Contains React Features, JSX, functional & Class comp...React JS; all concepts. Contains React Features, JSX, functional & Class comp...
React JS; all concepts. Contains React Features, JSX, functional & Class comp...
 
Generative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfGenerative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdf
 
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
 
Bridging Between CAD & GIS: 6 Ways to Automate Your Data Integration
Bridging Between CAD & GIS:  6 Ways to Automate Your Data IntegrationBridging Between CAD & GIS:  6 Ways to Automate Your Data Integration
Bridging Between CAD & GIS: 6 Ways to Automate Your Data Integration
 
Zeshan Sattar- Assessing the skill requirements and industry expectations for...
Zeshan Sattar- Assessing the skill requirements and industry expectations for...Zeshan Sattar- Assessing the skill requirements and industry expectations for...
Zeshan Sattar- Assessing the skill requirements and industry expectations for...
 

Data Management for High Performance Analytics

  • 1. Copyright © 2015, SAS Institute Inc. All rights reserved. DATA MANAGEMENT FOR HIGH-PERFORMANCE ANALYTICS. DAN SOCEANU, SENIOR SOLUTIONS ARCHITECT, DATA MANAGEMENT
  • 2. BEFORE WE BEGIN: SAS ACKNOWLEDGEMENTS. Ron Agresta, Product Director, Data Management; Lisa Dodson, Global Technology Practice Manager, Data Management; David Pope, Pre-Sales Manager, Energy & Manufacturing
  • 3. DATA MANAGEMENT: WHY ARE WE HERE? • Data is rarely fit for analytic purposes • End-users are overwhelmed o What data do I use? o How do I load data? o How can I find only the data I need? • Real-time needs • The rise of “self-service analytics”
  • 4. CAN YOU LEVERAGE OPEN SOURCE ANALYTICS? CAN YOU SCALE YOUR DATA AND YOUR ANALYTICS? DO YOU GROW A CULTURE OF INNOVATION? CAN YOU ANALYZE ALL OF YOUR DATA? CAN YOU MODERNIZE YOUR LEGACY BI STRATEGY?
  • 5. DATA MANAGEMENT: BRIDGING THE GAP. [Diagram: Data Sources; Data Management for High Performance Analytics; High Performance Analytics. Labels: IoT, Operational, Unstructured, Web, Text, Optimization, Forecasting, Mining]
  • 6. DATA MANAGEMENT: CONVERGENCE OF DATA PREP, ANALYTICAL PROCESSING AND PROVISIONING. [Diagram: Data Access Tier, Data Preparation Tier, Analytical Tier, Visualization Tier; Access, Preparation, Analytics, Visualization]
  • 7. DATA MANAGEMENT: DATA FLOW FOR HIGH PERFORMANCE ANALYTICS. [Diagram labels: Sources, Operational, MQ, XML, Cloud, ETL, Read, Repository, Data Management, Data Monitoring, Exploration, Quality, Integration, MDM, Data Warehouse, Analytical Data Warehouse, Data Marts, Model Development, High Performance Analytics, Access, Dynamic Reporting, Dynamic Visualization]
  • 8. ANALYTICS: HISTORICAL VS. ADVANCED. Descriptive: What happened? When? Why? • Frequency Distributions • Correlation Measures • Event Study • Association Rules. Predictive: What will happen? When? Why? How does that affect us? What actions should I take? • Estimation & Forecasting • Segmentation • Optimization
  • 9. HIGH-PERFORMANCE ANALYTICS: SAS SOLUTIONS. SAS High-Performance Data Mining: Predictive models using thousands of variables to produce more accurate and timely insights. SAS High-Performance Econometrics: Analytical models using complete data, not just a subset. SAS High-Performance Optimization: Model and solve optimization problems that are very large or cumbersome to solve. SAS High-Performance Statistics: Statistical models using big data to produce more accurate and timely insights. SAS High-Performance Text Mining: Better understand communications and create new value from big text data.
  • 10. HIGH-PERFORMANCE ANALYTICS: SAS ANALYTIC PROCESSING APPROACHES. Traditional: Move data from source to the SAS server, process it and write back results (single server or SAS Grid Manager). In-Database: Move SAS processing to the data source and allow SAS processing to occur under the control of the source environment (e.g. relational database or Hadoop). The analytic code executes in the database process. In-memory “Alongside” the Database: Move SAS processing to the data source but allow a SAS process to run “alongside”. The analytic processes and the database processes are co-located and share resources. In-memory “Next to” the Database: Move data from source to a dedicated SAS environment for processing. Does not require making a physical copy of the data before processing and, once the processing is complete, the data is not required to be kept in the dedicated SAS environment. This separates the resources associated with data storage & processing from the SAS advanced analytical processing.
  • 11. DATA MANAGEMENT VS. DATA PREPARATION? DATA MANAGEMENT. Business Need • Support analytical methods for decision making, use cases and required actions. Data Governance • Gap assessment; people, process and technology • Auditability, traceability, automated rules, monitoring, collaboration. Productivity • Data preparation, provisioning, reporting
  • 12. DATA MANAGEMENT VS. DATA PREPARATION? DATA MANAGEMENT. Business Need • Support analytical methods for decision making, use cases and required actions. Data Governance • Gap assessment; people, process and technology • Auditability, traceability, automated rules, monitoring, collaboration. Productivity • Data preparation, provisioning, reporting. DATA PREPARATION. Identify • Profile • Data types • Numeric • Character • Contextual • Cardinality. Access • ETL • Batch • Real-time • Latency • Data Movement • Connectivity • Data Sources. Data Quality • De-duplicate • Standardize • Missing values • Imputation • Enrich • Binning • Matching • Identify anomalies. Reshape • Wide & flat • Long & lean • Transformation logic • Transpositions • Frequency analysis • Appending data • Partitioning data • Summarization. Metadata • Lineage • Semantic glossary • Data relationships • Impact analysis • Hierarchy management • Collaboration • Repeatability • Entity management
  • 13. DATA MANAGEMENT: THE ROLE OF DATA GOVERNANCE. DATA MANAGEMENT: Data Lifecycle, Reference and Master Data, Data Security, Data Architecture, Metadata, Data Quality, Data Administration, Data Warehousing & BI/Analytics. DATA GOVERNANCE: Data Stewardship, Roles & Tasks, Decision-making Bodies, Guiding Principles, Program Objectives, Decision Rights. DG without DM = only an academic exercise. DM without DG = the continued culture of “I know a guy”.
  • 14. DATA MANAGEMENT: THE IMPORTANCE OF DATA GOVERNANCE. POSITIONS ENTERPRISE DATA ISSUES AS CROSS-FUNCTIONAL • Establishes guiding principles for data sharing • Eliminates data ownership issues and “turf wars” • Ensures appropriate stakeholders have a say in decision making. ESTABLISHES BUSINESS STAKEHOLDERS AS INFORMATION OWNERS • Aligns data policy with business strategies and priorities • Aligns data quality with business measures and acceptance • Helps to identify ROI for data-related activity. FORMALIZES DATA STEWARDSHIP • Clarifies accountability for data definitions, rules, and quality • Ensures data is managed separately from applications • Formalizes monitoring and measurement of critical data. FOSTERS IMPROVED ALIGNMENT BETWEEN BUSINESS AND IT • Links IT-driven data management activities with business unit activity
  • 15. PARADIGM SHIFT: DATA PREPARATION IS ABOUT THE BUSINESS NEED & USE CASE. [Chart labels: 80%, 20%; Identify, Access, Data Quality, Reshape, Metadata; Business Use]
  • 16. DATA PREPARATION: FIVE KEY FOCUS AREAS. Identify • Profile • Data types • Numeric • Character • Contextual • Cardinality. Access • ETL • Batch • Real-time • Latency • Data Movement • Connectivity • Data Sources. Data Quality • De-duplicate • Standardize • Missing values • Imputation • Enrich • Binning • Matching • Identify anomalies. Reshape • Wide & flat • Long & lean • Transformation logic • Transpositions • Frequency analysis • Appending data • Partitioning data • Summarization. Metadata • Lineage • Semantic glossary • Data relationships • Impact analysis • Hierarchy management • Collaboration • Repeatability • Entity management
  • 17. IDENTIFY: WHAT DO I HAVE AND HOW USEFUL IS IT? Is my data consistent? Is my data complete? Is my data highly unique?
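The three profiling questions on this slide (consistent, complete, unique) can be sketched in a few lines of plain Python. This is only an illustrative sketch, not the SAS profiling tooling the deck refers to: the `profile` function, the sample rows, and the column names are all hypothetical.

```python
def profile(rows, columns):
    """Return simple per-column profiling metrics for a list of dict rows.

    completeness: share of rows with a non-missing value
    cardinality:  number of distinct non-missing values
    uniqueness:   distinct values divided by non-missing values
    types:        Python type names seen (more than one hints at inconsistency)
    """
    total = len(rows)
    report = {}
    for col in columns:
        values = [row.get(col) for row in rows]
        # Treat None and the empty string as missing (an assumption for this sketch).
        present = [v for v in values if v not in (None, "")]
        distinct = len(set(present))
        report[col] = {
            "completeness": len(present) / total if total else 0.0,
            "cardinality": distinct,
            "uniqueness": distinct / len(present) if present else 0.0,
            "types": sorted({type(v).__name__ for v in present}),
        }
    return report


# Hypothetical sample data with a missing value, a casing variant and a mixed type.
rows = [
    {"id": 1, "state": "NC", "age": 34},
    {"id": 2, "state": "nc", "age": None},
    {"id": 3, "state": "NC", "age": "41"},
    {"id": 4, "state": "", "age": 29},
]
report = profile(rows, ["id", "state", "age"])
```

Here `id` comes out fully complete and fully unique, `state` is 75% complete with two distinct spellings, and `age` mixes integers and strings, which is the kind of inconsistency a profiling step is meant to surface before any modeling.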
  • 18. IDENTIFY: WHAT DO I HAVE AND HOW USEFUL IS IT? Is my data normal? Is my data linear? What are the associations in the data?
  • 19. ACCESS: SO MANY DATA TYPES AND SOURCES. Access Excel SQLServer Oracle MySQL Boolean Yes/No Bit Byte N/A Boolean integer Number Int Number Int Int float Number (single) Float Number Float Numeric currency Currency Money NA NA Money string NA Char Char Char Char string Text VarChar VarChar VarChar VarChar binary OLE Obj Memo Binary Varbinary Image Long Raw Blob Text Binary Varbinary
  • 20. DATA QUALITY: THE FOUNDATION • Standardization • Parsing • Casing • Identification • De-duplication • “Fuzzy” matching • Clustering • Entity resolution • Survivorship • Gender Analysis • Locale Guessing • Address Verification • Address Enrichment (geocoding). Business Logic & Rules
  • 21. DATA QUALITY: FILLING IN THE GAPS AND STANDARDIZING. Standardizing Text; De-duplication; Standardizing Numeric
  • 22. DATA QUALITY: FILLING IN THE GAPS AND STANDARDIZING. Dropping outliers; Grouping or binning data
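The data quality steps named on the two slides above (standardizing text, de-duplication, dropping outliers, binning) can be sketched with the Python standard library. The helper names, the 1.5 × IQR outlier rule, and the bin edges are assumptions chosen for illustration; the slides do not say which rules the SAS tooling applies.

```python
import statistics


def standardize(text):
    """Trim, upper-case and collapse repeated whitespace."""
    return " ".join(text.strip().upper().split())


def deduplicate(values):
    """Keep the first occurrence of each standardized value, preserving order."""
    seen, unique = set(), []
    for value in values:
        key = standardize(value)
        if key not in seen:
            seen.add(key)
            unique.append(key)
    return unique


def drop_outliers(numbers, k=1.5):
    """Drop values outside [Q1 - k*IQR, Q3 + k*IQR] (assumed rule, not from the slides)."""
    q1, _, q3 = statistics.quantiles(numbers, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [x for x in numbers if low <= x <= high]


def bin_value(x, edges, labels):
    """Return the label of the first bin whose upper edge exceeds x; else the last label."""
    for edge, label in zip(edges, labels):
        if x < edge:
            return label
    return labels[-1]


# Hypothetical sample inputs.
cities = deduplicate(["Cary, NC", " cary,  nc ", "Raleigh, NC"])
cleaned = drop_outliers([10, 12, 11, 13, 12, 11, 95])
age_band = bin_value(34, edges=[30, 50], labels=["young", "middle", "senior"])
```

With these inputs the two spellings of Cary collapse into one entry, the value 95 is dropped as an outlier, and an age of 34 lands in the middle band.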
  • 23. RESHAPE: FIT FOR PURPOSE? Schema/view or flat table? Format of data; Data quality dimensions?
  • 24. RESHAPE: FLATTENING THE DATA • Efficient storage • Fast retrieval • Defined schema • WIDE tables / Time series data • Iteration (build, test, repeat) • Schema-less
  • 25. RESHAPE: SUMMARIZATION. Each product category will become its own row, with each product purchased its own distinct category column.
  • 26. RESHAPE: TRANSPOSITION FOR DATA MINING. Add up the quantities for each product purchased, in each product category.
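The two reshaping steps described on these slides, adding up quantities per product category and turning categories into columns of a wide table, can be sketched as follows. The purchase records, the `customer` key, and the function names are hypothetical; the slides show the idea on a screenshot rather than in code.

```python
from collections import defaultdict

# Hypothetical long-format purchase records (one row per purchase line).
purchases = [
    {"customer": "C1", "category": "Books", "qty": 2},
    {"customer": "C1", "category": "Books", "qty": 1},
    {"customer": "C1", "category": "Toys", "qty": 4},
    {"customer": "C2", "category": "Toys", "qty": 3},
]


def summarize(rows):
    """Add up quantities for each (customer, category) pair."""
    totals = defaultdict(int)
    for row in rows:
        totals[(row["customer"], row["category"])] += row["qty"]
    return dict(totals)


def transpose(totals):
    """Turn the summarized pairs into one wide row per customer, one column per category."""
    categories = sorted({category for _, category in totals})
    wide = {}
    for (customer, category), qty in totals.items():
        # Missing categories default to 0 so every row has the same columns.
        wide.setdefault(customer, {c: 0 for c in categories})[category] = qty
    return wide


totals = summarize(purchases)
wide = transpose(totals)
```

The result is a wide and flat table with one row per customer, which is the shape most data mining procedures expect as input.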
  • 27. METADATA: MANAGE DATA HIERARCHIES AND RELATIONSHIPS. [Diagram labels: Customer Types Hierarchy Coverage Products Financial Accounts Address Inquiries Product Party Accounts Transactions Authorizations Individual Organization Inquiries Loans Terms Collaterals Ratings External Assets]
  • 28. METADATA: ENTITY RESOLUTION. Example: Entity Resolution, Employer Name. Columns: EMPLOYER_NAME_GRPID; EMPLOYER_NAME = Name of the client employer (SOL0003n_Employer_Name); cnt. Group 28296: "ČESKOSLOVENSKÁ OBCHODNÍ BANKA, A. S" = 6; "ČESKOSLOVENSKÁ OBCHODNÍ BANKA A.S." = 182; "ČSKOSLOVENSKÁ OBCHODNÍ BANKA A.S." = 1; "ČESKOSLOVENSKÁ OBCHODNÍ BANKA. A.S." = 2; "ČESKOSLOVENSKÁ OBCHODNÍ BANKA,A.S." = 78; "ČESKOSLOVENSKÁ OBCHODNÍ BANKA A. S." = 9; "ČESKOSLOVENSKÁ OBCHODNÍ BANKA ,A.S." = 2; "ČESKOSLOVENSKÁ OBCHODNÍ BANKA, A.S" = 6; "ČESKOSLOVENSKÁ OBCHODNÍ BANKA, A.S ." = 1; "ČESKOSLOVENSKÁ OBCHODNÍ BANKA, A.S." = 717; "ČESKOSLOVENSKUÁ OBCHODNÍ BANKA A.S." = 1; "ČESKOSLOVENSKÁ OBCHODNÍ BANKA A.S" = 1; "ČESKOSLOVENSKÁ OBCHODNÍ BANKA, S.R.O." = 3; "ČESKOSLOVENSKÁ OBCHODNÍ BAŃKA, A.S." = 1; "ČESKOSLOVENSKÁ OBCHODNÍ BANKA, A. S." = 587; "ČESKOSLOVENSKÁOBCHODNÍBANKA, A.S." = 1; "ČESKOSLOVENSKÁ OBCHODNÍBANKA A.S." = 1; "ČESKOSLOVENSKÁ OBCHODNÍ BANLA" = 1; "ČESKOSLOVENSKÁ OBCHODNÍ BÁNKA, A.S." = 1; "ČESKOSLOVENSKÁ OBCHODNÍ BANKA,A.S" = 2; "ČESKOSLOVENSKÁ OBCHODNÍ BANKA" = 27; "ČESKOSLOVENSKÁOBCHODNÍBANKA,A.S." = 1
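The employer-name example above groups many spellings of one bank under a single group ID. A rough sketch of that idea, using normalization plus string similarity from Python's `difflib`, is shown below. This is a swapped-in technique for illustration and not the matching logic SAS uses; the 0.85 threshold, the function names, and the "ACME STROJE, S.R.O." contrast record are assumptions, while the four bank spellings are taken from the slide.

```python
import difflib
import unicodedata


def normalize(name):
    """Strip accents, upper-case, replace punctuation with spaces, collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", name)
    no_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    kept = "".join(ch if ch.isalnum() else " " for ch in no_accents.upper())
    return " ".join(kept.split())


def cluster(names, threshold=0.85):
    """Greedily assign each name to the first cluster whose representative is similar enough.

    Returns a list of (representative_key, [original names]) tuples.
    The threshold is an assumed value for this sketch and would need tuning on real data.
    """
    clusters = []
    for name in names:
        key = normalize(name)
        for representative, members in clusters:
            if difflib.SequenceMatcher(None, representative, key).ratio() >= threshold:
                members.append(name)
                break
        else:
            clusters.append((key, [name]))
    return clusters


names = [
    "ČESKOSLOVENSKÁ OBCHODNÍ BANKA, A. S.",
    "ČESKOSLOVENSKÁ OBCHODNÍ BANKA A.S.",
    "ČSKOSLOVENSKÁ OBCHODNÍ BANKA A.S.",
    "ČESKOSLOVENSKÁ OBCHODNÍ BANLA",
    "ACME STROJE, S.R.O.",  # hypothetical unrelated employer, not from the slide
]
clusters = cluster(names)
```

Comparing each name only against the first member of a cluster keeps the sketch short; a production entity-resolution step would typically add blocking, survivorship rules for the preferred spelling, and a review queue for borderline matches.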
  • 29. METADATA: SEMANTIC RECONCILIATION AND BUSINESS GLOSSARY. Business Glossary and Terms; Technical Architecture Diagram
  • 30. METADATA: LINEAGE & TRACEABILITY. A view into existing data sources/targets, jobs and the associated ‘owners’
  • 31. METADATA: COLLABORATION AND REPEATABILITY. Collaboration & Role-based Dashboarding; Workflow & Data Remediation; Process Orchestration; Unified Lineage; Job Monitoring
  • 32. SAS DATA MANAGEMENT: FRAMEWORK FOR SUCCESS. CORPORATE DRIVERS: Decision Making, Customer Focus, Compliance Mandates, Mergers & Acquisitions, At-Risk Projects, Operational Efficiencies. SOLUTIONS: Data Quality, Data Integration, Reference Data Management, Master Data Management, Data Visualization, Data Monitoring, Metadata Management, Business Glossary. DATA MANAGEMENT: Data Lifecycle, Reference and Master Data, Data Security, Data Architecture, Metadata, Data Quality, Data Administration, Data Warehousing & BI/Analytics. DATA GOVERNANCE: Data Stewardship, Roles & Tasks, Decision-making Bodies, Guiding Principles, Program Objectives, Decision Rights. METHODS: People, Process, Technology. Also shown: Data Virtualization, Data Profiling & Exploration
  • 33. QUESTIONS & ANSWERS. THANK YOU! DAN.SOCEANU@SAS.COM