Ad-Hoc OLAP databases with Yertl and HANA
Radek Kotowicz (radoslaw.kotowicz@sap.com)
http://blogs.perl.org/users/radek_kotowicz
http://www.ariba.com/about/sap-ariba
© 2013 Ariba - an SAP company. All rights reserved. 2Public
OLTP

OLAP schema

Difference
OLTP: Normalization
OLAP: De-normalization
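A minimal sketch of the contrast, using Python's standard sqlite3 module. The table and column names (customer, auction, auction_flat) are invented for illustration and do not come from the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Normalized (OLTP style): each fact is stored once and referenced by key.
    CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT, country TEXT);
    CREATE TABLE auction  (id INTEGER PRIMARY KEY,
                           customer_id INTEGER REFERENCES customer(id),
                           amount REAL);
    INSERT INTO customer VALUES (1, 'Acme', 'ES'), (2, 'Globex', 'PL');
    INSERT INTO auction  VALUES (10, 1, 100.0), (11, 1, 50.0), (12, 2, 70.0);

    -- De-normalized (OLAP style): customer attributes are repeated on every
    -- row, so analytical queries need no join.
    CREATE TABLE auction_flat AS
        SELECT a.id AS auction_id, c.name AS customer, c.country, a.amount
        FROM auction a JOIN customer c ON c.id = a.customer_id;
""")

# Same report, once with a join and once against the flat table.
normalized = conn.execute("""
    SELECT c.country, SUM(a.amount)
    FROM auction a JOIN customer c ON c.id = a.customer_id
    GROUP BY c.country ORDER BY c.country
""").fetchall()

flat = conn.execute("""
    SELECT country, SUM(amount) FROM auction_flat
    GROUP BY country ORDER BY country
""").fetchall()

print(normalized)
print(flat)
```

Both queries return the same totals; the flat table trades redundant storage for a simpler, join-free query.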

Consequences
OLTP
• Enforced integrity
• Easily extensible design
• Flexible analytics
• Slow for analytical queries operating on large result sets
OLAP
• Missing or loosely enforced integrity
• Simple queries
• Less flexible/extensible analytics
• Performant analytics
• Interface for operating on hyper-cubes
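As a rough illustration of "operating on hyper-cubes", the sketch here rolls a small fact set up along chosen dimensions. The fact rows, the dimension names and the roll_up helper are all hypothetical; real OLAP engines expose this through MDX or similar interfaces.

```python
from collections import defaultdict

# Hypothetical fact rows: (year, country, product, amount).
facts = [
    (2014, "ES", "laptop", 100.0),
    (2014, "PL", "laptop", 70.0),
    (2015, "ES", "phone", 50.0),
    (2015, "ES", "laptop", 30.0),
]
DIMENSIONS = ("year", "country", "product")

def roll_up(rows, keep):
    """Sum the measure over every dimension not listed in `keep`."""
    idx = [DIMENSIONS.index(d) for d in keep]
    totals = defaultdict(float)
    for row in rows:
        totals[tuple(row[i] for i in idx)] += row[-1]
    return dict(totals)

by_year = roll_up(facts, ["year"])
by_year_country = roll_up(facts, ["year", "country"])
print(by_year)
print(by_year_country)
```

Each call is one "slice" of the cube: the fewer dimensions kept, the coarser the aggregate.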
© 2013 SAP AG. All rights reserved. 6
Goal
1. Get a DB where analytical queries can be executed without impacting the transactional database
2. No setup
3. An analytical tool for non-technical users
4. Reports need to be quick
5. Data loading time is not crucial

A possible scenario
Diagram labels: RDBMS, JDBC, OLAP cubes, JDBC, JRuby mondrian-olap gem, MDX

More Perl-aware …
Diagram labels: RDBMS, JDBC, OLAP cubes, XMLA/HTTP, Mondrian-XML-A-Consumer, wxWidgets

Still not Perl-centric
Diagram labels: RDBMS, JDBC, OLAP cubes, XMLA/HTTP, Mondrian-XML-A-Consumer, wxWidgets
• Do I want to build a pivot table engine from scratch?
• SQL-level filtering too weak
• No persistence

Perl as an ETL tool + HANA DB + Excel for presentation
Diagram labels: RDBMS, ETL::Yertl, DBI/ODBC, OLAP cubes, ODBO, Excel

The main concepts of HANA
On current CPUs, we can expect to process 1 MB per ms and with parallel processing on 16 cores more than 10 MB per ms. To put this into context, to look for a single dimension compressed in 4 bytes, we can scan 2.5 million tuples for qualification in 1 ms
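The 2.5 million figure follows directly from the other two numbers. A quick check, assuming 1 MB means 10**6 bytes (the slide does not say which definition it uses):

```python
# Figures quoted on the slide; 1 MB is taken as 10**6 bytes (an assumption).
BYTES_PER_MB = 10**6
throughput_mb_per_ms = 10   # parallel scan on 16 cores
bytes_per_value = 4         # one compressed dimension value

# Values that fit into the data scanned during one millisecond.
tuples_per_ms = throughput_mb_per_ms * BYTES_PER_MB // bytes_per_value
print(tuples_per_ms)
```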

Compression
Column data is of uniform type; therefore, there are some opportunities for storage size optimizations available in column-oriented data that are not available in row-oriented data.
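Two techniques commonly applied to columnar data are dictionary encoding and run-length encoding. The sketch here shows both on a toy column; it illustrates the general idea only and is not a description of HANA's internal storage format. The function names and sample data are made up.

```python
def dictionary_encode(column):
    """Replace each value with a small integer id into a sorted dictionary."""
    dictionary = sorted(set(column))
    ids = {value: i for i, value in enumerate(dictionary)}
    return dictionary, [ids[v] for v in column]

def run_length_encode(codes):
    """Collapse runs of equal codes into [code, run_length] pairs."""
    runs = []
    for code in codes:
        if runs and runs[-1][0] == code:
            runs[-1][1] += 1
        else:
            runs.append([code, 1])
    return runs

country = ["ES", "ES", "ES", "PL", "PL", "ES"]
dictionary, codes = dictionary_encode(country)
runs = run_length_encode(codes)

# Decoding reverses both steps and must give back the original column.
decoded = [dictionary[code] for code, n in runs for _ in range(n)]
print(dictionary, codes, runs)
```

Both steps rely on the column holding values of one type with many repeats, which is exactly what a row-oriented layout does not give you.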

Data Loading

Yertl
• yfrom - Build YAML from another format (like JSON or CSV)
• ygrok - Build YAML by parsing lines of plain text
• ysql - Query SQL databases in a Yertl workflow
• ymask - Mask a data structure to display only the desired fields
• yq - Filter YAML through a command-line program
• yto - Change YAML to another format (like JSON)
Stages: EXTRACT, FILTER / TRANSFORM, LOAD

How does it work?
Diagram labels: [ DBI ], YAML, YAML, [ DBI ]
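The flow on this slide (database, documents, database) can be mimicked in a few lines of Python. This is an analogue, not Yertl itself: sqlite3 stands in for DBI on both ends, JSON text stands in for the YAML stream, and the tables, the extract/load helpers and the filter threshold are all hypothetical.

```python
import json
import sqlite3

def extract(conn, query):
    """Yield each result row as a dict, i.e. one document per row."""
    cursor = conn.execute(query)
    columns = [d[0] for d in cursor.description]
    for row in cursor:
        yield dict(zip(columns, row))

def load(conn, statement, documents):
    """Run one parameterized INSERT per document; return the row count."""
    count = 0
    for doc in documents:
        conn.execute(statement, doc)
        count += 1
    conn.commit()
    return count

source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE auction (id INTEGER, title TEXT, amount REAL)")
source.executemany("INSERT INTO auction VALUES (?, ?, ?)",
                   [(1, "laptops", 100.0), (2, "paper", 5.0), (3, "phones", 70.0)])

target = sqlite3.connect(":memory:")
target.execute("CREATE TABLE auction_fact (id INTEGER, amount REAL)")

# Serialize each document to text and back, standing in for the YAML stream.
stream = (json.dumps(doc) for doc in extract(source, "SELECT * FROM auction"))
documents = (json.loads(line) for line in stream)
big = (doc for doc in documents if doc["amount"] >= 50)  # filter step

loaded = load(target, "INSERT INTO auction_fact VALUES (:id, :amount)", big)
rows = target.execute("SELECT id, amount FROM auction_fact ORDER BY id").fetchall()
print(loaded, rows)
```

Because every stage is a generator, rows stream through one at a time, much like documents flowing through a shell pipeline.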

Get_recent_auctions.sql (source query file)

insert_auctions.hsql (target query file)

For complex filtering

HCP architecture

What about the performance of such an ETL process?
If you kick off a data load with a single ysql into a trial HANA instance, you probably won't get a speed above 20-60k rows per hour… but we need to bear in mind that:
1. Yertl runs DML statements one by one
2. Each statement is auto-committed
3. You're loading within one ODBC connection that is routed through one TLS tunnel
4. There are some constraints imposed on connections in the trial instance
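Points 1 and 2 describe a general pattern: one statement and one commit per row. The sketch here contrasts that with a batched load in a single transaction, using Python's sqlite3 as a stand-in. It is not a Yertl feature, the table and helper names are invented, and an in-memory database will not reproduce the network and TLS costs from points 3 and 4.

```python
import sqlite3

rows = [(i, i * 1.5) for i in range(1000)]

def fresh_db():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE fact (id INTEGER, amount REAL)")
    return conn

def load_row_by_row(conn, rows):
    """One statement and one commit per row, as in points 1 and 2."""
    for row in rows:
        conn.execute("INSERT INTO fact VALUES (?, ?)", row)
        conn.commit()

def load_batched(conn, rows):
    """All rows in one executemany call inside a single transaction."""
    with conn:  # commits once on success, rolls back on error
        conn.executemany("INSERT INTO fact VALUES (?, ?)", rows)

slow, fast = fresh_db(), fresh_db()
load_row_by_row(slow, rows)
load_batched(fast, rows)

count_slow = slow.execute("SELECT COUNT(*) FROM fact").fetchone()[0]
count_fast = fast.execute("SELECT COUNT(*) FROM fact").fetchone()[0]
print(count_slow, count_fast)
```

Both loaders end with the same data; against a remote database the batched variant usually saves one round trip and one commit per row.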

If that’s still too slow
• Data Services
• HANA studio import

HANA studio import

Defining views
• Attribute views (dimensions) – typically modelling entities such as product, user, commodity, etc.
• Analytical views – facts surrounded by dimensions with some defined aggregates
• Calculation views – extension of analytical views, e.g. for multi-fact reporting
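The first two view types map loosely onto a star join in plain SQL. The sketch here uses sqlite3 views as a stand-in; HANA's modeling objects are defined differently, and the product and sale tables and the view names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE product (id INTEGER PRIMARY KEY, name TEXT, commodity TEXT);
    CREATE TABLE sale (product_id INTEGER, qty INTEGER, amount REAL);
    INSERT INTO product VALUES (1, 'laptop', 'hardware'), (2, 'toner', 'supplies');
    INSERT INTO sale VALUES (1, 2, 200.0), (1, 1, 100.0), (2, 5, 50.0);

    -- Dimension: descriptive attributes of one entity.
    CREATE VIEW dim_product AS
        SELECT id AS product_id, name, commodity FROM product;

    -- Facts joined to the dimension, with predefined aggregates.
    CREATE VIEW sales_by_commodity AS
        SELECT d.commodity, SUM(s.qty) AS total_qty, SUM(s.amount) AS total_amount
        FROM sale s JOIN dim_product d ON d.product_id = s.product_id
        GROUP BY d.commodity;
""")

report = conn.execute(
    "SELECT * FROM sales_by_commodity ORDER BY commodity").fetchall()
print(report)
```

A reporting client then only needs to query the aggregate view and never touches the base tables.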

“Good” ETL tool

References
• A Common Database Approach for OLTP and OLAP Using an In-Memory Column Database, Hasso Plattner Institute for IT Systems Engineering, University of Potsdam
• SAP HANA Essentials eBook, Jeffrey Word, http://saphanabook.com/
• CPAN: https://metacpan.org/release/ETL-Yertl
• Perl Blogs:
• http://blogs.perl.org/users/preaction/2015/01/managing-sql-data-with-yertl.html
• http://blogs.perl.org/users/radek_kotowicz/2015/08/moving-data-around-with-yertl-over-odbc-to-hana.html

Q & A
