SlideShare a Scribd company logo
TIM: Large-scale Forecasting
for Energy in Julia
Ján Dolinský, Tangent Works #PyDataBA19. 2. 2018
(PyData Bratislava Meetup #6, Nervosa)
The Two Language Problem
The Philosofical Part
Ján Dolinský
jan.dolinsky@tangent.works
Math & Stats
Information Geometry
Computer
Science
Domain
Expertise
Machine
Learning
Traditional
Software
Traditional
Research
VALUE
Math & Stats
Information Geometry
Computer
Science
Domain
Expertise
Machine
Learning
Traditional
Software
Traditional
Research
VALUE
Data Science Stack:
R, Matlab, Python, Ruby, ...
Production Stack:
C++, Java, Fortran, ...
The Two Language Problem
Prototyping Production
Mathematicians & Data Scientists
Getting the applied math right.
Performance engineers & Architects.
The Two Language Problem: Tangent Works in 2014
Prototyping Production
The Data Science Stack:
R, Matlab, Octave, Python, Ruby, ….
Production stack:
C++, Fortran, Java, ...
A lot of effort ...
The Two Language Problem: Tangent Works in 2018
Prototyping Production
Julia
Julia
Mathematicians
Getting the applied math right.
Performance engineers & Architects.
TIM: Automatic Model Building for
Time-series in Energy
The Product Part
Ján Dolinský
jan.dolinsky@tangent.works
TIM
Tangent Information Modeller
Ján Dolinský
jan.dolinsky@tangent.works
Content
• Time-series Problems in Energy Industry
• Data Science Process and Automatic Model Building
• Live Modeling Demonstration
• Large-scale Forecasting Systems & Why Automation Matters
• Julia Language
• GefCom 2014, 2017 results & Summary
Time-series Problems in Energy Industry
• Electricity Load (aggregated and individual)
• Technical Losses
• System Imbalance
• Gas Consumption (District Heating)
• District Cooling
• Wind Production
• Solar Production
• Electricity Price
Consumption side Production side
1. Collect 2. Store 3. Transform 4. Model 5. Predict 6. Present
Scada Systems, Databases, Time Series Management
Oracle, MySQL, SAP HANA, Hadoop, ...
SAS, SPSS, R, ML libraries, Matlab Power BI, SAP,
QlickView etc ..
Automation Using TIM
Data Science Process
4. Model 5. Predict
SAS, SPSS, R, ML libraries, Matlab
Automation Using TIM
Data Science Process
●
feature engineering
●
model selection
●
tedious
●
costs money
Data Science Process
y(k)=f (Temp(k−3),Temp(k−22)∗Wind(k−1), y(k−24))
NN, SVM, MARS, LASSO, RF, … ?
Which features and how many ?
// optional
Data Science Process
y(k)=f (Temp(k−3),Temp(k−22)∗Wind(k−1), y(k−24))
Temp(k-1), Temp(k-2), …, Temp(k-24)
Wind(k-1), Wind(k-2), …, Wind(k-24)
y(k-1), y(k-2), …, y(k-24)
Temp(k-1), Temp(k-2), …, Temp(k-96)
Wind(k-1), Wind(k-2), …, Wind(k-96)
y(k-1), y(k-2), …, y(k-96)
24 + 24 + 24 + 24*24 = 24*27 = 648 96 + 96 + 96 + 96*96 = 96*99 = 9504
// optional
TIM: As a Large-Scale Model Building Engine
TIM: As a Data Science Tool
Business User Mode produces high-quality models.
Advanced User Mode allows to explore new scenarios quickly.
Fine-tuning possible for critical assets.
Live Demonstration of TIM
GefCom 2017 ex-ante results
• Qualifying Match: 1st
place out of 177 teams
• Final Match: 2nd
place out of 12 pre-selected teams
Andritz Hackathon 2017
1st
place out of 7 ML companies
Large-scale Forecasting Systems
Platforms where thousands of different consumption and production models are produced.
This is not possible without full automation.
TIM Worker 1
CAPI
JAPI
TIM
WS
REST/SOAP
API
TIM ConnectorsData Source
fs2TIM_WS
fs2TIM_CAPI*
fs2TIM_JAPI*
JuliaDB2TIM_WS
...
File System
HAKOM TSM
JuliaDB
InfluxDB
MySQL
PosgreSQL
Oracle
TIM Worker N
CAPI
JAPI
TIM
....
Asset data &
model management
Weather data
Visualization & UI
Grafana
Genie
...
Julia Language
Walks like Python runs like C
●
A single platform for prototyping and production → significant gains in efficiency
of product development
●
Vectorized code runs equally fast as de-vectorized
●
Vectorization on a single thread level
●
SIMD out of the box
●
Direct calls to BLAS
●
Distributed parallel computation (multiple threads)
Summary
• Fully automated model building
• Accuracy often outperforms manually build models
• Quick insights into a time-series of interest
• Tedious model building is an option not necessity
Thank you.
jan.dolinsky@tangent.works
Tangent Works
Na Slavíne 1
81106 Bratislava
www.tangent.works
The Key Concepts in Julia
Julia Lectures for Tangent Works Employees
Ján Dolinský
jan.dolinsky@tangent.works
Lectures Block 1: Vectorization vs. De-vectorization
• Iterators: The Storage Concept
• On Demand Iteration: Generators
• MapReduce principle
• Broadcasting
• Syntactic Loop Fusion
 Why is vectorized code fast ?
 Why it is not as fast as it could be ?
Lectures Block 2: Linear Algebra
• Matrices storage and its consequences
• Column-major & SIMD
• BLAS calls
Lectures Block 3: Language Design Elements
• Multiple-dispatch
• Struct
• Mutable & Immutable concept
• Modules & Packages generation
• JIT and AOT compilation
Julia Language
Walks like Python runs like C
●
A single platform for prototyping and production → significant gains in efficiency
of product development
●
Vectorized code runs equally fast as de-vectorized
●
Vectorization on a single thread level
●
SIMD out of the box
●
Direct calls to BLAS
●
Distributed parallel computation (multiple threads)
2 Years of Using Julia
• It is a real programming language
• In v0.2 to v0.4 still in heavy development
• v0.5 and v0.6 are very useful; static compilation possible
• Prototyping and production development is done using the same platform
• IDE not fully matured but already quite useful
• Debugger

More Related Content

What's hot

Fyp presentation final
Fyp presentation finalFyp presentation final
Fyp presentation finalhome
 
A calculus of mobile Real-Time processes
A calculus of mobile Real-Time processesA calculus of mobile Real-Time processes
A calculus of mobile Real-Time processes
Polytechnique Montréal
 
Information security Seminar #3
Information security Seminar #3 Information security Seminar #3
Information security Seminar #3
Alexander Kolybelnikov
 
Inductive Triple Graphs: A purely functional approach to represent RDF
Inductive Triple Graphs: A purely functional approach to represent RDFInductive Triple Graphs: A purely functional approach to represent RDF
Inductive Triple Graphs: A purely functional approach to represent RDF
Jose Emilio Labra Gayo
 
Data science : R Basics Harvard University
Data science : R Basics Harvard UniversityData science : R Basics Harvard University
Data science : R Basics Harvard University
MrMoliya
 
Nor Implement
Nor ImplementNor Implement
Nor Implement
sahed dewan
 
Topoloical sort
Topoloical sortTopoloical sort
Topoloical sort
sahilnarvekar
 
Topological sort
Topological sortTopological sort
Topological sort
Burhan Ahmed
 
Optimized Multi-agent Box-pushing - 2017-10-24
Optimized Multi-agent Box-pushing - 2017-10-24Optimized Multi-agent Box-pushing - 2017-10-24
Optimized Multi-agent Box-pushing - 2017-10-24
Aritra Sarkar
 
MATLAB
MATLABMATLAB
CatBoost intro
CatBoost   introCatBoost   intro
CatBoost intro
Rodion Kiryukhin
 
Discrete Logarithmic Problem- Basis of Elliptic Curve Cryptosystems
Discrete Logarithmic Problem- Basis of Elliptic Curve CryptosystemsDiscrete Logarithmic Problem- Basis of Elliptic Curve Cryptosystems
Discrete Logarithmic Problem- Basis of Elliptic Curve Cryptosystems
NIT Sikkim
 
TMPA-2017: Static Checking of Array Objects in JavaScript
TMPA-2017: Static Checking of Array Objects in JavaScriptTMPA-2017: Static Checking of Array Objects in JavaScript
TMPA-2017: Static Checking of Array Objects in JavaScript
Iosif Itkin
 
18103010 algorithm complexity (iterative)
18103010 algorithm complexity (iterative)18103010 algorithm complexity (iterative)
18103010 algorithm complexity (iterative)
AdityaKhandelwal58
 
Heaven: Supporting Systematic Comparative Research of RDF Stream Processing E...
Heaven: Supporting Systematic Comparative Research of RDF Stream Processing E...Heaven: Supporting Systematic Comparative Research of RDF Stream Processing E...
Heaven: Supporting Systematic Comparative Research of RDF Stream Processing E...
Riccardo Tommasini
 
single source shorest path
single source shorest pathsingle source shorest path
single source shorest path
sowfi
 
Temporal logic and functional reactive programming
Temporal logic and functional reactive programmingTemporal logic and functional reactive programming
Temporal logic and functional reactive programming
Sergei Winitzki
 
presentation
presentationpresentation
presentationjie ren
 

What's hot (20)

Fyp presentation final
Fyp presentation finalFyp presentation final
Fyp presentation final
 
A calculus of mobile Real-Time processes
A calculus of mobile Real-Time processesA calculus of mobile Real-Time processes
A calculus of mobile Real-Time processes
 
Information security Seminar #3
Information security Seminar #3 Information security Seminar #3
Information security Seminar #3
 
Inductive Triple Graphs: A purely functional approach to represent RDF
Inductive Triple Graphs: A purely functional approach to represent RDFInductive Triple Graphs: A purely functional approach to represent RDF
Inductive Triple Graphs: A purely functional approach to represent RDF
 
Data science : R Basics Harvard University
Data science : R Basics Harvard UniversityData science : R Basics Harvard University
Data science : R Basics Harvard University
 
Os Urner
Os UrnerOs Urner
Os Urner
 
Nor Implement
Nor ImplementNor Implement
Nor Implement
 
Topoloical sort
Topoloical sortTopoloical sort
Topoloical sort
 
Topological sort
Topological sortTopological sort
Topological sort
 
Optimized Multi-agent Box-pushing - 2017-10-24
Optimized Multi-agent Box-pushing - 2017-10-24Optimized Multi-agent Box-pushing - 2017-10-24
Optimized Multi-agent Box-pushing - 2017-10-24
 
MATLAB
MATLABMATLAB
MATLAB
 
CatBoost intro
CatBoost   introCatBoost   intro
CatBoost intro
 
Discrete Logarithmic Problem- Basis of Elliptic Curve Cryptosystems
Discrete Logarithmic Problem- Basis of Elliptic Curve CryptosystemsDiscrete Logarithmic Problem- Basis of Elliptic Curve Cryptosystems
Discrete Logarithmic Problem- Basis of Elliptic Curve Cryptosystems
 
HHHHHH
HHHHHHHHHHHH
HHHHHH
 
TMPA-2017: Static Checking of Array Objects in JavaScript
TMPA-2017: Static Checking of Array Objects in JavaScriptTMPA-2017: Static Checking of Array Objects in JavaScript
TMPA-2017: Static Checking of Array Objects in JavaScript
 
18103010 algorithm complexity (iterative)
18103010 algorithm complexity (iterative)18103010 algorithm complexity (iterative)
18103010 algorithm complexity (iterative)
 
Heaven: Supporting Systematic Comparative Research of RDF Stream Processing E...
Heaven: Supporting Systematic Comparative Research of RDF Stream Processing E...Heaven: Supporting Systematic Comparative Research of RDF Stream Processing E...
Heaven: Supporting Systematic Comparative Research of RDF Stream Processing E...
 
single source shorest path
single source shorest pathsingle source shorest path
single source shorest path
 
Temporal logic and functional reactive programming
Temporal logic and functional reactive programmingTemporal logic and functional reactive programming
Temporal logic and functional reactive programming
 
presentation
presentationpresentation
presentation
 

Similar to TIM: Large-scale Energy Forecasting in Julia

Linux and Open Source in Math, Science and Engineering
Linux and Open Source in Math, Science and EngineeringLinux and Open Source in Math, Science and Engineering
Linux and Open Source in Math, Science and Engineering
PDE1D
 
Large-scale Recommendation Systems on Just a PC
Large-scale Recommendation Systems on Just a PCLarge-scale Recommendation Systems on Just a PC
Large-scale Recommendation Systems on Just a PC
Aapo Kyrölä
 
Automated Data Exploration: Building efficient analysis pipelines with Dask
Automated Data Exploration: Building efficient analysis pipelines with DaskAutomated Data Exploration: Building efficient analysis pipelines with Dask
Automated Data Exploration: Building efficient analysis pipelines with Dask
ASI Data Science
 
Intel realtime analytics_spark
Intel realtime analytics_sparkIntel realtime analytics_spark
Intel realtime analytics_sparkGeetanjali G
 
MDR Corporation Description at AWS Summit Tokyo
MDR Corporation Description at AWS Summit TokyoMDR Corporation Description at AWS Summit Tokyo
MDR Corporation Description at AWS Summit Tokyo
Takumi Kato
 
computer architecture.
computer architecture.computer architecture.
computer architecture.
Shivalik college of engineering
 
Big learning 1.2
Big learning   1.2Big learning   1.2
Big learning 1.2
Mohit Garg
 
Mathematics and development of fast TLS handshakes
Mathematics and development of fast TLS handshakesMathematics and development of fast TLS handshakes
Mathematics and development of fast TLS handshakes
Alexander Krizhanovsky
 
February 2017 HUG: Data Sketches: A required toolkit for Big Data Analytics
February 2017 HUG: Data Sketches: A required toolkit for Big Data AnalyticsFebruary 2017 HUG: Data Sketches: A required toolkit for Big Data Analytics
February 2017 HUG: Data Sketches: A required toolkit for Big Data Analytics
Yahoo Developer Network
 
Cv jeanlucbordessoule
Cv jeanlucbordessouleCv jeanlucbordessoule
Cv jeanlucbordessoule
Jean-Luc Bordessoule
 
Alresume2010.Pdf
Alresume2010.PdfAlresume2010.Pdf
Alresume2010.Pdf
Angie Miller
 
A full Machine learning pipeline in Scikit-learn vs in scala-Spark: pros and ...
A full Machine learning pipeline in Scikit-learn vs in scala-Spark: pros and ...A full Machine learning pipeline in Scikit-learn vs in scala-Spark: pros and ...
A full Machine learning pipeline in Scikit-learn vs in scala-Spark: pros and ...
Jose Quesada (hiring)
 
Warsaw Data Science - Factorization Machines Introduction
Warsaw Data Science -  Factorization Machines IntroductionWarsaw Data Science -  Factorization Machines Introduction
Warsaw Data Science - Factorization Machines Introduction
Bartlomiej Twardowski
 
Lecture 1.pptx
Lecture 1.pptxLecture 1.pptx
Lecture 1.pptx
RudraGandhi7
 
Resume
ResumeResume
Resume
muddanas
 
Srinivas Muddana Resume
Srinivas Muddana ResumeSrinivas Muddana Resume
Srinivas Muddana Resume
muddanas
 
Srinivas Muddana Resume
Srinivas Muddana ResumeSrinivas Muddana Resume
Srinivas Muddana Resume
muddanas
 
Srinivas Muddana Resume
Srinivas Muddana ResumeSrinivas Muddana Resume
Srinivas Muddana Resume
muddanas
 
Natural Language Processing with CNTK and Apache Spark with Ali Zaidi
Natural Language Processing with CNTK and Apache Spark with Ali ZaidiNatural Language Processing with CNTK and Apache Spark with Ali Zaidi
Natural Language Processing with CNTK and Apache Spark with Ali Zaidi
Databricks
 

Similar to TIM: Large-scale Energy Forecasting in Julia (20)

Linux and Open Source in Math, Science and Engineering
Linux and Open Source in Math, Science and EngineeringLinux and Open Source in Math, Science and Engineering
Linux and Open Source in Math, Science and Engineering
 
Large-scale Recommendation Systems on Just a PC
Large-scale Recommendation Systems on Just a PCLarge-scale Recommendation Systems on Just a PC
Large-scale Recommendation Systems on Just a PC
 
Automated Data Exploration: Building efficient analysis pipelines with Dask
Automated Data Exploration: Building efficient analysis pipelines with DaskAutomated Data Exploration: Building efficient analysis pipelines with Dask
Automated Data Exploration: Building efficient analysis pipelines with Dask
 
Intel realtime analytics_spark
Intel realtime analytics_sparkIntel realtime analytics_spark
Intel realtime analytics_spark
 
Lecture 1 (bce-7)
Lecture   1 (bce-7)Lecture   1 (bce-7)
Lecture 1 (bce-7)
 
MDR Corporation Description at AWS Summit Tokyo
MDR Corporation Description at AWS Summit TokyoMDR Corporation Description at AWS Summit Tokyo
MDR Corporation Description at AWS Summit Tokyo
 
computer architecture.
computer architecture.computer architecture.
computer architecture.
 
Big learning 1.2
Big learning   1.2Big learning   1.2
Big learning 1.2
 
Mathematics and development of fast TLS handshakes
Mathematics and development of fast TLS handshakesMathematics and development of fast TLS handshakes
Mathematics and development of fast TLS handshakes
 
February 2017 HUG: Data Sketches: A required toolkit for Big Data Analytics
February 2017 HUG: Data Sketches: A required toolkit for Big Data AnalyticsFebruary 2017 HUG: Data Sketches: A required toolkit for Big Data Analytics
February 2017 HUG: Data Sketches: A required toolkit for Big Data Analytics
 
Cv jeanlucbordessoule
Cv jeanlucbordessouleCv jeanlucbordessoule
Cv jeanlucbordessoule
 
Alresume2010.Pdf
Alresume2010.PdfAlresume2010.Pdf
Alresume2010.Pdf
 
A full Machine learning pipeline in Scikit-learn vs in scala-Spark: pros and ...
A full Machine learning pipeline in Scikit-learn vs in scala-Spark: pros and ...A full Machine learning pipeline in Scikit-learn vs in scala-Spark: pros and ...
A full Machine learning pipeline in Scikit-learn vs in scala-Spark: pros and ...
 
Warsaw Data Science - Factorization Machines Introduction
Warsaw Data Science -  Factorization Machines IntroductionWarsaw Data Science -  Factorization Machines Introduction
Warsaw Data Science - Factorization Machines Introduction
 
Lecture 1.pptx
Lecture 1.pptxLecture 1.pptx
Lecture 1.pptx
 
Resume
ResumeResume
Resume
 
Srinivas Muddana Resume
Srinivas Muddana ResumeSrinivas Muddana Resume
Srinivas Muddana Resume
 
Srinivas Muddana Resume
Srinivas Muddana ResumeSrinivas Muddana Resume
Srinivas Muddana Resume
 
Srinivas Muddana Resume
Srinivas Muddana ResumeSrinivas Muddana Resume
Srinivas Muddana Resume
 
Natural Language Processing with CNTK and Apache Spark with Ali Zaidi
Natural Language Processing with CNTK and Apache Spark with Ali ZaidiNatural Language Processing with CNTK and Apache Spark with Ali Zaidi
Natural Language Processing with CNTK and Apache Spark with Ali Zaidi
 

Recently uploaded

Predicting Product Ad Campaign Performance: A Data Analysis Project Presentation
Predicting Product Ad Campaign Performance: A Data Analysis Project PresentationPredicting Product Ad Campaign Performance: A Data Analysis Project Presentation
Predicting Product Ad Campaign Performance: A Data Analysis Project Presentation
Boston Institute of Analytics
 
Investigate & Recover / StarCompliance.io / Crypto_Crimes
Investigate & Recover / StarCompliance.io / Crypto_CrimesInvestigate & Recover / StarCompliance.io / Crypto_Crimes
Investigate & Recover / StarCompliance.io / Crypto_Crimes
StarCompliance.io
 
Adjusting primitives for graph : SHORT REPORT / NOTES
Adjusting primitives for graph : SHORT REPORT / NOTESAdjusting primitives for graph : SHORT REPORT / NOTES
Adjusting primitives for graph : SHORT REPORT / NOTES
Subhajit Sahu
 
Tabula.io Cheatsheet: automate your data workflows
Tabula.io Cheatsheet: automate your data workflowsTabula.io Cheatsheet: automate your data workflows
Tabula.io Cheatsheet: automate your data workflows
alex933524
 
一比一原版(BU毕业证)波士顿大学毕业证成绩单
一比一原版(BU毕业证)波士顿大学毕业证成绩单一比一原版(BU毕业证)波士顿大学毕业证成绩单
一比一原版(BU毕业证)波士顿大学毕业证成绩单
ewymefz
 
哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样
哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样
哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样
axoqas
 
一比一原版(RUG毕业证)格罗宁根大学毕业证成绩单
一比一原版(RUG毕业证)格罗宁根大学毕业证成绩单一比一原版(RUG毕业证)格罗宁根大学毕业证成绩单
一比一原版(RUG毕业证)格罗宁根大学毕业证成绩单
vcaxypu
 
一比一原版(YU毕业证)约克大学毕业证成绩单
一比一原版(YU毕业证)约克大学毕业证成绩单一比一原版(YU毕业证)约克大学毕业证成绩单
一比一原版(YU毕业证)约克大学毕业证成绩单
enxupq
 
一比一原版(NYU毕业证)纽约大学毕业证成绩单
一比一原版(NYU毕业证)纽约大学毕业证成绩单一比一原版(NYU毕业证)纽约大学毕业证成绩单
一比一原版(NYU毕业证)纽约大学毕业证成绩单
ewymefz
 
一比一原版(UVic毕业证)维多利亚大学毕业证成绩单
一比一原版(UVic毕业证)维多利亚大学毕业证成绩单一比一原版(UVic毕业证)维多利亚大学毕业证成绩单
一比一原版(UVic毕业证)维多利亚大学毕业证成绩单
ukgaet
 
一比一原版(UMich毕业证)密歇根大学|安娜堡分校毕业证成绩单
一比一原版(UMich毕业证)密歇根大学|安娜堡分校毕业证成绩单一比一原版(UMich毕业证)密歇根大学|安娜堡分校毕业证成绩单
一比一原版(UMich毕业证)密歇根大学|安娜堡分校毕业证成绩单
ewymefz
 
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
axoqas
 
Opendatabay - Open Data Marketplace.pptx
Opendatabay - Open Data Marketplace.pptxOpendatabay - Open Data Marketplace.pptx
Opendatabay - Open Data Marketplace.pptx
Opendatabay
 
社内勉強会資料_LLM Agents                              .
社内勉強会資料_LLM Agents                              .社内勉強会資料_LLM Agents                              .
社内勉強会資料_LLM Agents                              .
NABLAS株式会社
 
Jpolillo Amazon PPC - Bid Optimization Sample
Jpolillo Amazon PPC - Bid Optimization SampleJpolillo Amazon PPC - Bid Optimization Sample
Jpolillo Amazon PPC - Bid Optimization Sample
James Polillo
 
Criminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdfCriminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdf
Criminal IP
 
一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单
一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单
一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单
ewymefz
 
一比一原版(CU毕业证)卡尔顿大学毕业证成绩单
一比一原版(CU毕业证)卡尔顿大学毕业证成绩单一比一原版(CU毕业证)卡尔顿大学毕业证成绩单
一比一原版(CU毕业证)卡尔顿大学毕业证成绩单
yhkoc
 
Innovative Methods in Media and Communication Research by Sebastian Kubitschk...
Innovative Methods in Media and Communication Research by Sebastian Kubitschk...Innovative Methods in Media and Communication Research by Sebastian Kubitschk...
Innovative Methods in Media and Communication Research by Sebastian Kubitschk...
correoyaya
 
standardisation of garbhpala offhgfffghh
standardisation of garbhpala offhgfffghhstandardisation of garbhpala offhgfffghh
standardisation of garbhpala offhgfffghh
ArpitMalhotra16
 

Recently uploaded (20)

Predicting Product Ad Campaign Performance: A Data Analysis Project Presentation
Predicting Product Ad Campaign Performance: A Data Analysis Project PresentationPredicting Product Ad Campaign Performance: A Data Analysis Project Presentation
Predicting Product Ad Campaign Performance: A Data Analysis Project Presentation
 
Investigate & Recover / StarCompliance.io / Crypto_Crimes
Investigate & Recover / StarCompliance.io / Crypto_CrimesInvestigate & Recover / StarCompliance.io / Crypto_Crimes
Investigate & Recover / StarCompliance.io / Crypto_Crimes
 
Adjusting primitives for graph : SHORT REPORT / NOTES
Adjusting primitives for graph : SHORT REPORT / NOTESAdjusting primitives for graph : SHORT REPORT / NOTES
Adjusting primitives for graph : SHORT REPORT / NOTES
 
Tabula.io Cheatsheet: automate your data workflows
Tabula.io Cheatsheet: automate your data workflowsTabula.io Cheatsheet: automate your data workflows
Tabula.io Cheatsheet: automate your data workflows
 
一比一原版(BU毕业证)波士顿大学毕业证成绩单
一比一原版(BU毕业证)波士顿大学毕业证成绩单一比一原版(BU毕业证)波士顿大学毕业证成绩单
一比一原版(BU毕业证)波士顿大学毕业证成绩单
 
哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样
哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样
哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样
 
一比一原版(RUG毕业证)格罗宁根大学毕业证成绩单
一比一原版(RUG毕业证)格罗宁根大学毕业证成绩单一比一原版(RUG毕业证)格罗宁根大学毕业证成绩单
一比一原版(RUG毕业证)格罗宁根大学毕业证成绩单
 
一比一原版(YU毕业证)约克大学毕业证成绩单
一比一原版(YU毕业证)约克大学毕业证成绩单一比一原版(YU毕业证)约克大学毕业证成绩单
一比一原版(YU毕业证)约克大学毕业证成绩单
 
一比一原版(NYU毕业证)纽约大学毕业证成绩单
一比一原版(NYU毕业证)纽约大学毕业证成绩单一比一原版(NYU毕业证)纽约大学毕业证成绩单
一比一原版(NYU毕业证)纽约大学毕业证成绩单
 
一比一原版(UVic毕业证)维多利亚大学毕业证成绩单
一比一原版(UVic毕业证)维多利亚大学毕业证成绩单一比一原版(UVic毕业证)维多利亚大学毕业证成绩单
一比一原版(UVic毕业证)维多利亚大学毕业证成绩单
 
一比一原版(UMich毕业证)密歇根大学|安娜堡分校毕业证成绩单
一比一原版(UMich毕业证)密歇根大学|安娜堡分校毕业证成绩单一比一原版(UMich毕业证)密歇根大学|安娜堡分校毕业证成绩单
一比一原版(UMich毕业证)密歇根大学|安娜堡分校毕业证成绩单
 
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
 
Opendatabay - Open Data Marketplace.pptx
Opendatabay - Open Data Marketplace.pptxOpendatabay - Open Data Marketplace.pptx
Opendatabay - Open Data Marketplace.pptx
 
社内勉強会資料_LLM Agents                              .
社内勉強会資料_LLM Agents                              .社内勉強会資料_LLM Agents                              .
社内勉強会資料_LLM Agents                              .
 
Jpolillo Amazon PPC - Bid Optimization Sample
Jpolillo Amazon PPC - Bid Optimization SampleJpolillo Amazon PPC - Bid Optimization Sample
Jpolillo Amazon PPC - Bid Optimization Sample
 
Criminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdfCriminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdf
 
一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单
一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单
一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单
 
一比一原版(CU毕业证)卡尔顿大学毕业证成绩单
一比一原版(CU毕业证)卡尔顿大学毕业证成绩单一比一原版(CU毕业证)卡尔顿大学毕业证成绩单
一比一原版(CU毕业证)卡尔顿大学毕业证成绩单
 
Innovative Methods in Media and Communication Research by Sebastian Kubitschk...
Innovative Methods in Media and Communication Research by Sebastian Kubitschk...Innovative Methods in Media and Communication Research by Sebastian Kubitschk...
Innovative Methods in Media and Communication Research by Sebastian Kubitschk...
 
standardisation of garbhpala offhgfffghh
standardisation of garbhpala offhgfffghhstandardisation of garbhpala offhgfffghh
standardisation of garbhpala offhgfffghh
 

TIM: Large-scale Energy Forecasting in Julia

  • 1. TIM: Large-scale Forecasting for Energy in Julia Ján Dolinský, Tangent Works #PyDataBA19. 2. 2018 (PyData Bratislava Meetup #6, Nervosa)
  • 2.
  • 3. The Two Language Problem The Philosofical Part Ján Dolinský jan.dolinsky@tangent.works
  • 4. Math & Stats Information Geometry Computer Science Domain Expertise Machine Learning Traditional Software Traditional Research VALUE
  • 5. Math & Stats Information Geometry Computer Science Domain Expertise Machine Learning Traditional Software Traditional Research VALUE Data Science Stack: R, Matlab, Python, Ruby, ... Production Stack: C++, Java, Fortran, ...
  • 6. The Two Language Problem Prototyping Production Mathematicians & Data Scientists Getting the applied math right. Performance engineers & Architects.
  • 7. The Two Language Problem: Tangent Works in 2014 Prototyping Production The Data Science Stack: R, Matlab, Octave, Python, Ruby, …. Production stack: C++, Fortran, Java, ... A lot of effort ...
  • 8. The Two Language Problem: Tangent Works in 2018 Prototyping Production Julia Julia Mathematicians Getting the applied math right. Performance engineers & Architects.
  • 9. TIM: Automatic Model Building for Time-series in Energy The Product Part Ján Dolinský jan.dolinsky@tangent.works
  • 10. TIM Tangent Information Modeller Ján Dolinský jan.dolinsky@tangent.works
  • 11. Content • Time-series Problems in Energy Industry • Data Science Process and Automatic Model Building • Live Modeling Demonstration • Large-scale Forecasting Systems & Why Automation Matters • Julia Language • GefCom 2014, 2017 results & Summary
  • 12. Time-series Problems in Energy Industry • Electricity Load (aggregated and individual) • Technical Losses • System Imbalance • Gas Consumption (District Heating) • District Cooling • Wind Production • Solar Production • Electricity Price Consumption side Production side
  • 13. 1. Collect 2. Store 3. Transform 4. Model 5. Predict 6. Present Scada Systems, Databases, Time Series Management Oracle, MySQL, SAP HANA, Hadoop, ... SAS, SPSS, R, ML libraries, Matlab Power BI, SAP, QlickView etc .. Automation Using TIM Data Science Process
  • 14. 4. Model 5. Predict SAS, SPSS, R, ML libraries, Matlab Automation Using TIM Data Science Process ● feature engineering ● model selection ● tedious ● costs money
  • 15. Data Science Process y(k)=f (Temp(k−3),Temp(k−22)∗Wind(k−1), y(k−24)) NN, SVM, MARS, LASSO, RF, … ? Which features and how many ? // optional
  • 16. Data Science Process y(k)=f (Temp(k−3),Temp(k−22)∗Wind(k−1), y(k−24)) Temp(k-1), Temp(k-2), …, Temp(k-24) Wind(k-1), Wind(k-2), …, Wind(k-24) y(k-1), y(k-2), …, y(k-24) Temp(k-1), Temp(k-2), …, Temp(k-96) Wind(k-1), Wind(k-2), …, Wind(k-96) y(k-1), y(k-2), …, y(k-96) 24 + 24 + 24 + 24*24 = 24*27 = 648 96 + 96 + 96 + 96*96 = 96*99 = 9504 // optional
  • 17. TIM: As a Large-Scale Model Building Engine TIM: As a Data Science Tool Business User Mode produces high-quality models. Advanced User Mode allows to explore new scenarios quickly. Fine-tuning possible for critical assets.
  • 19. GefCom 2017 ex-ante results • Qualifying Match: 1st place out of 177 teams • Final Match: 2nd place out of 12 pre-selected teams Andritz Hackathon 2017 1st place out of 7 ML companies
  • 20. Large-scale Forecasting Systems Platforms where thousands of different consumption and production models are produced. This is not possible without full automation.
  • 21. TIM Worker 1 CAPI JAPI TIM WS REST/SOAP API TIM ConnectorsData Source fs2TIM_WS fs2TIM_CAPI* fs2TIM_JAPI* JuliaDB2TIM_WS ... File System HAKOM TSM JuliaDB InfluxDB MySQL PosgreSQL Oracle TIM Worker N CAPI JAPI TIM .... Asset data & model management Weather data Visualization & UI Grafana Genie ...
  • 22. Julia Language Walks like Python runs like C ● A single platform for prototyping and production → significant gains in efficiency of product development ● Vectorized code runs equally fast as de-vectorized ● Vectorization on a single thread level ● SIMD out of the box ● Direct calls to BLAS ● Distributed parallel computation (multiple threads)
  • 23. Summary • Fully automated model building • Accuracy often outperforms manually build models • Quick insights into a time-series of interest • Tedious model building is an option not necessity
  • 24. Thank you. jan.dolinsky@tangent.works Tangent Works Na Slavíne 1 81106 Bratislava www.tangent.works
  • 25.
  • 26. The Key Concepts in Julia Julia Lectures for Tangent Works Employees Ján Dolinský jan.dolinsky@tangent.works
  • 27. Lectures Block 1: Vectorization vs. De-vectorization • Iterators: The Storage Concept • On Demand Iteration: Generators • MapReduce principle • Broadcasting • Syntactic Loop Fusion  Why is vectorized code fast ?  Why it is not as fast as it could be ?
  • 28. Lectures Block 2: Linear Algebra • Matrices storage and its consequences • Column-major & SIMD • BLAS calls
  • 29. Lectures Block 3: Language Design Elements • Multiple-dispatch • Struct • Mutable & Immutable concept • Modules & Packages generation • JIT and AOT compilation
  • 30. Julia Language Walks like Python runs like C ● A single platform for prototyping and production → significant gains in efficiency of product development ● Vectorized code runs equally fast as de-vectorized ● Vectorization on a single thread level ● SIMD out of the box ● Direct calls to BLAS ● Distributed parallel computation (multiple threads)
  • 31. 2 Years of Using Julia • It is a real programming language • In v0.2 to v0.4 still in heavy development • v0.5 and v0.6 are very useful; static compilation possible • Prototyping and production development is done using the same platform • IDE not fully matured but already quite useful • Debugger