OpenEEmeter 4.0:
Release Webinar
Antitrust Policy Notice
Linux Foundation meetings involve participation by industry competitors, and it is the
intention of the Linux Foundation to conduct all of its activities in accordance with
applicable antitrust and competition laws. It is therefore extremely important that
attendees adhere to meeting agendas, and be aware of, and not participate in, any
activities that are prohibited under applicable US state, federal or foreign antitrust and
competition laws.
Examples of types of actions that are prohibited at Linux Foundation meetings and in
connection with Linux Foundation activities are described in the Linux Foundation
Antitrust Policy available at linuxfoundation.org/antitrust-policy. If you have
questions about these matters, please contact your company counsel, or if you are a
member of the Linux Foundation, feel free to contact Andrew Updegrove of the firm of
Gesmer Updegrove LLP, which provides legal counsel to the Linux Foundation.
Agenda
● Purpose and Brief History
● Methods Review
● Issues and Key Results
● Methods Advancements (The How)
○ Accuracy
○ Speed
● API Improvements
OpenEEmeter:
Purpose and Brief History
Purpose
● Establish “weights and measures” for demand side programs
● Enable our industry to compete at scale, including against supply side options
● Remove measurement barriers to integrated programs
… And Many Others
The Work of …
OpenEEmeter Timeline
2012/2013: “CalTRACK” methods development initiated to calibrate building software tools
2016: OpenEEmeter 1.0
- Monthly Methods
- Daily Methods
- OpenEEmeter
2017: OpenEEmeter 3.0
- Daily Improvements
- Hourly Methods
2019: OpenEEmeter joined LF Energy as open source project
2024: OpenEEmeter 4.0
- New Daily Model
- Vastly Improved API
Times They Are A-Changin’
The industry is changing fast
- Energy
- Utility
- Demand Side Program
We need measurement capabilities that enable, not inhibit, modern programs
OpenEEmeter 3.0:
Methods Review and Issues
OpenEEmeter Savings Calculation
[Figure: energy vs. time, showing the baseline period, blackout period, intervention, and reporting period, with the baseline model, counterfactual, and savings]
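The labels on this slide suggest the usual metered-savings arithmetic: the baseline model predicts what usage would have been in the reporting period (the counterfactual), and savings are the gap between that prediction and observed usage. The sketch below is a minimal illustration under that reading. The function name and the numbers are hypothetical and are not taken from the eemeter library or the webinar.

```python
def metered_savings(counterfactual, observed):
    """Sum of (counterfactual usage - observed usage) over the reporting period."""
    if len(counterfactual) != len(observed):
        raise ValueError("both series must cover the same reporting days")
    return sum(c - o for c, o in zip(counterfactual, observed))


# Illustrative daily values only (e.g. kWh/day); not webinar data.
counterfactual = [30.0, 28.0, 32.0]   # baseline model prediction
observed = [25.0, 26.0, 27.0]         # metered usage after the intervention
print(metered_savings(counterfactual, observed))  # 12.0
```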
OpenEEmeter Daily
[Figure: daily model, labeled with balance point temp, temp-independent component, and linear (HDD) component]
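One common way to read these labels is a heating-degree-day (HDD) model: usage is flat above the balance point temperature and rises linearly as the outside temperature drops below it. The sketch below assumes that form. The function name and coefficient values are illustrative assumptions, not the eemeter implementation.

```python
def predict_daily_usage(temp, intercept, heating_slope, balance_point):
    """Temp-independent usage plus a linear heating term below the balance point."""
    hdd = max(balance_point - temp, 0.0)  # heating degree days for one day
    return intercept + heating_slope * hdd


# Illustrative coefficients: 10 units/day base load, 0.5 units per degree day, 60 degree balance point.
print(predict_daily_usage(40.0, 10.0, 0.5, 60.0))  # 20.0
print(predict_daily_usage(70.0, 10.0, 0.5, 60.0))  # 10.0
```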
Know Thy Enemy: OEEM 3.0 Modeling Issues
We’ve addressed 3 main issues:
● Seasonal Bias
● Weekday/Weekend Bias
● Computational Efficiency
Seasonal Bias Distribution
3.0 vs. 4.0: Residential Gas Winter Bias
OpenEEmeter 3.0
- 7.53% Bias
OpenEEmeter 4.0
- 0.05% Bias
3.0 vs. 4.0 Seasonal Bias: Individual Meter
3.0 vs. 4.0 Weekend Bias: Individual Meter
Computational Efficiency
OpenEEmeter 3.0 = Sloooowww
● Exhaustive grid search
● 1,891 models for every meter
● 20 - 60 seconds per meter
OpenEEmeter 4.0 = Efficient
● Can replicate OpenEEmeter 3.0 at ~0.5 seconds per meter (~100x faster)
OpenEEmeter 4.0:
How is it more accurate?
Identified Issues: Seasonal Bias
Occurs when behavior changes with seasons
Identified Issues: Seasonal Bias
Solution: Split seasons and fit models based on splits
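A minimal sketch of the split-and-fit idea: group the baseline days by season and fit a separate line to each group, so winter behavior no longer has to share coefficients with the rest of the year. The month-to-season mapping, the `hdd`/`usage` field names, and the synthetic data are all assumptions made for illustration. The webinar does not specify how OpenEEmeter 4.0 defines its seasons.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return my - slope * mx, slope


def season_of(month):
    # Assumed mapping, for illustration only.
    if month in (12, 1, 2):
        return "winter"
    if month in (6, 7, 8):
        return "summer"
    return "shoulder"


def fit_by_season(rows):
    """Fit one (intercept, slope) pair per season found in the data."""
    groups = {}
    for row in rows:
        groups.setdefault(season_of(row["month"]), []).append(row)
    return {
        season: fit_line([r["hdd"] for r in g], [r["usage"] for r in g])
        for season, g in groups.items()
    }


# Synthetic data: heating response is twice as steep in winter.
rows = [{"month": 1, "hdd": h, "usage": 5 + 2 * h} for h in (10, 20, 30)]
rows += [{"month": 4, "hdd": h, "usage": 5 + 1 * h} for h in (5, 10, 15)]
models = fit_by_season(rows)
print(models)
```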
Identified Issues: Weekday/Weekend Bias
Occurs when behavior changes based on type of day
[Figure legend: Weekend, Weekday]
Identified Issues: Weekday/Weekend Bias
Solution: Split on weekday/weekend
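One way to see why a weekday/weekend split helps is to average the residuals of a single model by day type: if the two averages differ in sign, the model is biased on both. The sketch below is a diagnostic under that assumption. The helper names and the numbers are hypothetical.

```python
from datetime import date, timedelta


def day_type(d):
    # Monday is 0, so 5 and 6 are Saturday and Sunday.
    return "weekend" if d.weekday() >= 5 else "weekday"


def mean_residual_by_day_type(dates, observed, predicted):
    """Average (observed - predicted) separately for weekdays and weekends."""
    totals = {}
    for d, o, p in zip(dates, observed, predicted):
        acc = totals.setdefault(day_type(d), [0.0, 0])
        acc[0] += o - p
        acc[1] += 1
    return {k: total / count for k, (total, count) in totals.items()}


# One illustrative week starting Monday 2024-01-01.
dates = [date(2024, 1, 1) + timedelta(days=i) for i in range(7)]
observed = [20.0] * 5 + [26.0] * 2   # higher usage on the weekend
predicted = [22.0] * 7               # a single model that ignores day type
bias = mean_residual_by_day_type(dates, observed, predicted)
print(bias)
```

Here the single model over-predicts on weekdays and under-predicts on weekends, which is the pattern a per-day-type fit is meant to remove.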
Identified Issues: Linear Model
Solution: Add smoothing between linear components
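The slide does not say which smoothing function is used. As an assumption for illustration, the sketch below replaces the sharp hinge `max(x, 0)` with a softplus curve, which rounds off the corner at the balance point while matching the hinge far from it. The `width` parameter is hypothetical and controls how wide the rounded region is.

```python
import math


def hinge(x):
    """Piecewise-linear component with a sharp corner at x = 0."""
    return max(x, 0.0)


def smooth_hinge(x, width=1.0):
    """Softplus: width * log(1 + exp(x / width)), a smooth stand-in for hinge(x)."""
    z = x / width
    if z > 30:  # exp(z) dominates here, so softplus(z) is effectively z
        return x
    return width * math.log1p(math.exp(z))


for x in (-10.0, 0.0, 10.0):
    print(x, hinge(x), round(smooth_hinge(x), 4))
```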
Identified Issues: Ordinary Least Squares
OLS fits can lead to non-predictive models
Identified Issues: Ordinary Least Squares
Solution: Adaptive, robust loss function to down-weight outliers
[Figure: loss and response vs. standard deviations from mean]
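The adaptive loss itself is not specified on the slide. To show the general idea of down-weighting outliers, the sketch below uses the Huber loss as a stand-in: it is quadratic for small residuals and only linear for large ones, so a single extreme day contributes far less than it would under squared error. The fixed `delta` threshold is an assumption, and this version is not adaptive.

```python
def squared_loss(residual):
    """Loss minimized by ordinary least squares."""
    return 0.5 * residual * residual


def huber_loss(residual, delta=1.0):
    """Quadratic within +/- delta, linear beyond it."""
    a = abs(residual)
    if a <= delta:
        return 0.5 * residual * residual
    return delta * (a - 0.5 * delta)


for r in (0.5, 3.0, 10.0):
    print(r, squared_loss(r), huber_loss(r))
```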
OpenEEmeter 4.0:
How is it faster?
Identified Issues: Computational Efficiency
Grid search is slow compared to global optimization
Grid Search: 9 evaluations
Optimization: 9 evaluations
Identified Issues: Computational Efficiency
Grid search is slow compared to global optimization
Grid Search: 25 evaluations, 1891 models created
Optimization: 9 evaluations, 1.5 models created
Identified Issues: Computational Efficiency
Solution: Use optimization to find balance points
Secret ingredient #1: Balance Point Optimization
● Initial guess: BP at 10% and 90% of data
● Use DIRECT global optimization method
[Figure: initial guesses]
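To illustrate why an optimizer needs fewer evaluations than a grid, the sketch below searches for a balance point on synthetic data two ways. It uses a golden-section search as a simple stand-in, not the DIRECT method named on the slide. To keep the problem one-dimensional, the intercept and slope are held fixed, which is a simplification. All names and numbers are illustrative.

```python
import math

# Synthetic daily data with a true balance point of 62 (not webinar data).
TEMPS = list(range(30, 91, 5))
USAGE = [10 + 0.8 * max(62 - t, 0) for t in TEMPS]


def sse_at(balance_point):
    """Sum of squared errors at a candidate balance point (fixed intercept and slope)."""
    return sum((10 + 0.8 * max(balance_point - t, 0) - y) ** 2
               for t, y in zip(TEMPS, USAGE))


def grid_search(lo, hi):
    """Evaluate every integer balance point in [lo, hi]."""
    candidates = range(lo, hi + 1)
    return min(candidates, key=sse_at), len(candidates)


def golden_section(f, lo, hi, tol=0.5):
    """Shrink [lo, hi] around the minimum of a unimodal f; returns (x, evaluations)."""
    inv_phi = (math.sqrt(5) - 1) / 2
    c = hi - inv_phi * (hi - lo)
    d = lo + inv_phi * (hi - lo)
    fc, fd = f(c), f(d)
    evals = 2
    while hi - lo > tol:
        if fc < fd:            # minimum lies in [lo, d]
            hi, d, fd = d, c, fc
            c = hi - inv_phi * (hi - lo)
            fc = f(c)
        else:                  # minimum lies in [c, hi]
            lo, c, fc = c, d, fd
            d = lo + inv_phi * (hi - lo)
            fd = f(d)
        evals += 1
    return (lo + hi) / 2, evals


grid_bp, grid_evals = grid_search(30, 90)
opt_bp, opt_evals = golden_section(sse_at, 30.0, 90.0)
print(grid_bp, grid_evals)
print(round(opt_bp, 2), opt_evals)
```

Golden-section search assumes a single minimum in the interval. A global method such as DIRECT does not need that assumption, which is presumably why the slide names it.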
Identified Issues: Computational Efficiency
Solution: Use Elastic Net to only fit one model and penalize coefficients
Secret ingredient #2
Ordinary Least Squares goal
● Minimize residuals
Elastic Net goal
● Minimize residuals + penalty on coefficient size
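The two goals can be written out as objective functions. The sketch below uses the elastic net objective in the form documented for scikit-learn's `ElasticNet` (squared error scaled by 1/(2n), plus an L1 and an L2 penalty weighted by `alpha` and `l1_ratio`). Whether OpenEEmeter 4.0 uses the same scaling or the same penalty weights is not stated in the webinar, so treat the parameter values as assumptions. The intercept is omitted for brevity.

```python
def sse(X, y, w):
    """Sum of squared residuals for a linear model without intercept."""
    return sum((yi - sum(xj * wj for xj, wj in zip(xi, w))) ** 2
               for xi, yi in zip(X, y))


def ols_objective(X, y, w):
    return sse(X, y, w)


def elastic_net_objective(X, y, w, alpha=1.0, l1_ratio=0.5):
    n = len(y)
    l1 = sum(abs(wj) for wj in w)
    l2 = sum(wj * wj for wj in w)
    return sse(X, y, w) / (2 * n) + alpha * l1_ratio * l1 + 0.5 * alpha * (1 - l1_ratio) * l2


X = [[1.0, 0.0], [0.0, 1.0]]
y = [2.0, 0.0]
w_exact = [2.0, 0.0]    # fits the data perfectly
w_shrunk = [1.0, 0.0]   # smaller coefficient, imperfect fit

print(ols_objective(X, y, w_exact), ols_objective(X, y, w_shrunk))                  # 0.0 1.0
print(elastic_net_objective(X, y, w_exact), elastic_net_objective(X, y, w_shrunk))  # 2.0 1.0
```

Ordinary least squares prefers the exact fit, while the penalized objective prefers the smaller coefficient, which is how one penalized fit can shrink unneeded terms instead of comparing many separately fitted models.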
OpenEEmeter 4.0:
How do we know when to split?
How do we choose to split or not?
Strive for optimal fitting using test error
[Figure axis: No splitting → All possible splits]
Experimental Design Considerations
Cannot assess test error from reporting period
Bad Assumption
Reporting period = Baseline period
Wouldn’t it be nice if…
We could exclusively use baseline data and
achieve predictive testing
We need some tools!
Experimental Design Considerations
Can assess predictive error using cross validation
Why do we need CV?
● Best model parameters
Where? → baseline period!
Goal? → predictive testing
[Figure: average performance of all folds]
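A minimal sketch of k-fold cross validation on baseline data only: each fold is held out in turn, the model is fit on the remaining days, and the held-out RMSE values are averaged. The interleaved fold assignment, the helper names, and the synthetic data are assumptions for illustration. The webinar does not describe how folds were constructed.

```python
def fit_mean(xs, ys):
    return sum(ys) / len(ys)


def predict_mean(mean_value, x):
    return mean_value


def fit_line(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return my - slope * mx, slope


def predict_line(params, x):
    intercept, slope = params
    return intercept + slope * x


def cross_val_rmse(xs, ys, fit, predict, k=5):
    """Average held-out RMSE over k interleaved folds."""
    n = len(xs)
    fold_rmse = []
    for fold in range(k):
        test = [i for i in range(n) if i % k == fold]
        train = [i for i in range(n) if i % k != fold]
        params = fit([xs[i] for i in train], [ys[i] for i in train])
        mse = sum((ys[i] - predict(params, xs[i])) ** 2 for i in test) / len(test)
        fold_rmse.append(mse ** 0.5)
    return sum(fold_rmse) / k


# Synthetic baseline: usage depends linearly on degree days.
hdd = list(range(10))
usage = [3 + 2 * h for h in hdd]
rmse_mean = cross_val_rmse(hdd, usage, fit_mean, predict_mean)
rmse_line = cross_val_rmse(hdd, usage, fit_line, predict_line)
print(rmse_mean, rmse_line)
```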
Final Model
Cross validation would be too slow for 1M buildings
Cross validation
● Useful in development
● Untenable in final product → computational time
Can we approximate CV?
● Yes! Selection criterion
● Selection Criterion = SSE + penalty
● But it's meant to reduce master model
Final Model
Create a selection criterion function to select splits
What is the best penalization for model complexity?
● Selection Criterion = SSE + penalty (Modified Bayesian Information Criterion)
What are the best parameters?
● Based on 10-fold cross validation RMSE (predictive)
● 6000 meters (4000 res gas, 1000 res elec, 1000 comm elec)
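The modification to the Bayesian Information Criterion is not given on the slide. As an assumption, the sketch below uses the textbook BIC for Gaussian errors, `n * ln(SSE / n) + k * ln(n)`, to show how an "SSE + penalty" criterion trades fit against the number of parameters when deciding whether to split. The candidate names and SSE values are made up for illustration.

```python
import math


def selection_criterion(sse, n_obs, n_params):
    """Textbook BIC for Gaussian errors; lower is better. Requires sse > 0."""
    return n_obs * math.log(sse / n_obs) + n_params * math.log(n_obs)


def choose_model(candidates, n_obs):
    """candidates maps a name to (sse, n_params); returns the name with the lowest criterion."""
    return min(
        candidates,
        key=lambda name: selection_criterion(candidates[name][0], n_obs, candidates[name][1]),
    )


n_days = 365
# A small SSE gain does not justify doubling the parameter count.
small_gain = {"no_split": (5000.0, 3), "weekday_weekend_split": (4900.0, 6)}
# A large SSE gain does.
large_gain = {"no_split": (5000.0, 3), "weekday_weekend_split": (4000.0, 6)}
print(choose_model(small_gain, n_days))
print(choose_model(large_gain, n_days))
```

Unlike cross validation, the criterion needs only one fit per candidate, which matches the slide's point about computational time.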
Final Model
Improved accuracy and speed (no free lunch theorem)
Final Model
● Model Specification and Results (google: OpenEEmeter 4.0)
OpenEEmeter 4.0:
API Improvements
API Improvements
Inspired by Sklearn’s simplicity
● Sklearn manages many complex models with a simple interface
● We should do the same
Clustering API
from sklearn import cluster

cluster_algo = [
    cluster.MiniBatchKMeans(),
    cluster.AgglomerativeClustering(),
    cluster.Birch(),
    cluster.DBSCAN(),
]
for algo in cluster_algo:
    # AgglomerativeClustering and DBSCAN have no predict(), so use fit_predict()
    res = algo.fit_predict(X)

Regression API
from sklearn import linear_model

regres_algo = [
    linear_model.LinearRegression(),
    linear_model.ElasticNet(),
    linear_model.BayesianRidge(),
    linear_model.RANSACRegressor(),
]
for algo in regres_algo:
    algo.fit(X, y)
    res = algo.predict(X_new)

Completely different, but almost same API?
API Improvements
Goal is ease of use
OpenEEmeter 3.0
● Most steps copied from tutorials (user feedback)
● Different processes for daily and hourly modeling
● User sets options in function calls
● Intermediate information passed between function calls
OpenEEmeter 4.0
● Simple function calls: initialize/fit/predict
● Same function calls regardless of model
● Sensible defaults
● Intermediate information within class
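The design goals listed for 4.0 can be sketched with two toy models that share one interface: options are passed at initialization with a default, `fit` stores intermediate state on the instance, and `predict` is called the same way for every model. The class names `MeanModel` and `MedianModel` are hypothetical and are not part of eemeter.

```python
from statistics import mean, median


class _BaseModel:
    def __init__(self, settings=None):
        self.settings = settings or {}   # sensible default: no options required
        self.value_ = None               # intermediate information kept within the class

    def fit(self, usage):
        self.value_ = self._estimate(usage)
        return self                      # allows Model().fit(...).predict(...)

    def predict(self, n_days):
        if self.value_ is None:
            raise RuntimeError("call fit() before predict()")
        return [self.value_] * n_days


class MeanModel(_BaseModel):
    def _estimate(self, usage):
        return mean(usage)


class MedianModel(_BaseModel):
    def _estimate(self, usage):
        return median(usage)


baseline = [10.0, 12.0, 20.0]
for model_cls in (MeanModel, MedianModel):
    # Same calls regardless of model.
    print(model_cls.__name__, model_cls().fit(baseline).predict(2))
```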
Simplified Daily and Billing Models

OpenEEmeter 3.0 (Daily)
baseline_design_matrix = create_caltrack_daily_design_matrix(
    baseline_meter_data, temperature_data, degc
)
baseline_model = fit_caltrack_usage_per_day_model(baseline_design_matrix)
reporting_meter_data, warnings = get_reporting_data(
    meter_data, start=blackout_end_date, max_days=365
)
metered_savings_dataframe, error_bands = metered_savings(
    baseline_model,
    reporting_meter_data,
    temperature_data,
    with_disaggregated=True,
    degc=degc,
)

OpenEEmeter 3.0 (Billing)
baseline_design_matrix = create_caltrack_billing_design_matrix(
    baseline_meter_data, temperature_data, degc
)
baseline_model = fit_caltrack_usage_per_day_model(
    baseline_design_matrix,
    use_billing_presets=True,
    weights_col='n_days_kept',
)
reporting_meter_data, warnings = get_reporting_data(
    meter_data, start=blackout_end_date, max_days=365
)
metered_savings_dataframe, error_bands = metered_savings(
    baseline_model,
    reporting_meter_data,
    temperature_data,
    with_disaggregated=True,
    degc=degc,
)

OpenEEmeter 4.0 (Daily)
baseline_data = DailyBaselineData(baseline_df)
reporting_data = DailyReportingData(reporting_df)
model = DailyModel(settings=None).fit(baseline_data)
result = model.predict(reporting_data)

OpenEEmeter 4.0 (Billing)
baseline_data = BillingBaselineData(baseline_df)
reporting_data = BillingReportingData(reporting_df)
model = BillingModel(settings=None).fit(baseline_data)
result = model.predict(reporting_data)
Simplified Hourly Model

OpenEEmeter 3.0
# create a design matrix for occupancy and segmentation
preliminary_design_matrix = create_caltrack_hourly_preliminary_design_matrix(
    baseline_meter_data, temperature_data, degc
)
# build 12 monthly models - each step from now on operates on each segment
segmentation = segment_time_series(
    preliminary_design_matrix.index, "three_month_weighted"
)
# assign an occupancy status to each hour of the week (0-167)
occupancy_lookup = estimate_hour_of_week_occupancy(
    preliminary_design_matrix, segmentation=segmentation
)
# assign temperatures to bins
(
    occupied_temperature_bins,
    unoccupied_temperature_bins,
) = fit_temperature_bins(
    preliminary_design_matrix,
    segmentation=segmentation,
    occupancy_lookup=occupancy_lookup,
)
# build a design matrix for each monthly segment
segmented_design_matrices = create_caltrack_hourly_segmented_design_matrices(
    preliminary_design_matrix,
    segmentation,
    occupancy_lookup,
    occupied_temperature_bins,
    unoccupied_temperature_bins,
)
# build a CalTRACK hourly model
baseline_model = fit_caltrack_hourly_model(
    segmented_design_matrices,
    occupancy_lookup,
    occupied_temperature_bins,
    unoccupied_temperature_bins,
)
# compute metered savings for the year of the reporting period we've selected
result, error_bands = metered_savings(
    baseline_model,
    reporting_meter_data,
    temperature_data,
    with_disaggregated=True,
    degc=degc,
)

OpenEEmeter 4.0
baseline_data = HourlyBaselineData(baseline_df)
reporting_data = HourlyReportingData(reporting_df)
model = HourlyModel(settings=None).fit(baseline_data)
result = model.predict(reporting_data)
Data Class
Tracks disqualification and formats data for Model class
● Track all data sufficiency
● Unique for each model type
● Must be run to pass to Model (Can bypass in model)
● Formats data for Model class
● Violations are propagated to Model class
baseline_data = BaselineData(baseline_df)
baseline_data.disqualification
baseline_data.warnings
Disqualification -
{
    'qualified_name': 'eemeter.sufficiency_criteria.too_many_days_with_missing_data',
    'description': 'Too many days in data have missing meter data or temperature data.',
    'data': {'n_valid_days': 251, 'n_days_total': 365}
}
Warnings -
{
    'qualified_name': 'eemeter.sufficiency_criteria.missing_high_frequency_meter_data',
    'description': 'More than 50% of the high frequency Meter data is missing.',
    'data': [Timestamp('2020-02-29 00:00:00+0000', tz='UTC')]
}
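A minimal sketch of the idea behind the data class: the check runs when the object is created, and any violation is recorded on the object so a model can read it later. The class name `BaselineDataSketch`, the 90% threshold, and the `sketch.` qualified name are assumptions for illustration. They are not the eemeter implementation or its actual sufficiency rules.

```python
from dataclasses import dataclass, field


@dataclass
class BaselineDataSketch:
    daily_usage: list                       # None marks a day with missing data
    min_valid_fraction: float = 0.9         # assumed threshold, not from the webinar
    disqualification: list = field(default_factory=list, init=False)

    def __post_init__(self):
        n_total = len(self.daily_usage)
        n_valid = sum(v is not None for v in self.daily_usage)
        if n_total == 0 or n_valid / n_total < self.min_valid_fraction:
            self.disqualification.append({
                "qualified_name": "sketch.too_many_days_with_missing_data",
                "description": "Too many days in data have missing meter data.",
                "data": {"n_valid_days": n_valid, "n_days_total": n_total},
            })


# 365 days with 114 missing, mirroring the counts shown in the example output above.
usage = [1.0] * 251 + [None] * 114
baseline_data = BaselineDataSketch(usage)
print(baseline_data.disqualification)
```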
OpenEEmeter 4.0:
Conclusions
Conclusion
Model
● 84% less seasonal bias
● 95% less weekday/weekend bias
● Daily model is 2 - 10x faster
● Billing model is 100x faster
● Hyperparameters are broadly applicable
pip install eemeter
Conclusion
API
● Standard calls for all models (fit/predict)
● Data class
○ Formats data for models
○ Checks sufficiency
○ Provides disqualification reasons
People
Technical Steering Committee:
● Adam Scheer, Recurve
● McGee Young, WattCarbon
● Phil Ngo, Recurve
● Travis Sikes, Recurve
● Steve Suffian, WattCarbon
Key Contributors
● Armin Aligholian, Recurve
● Jason Chulock, Recurve
● Joydeep Nag, Recurve
● Ethan Goldman, Resilient Edge
● Matt Fawcett, Carbon Co-op
● James Fenna, Carbon Co-op
Ongoing Work: Hourly Model!
Hourly Model
● 10x faster
● Huge improvement for solar PV customers
● More flexible
● Data class
[Figure: Solar PV Customers, error improvement (%) vs. percent daily cloudiness]
https://www.caltrack.org/technical-working-group.html
Join the working group!
Questions?
Appendix
Why is there Seasonal Bias?
Seasonal error profiles: sharp features around balance point temperature
Space heating initiated at warmer outside temps in winter
Identified Issues: Computational Efficiency
Solution: Only fit components once
Secret ingredient #3: Reuse component fits
● ~40-50 possible combinations of components
● Save component fits and reuse
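Reusing component fits is a form of memoization: when several candidate combinations share a component, that component is fit once and looked up afterwards. The sketch below uses `functools.lru_cache` for the lookup. The component names and the placeholder "fit" are hypothetical, since the webinar does not list the actual components.

```python
from functools import lru_cache

fit_calls = 0  # counts how many real fits happen


@lru_cache(maxsize=None)
def fit_component(component):
    """Placeholder for an expensive sub-model fit; cached per component name."""
    global fit_calls
    fit_calls += 1
    return f"fit({component})"


def fit_combination(components):
    return [fit_component(c) for c in components]


# Hypothetical candidate combinations that share components.
combinations = [
    ("winter", "shoulder", "summer"),
    ("winter", "shoulder+summer"),
    ("winter+shoulder", "summer"),
    ("winter+shoulder+summer",),
]
results = [fit_combination(c) for c in combinations]
requested = sum(len(c) for c in combinations)
print(requested, fit_calls)  # 8 6
```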
Identified Issues: Computational Efficiency
Solution: Eliminate potential splits through overlapping clusters
Secret ingredient #4

LF Energy Webinar - Unveiling OpenEEMeter 4.0

  • 2. Antitrust Policy Notice Linux Foundation meetings involve participation by industry competitors, and it is the intention of the Linux Foundation to conduct all of its activities in accordance with applicable antitrust and competition laws. It is therefore extremely important that attendees adhere to meeting agendas, and be aware of, and not participate in, any activities that are prohibited under applicable US state, federal or foreign antitrust and competition laws. Examples of types of actions that are prohibited at Linux Foundation meetings and in connection with Linux Foundation activities are described in the Linux Foundation Antitrust Policy available at linuxfoundation.org/antitrust-policy. If you have questions about these matters, please contact your company counsel, or if you are a member of the Linux Foundation, feel free to contact Andrew Updegrove of the firm of Gesmer Updegrove LLP, which provides legal counsel to the Linux Foundation.
  • 3. ● Purpose and Brief History ● Methods Review ● Issues and Key Results ● Methods Advancements (The How) ○ Accuracy ○ Speed ● API Improvements Agenda
  • 5. Establish “weights and measures” for demand side programs Enable our industry to compete at scale, including against supply side options Remove measurement barriers to integrated programs Purpose
  • 6. … And Many Others The Work of …
  • 7. 2012/2013 “CalTRACK” methods development initiated to calibrate building software tools 2017 OpenEEmeter 3.0: - Daily Improvements - Hourly Methods OpenEEmeter Timeline 2016 OpenEEmeter 1.0: - Monthly Methods - Daily Methods - OpenEEmeter 2019 OpenEEmeter joined LF Energy as open source project 7 2024 OpenEEmeter 4.0: - New Daily Model - Vastly Improved API
  • 8. The industry is changing fast - Energy - Utility - Demand Side Program We need measurement capabilities that enable, not inhibit, modern programs Times They Are A-Changin’ 8
  • 11. Balance Point Temp Temp-Independent Linear (HDD) OpenEEmeter Daily 11
  • 12. We’ve addressed 3 main issues: ● Seasonal Bias ● Weekday/Weekend Bias ● Computational Efficiency Know Thy Enemy: OEEM 3.0 Modeling Issues 12
  • 14. 3.0 4.0 - 7.53% Bias - 0.05% Bias 3.0 vs. 4.0: Residential Gas Winter Bias 14
  • 15. 3.0 vs. 4.0 Seasonal Bias: Individual Meter 15
  • 16. 3.0 vs. 4.0 Weekend Bias: Individual Meter 16
  • 17. OpenEEmeter 3.0 = Sloooowww ● Exhaustive grid search ● 1,891 models for every meter ● 20 - 60 seconds per meter OpenEEmeter 4.0 = Efficient ● Can replicate OpenEEmeter 3.0 at ~0.5 seconds per meter (~100x faster) Computational Efficiency 17
  • 18. OpenEEmeter 4.0: How is it more accurate?
  • 19. Identified Issues: Seasonal Bias Occurs when behavior changes with seasons 19
  • 20. Identified Issues: Seasonal Bias Solution: Split seasons and fit models based on splits 20
  • 21. Identified Issues: Weekday/Weekend Bias Occurs when behavior changes based on type of day Weekend Weekday 21
  • 22. Weekend Weekday Identified Issues: Weekday/Weekend Bias Solution: Split on weekday/weekend 22
  • 23. Identified Issues: Linear Model Solution: Add smoothing between linear components 23
  • 24. Identified Issues: Ordinary Least Squares OLS fits can lead to non-predictive models 24
  • 25. Identified Issues: Ordinary Least Squares Solution: Adaptive, robust loss function to down-weight outliers Standard Deviations from Mean Loss Response 25
  • 27. Identified Issues: Computational Efficiency. Grid search is slow compared to global optimization. Grid Search: 9 evaluations. Optimization: 9 evaluations.
  • 28. Identified Issues: Computational Efficiency. Grid search is slow compared to global optimization. Grid Search: 25 evaluations, 1,891 models created. Optimization: 9 evaluations, 1.5 models created.
  • 29. Identified Issues: Computational Efficiency. Solution: Use optimization to find balance points. Secret ingredient #1: Balance Point Optimization ● Initial guess: balance points (BP) at 10% and 90% of data ● Use DIRECT global optimization method. (Figure label: Initial guesses)
  • 30. Identified Issues: Computational Efficiency. Solution: Use Elastic Net to only fit one model and penalize coefficients. Ordinary Least Squares goal ● Minimize residuals. Elastic Net goal ● Minimize residuals + coefficients.
  • 31. Identified Issues: Computational Efficiency. Solution: Use Elastic Net to only fit one model and penalize coefficients. Ordinary Least Squares goal ● Minimize residuals. Elastic Net goal ● Minimize residuals + coefficients. Secret ingredient #2
  • 32. OpenEEmeter 4.0: How do we know when to split?
  • 33. How do we choose to split or not? Strive for optimal fitting using test error. (Figure labels: No splitting, All possible splits)
  • 34. Experimental Design Considerations. Cannot assess test error from reporting period. Bad assumption: Reporting period = Baseline period. Wouldn’t it be nice if… we could exclusively use baseline data and achieve predictive testing? We need some tools!
  • 35. Experimental Design Considerations. Can assess predictive error using cross validation. Why do we need CV? ● Best model parameters. Where? → baseline period! Goal? → predictive testing. (Figure label: Average performance of all folds)
  • 36. Final Model. Cross validation would be too slow for 1M buildings. Cross validation ● Useful in development ● Untenable in final product → computational time. Can we approximate CV? ● Yes! Selection criterion ● Selection Criterion = SSE + penalty ● But it's meant to reduce master model
  • 37. Final Model. Create a selection criterion function to select splits. What is the best penalization for model complexity? ● Selection Criterion = SSE + penalty (Modified Bayesian Information Criterion). What are the best parameters? ● Based on 10-fold cross validation RMSE (predictive) ● 6000 meters (4000 residential gas, 1000 residential electric, 1000 commercial electric)
  • 38. Final Model. Improved accuracy and speed (no free lunch theorem)
  • 39. Final Model ● Model Specification and Results (google: OpenEEmeter 4.0)
  • 41. API Improvements: Inspired by Sklearn’s simplicity ● Sklearn manages many complex models with a simple interface ● We should do the same. Completely different, but almost same API?
      Clustering API:
          cluster_algo = [
              cluster.MiniBatchKMeans(),
              cluster.AgglomerativeClustering(),
              cluster.Birch(),
              cluster.DBSCAN(),
          ]
          for algo in cluster_algo:
              # fit_predict is used here because AgglomerativeClustering
              # and DBSCAN do not implement predict()
              res = algo.fit_predict(X)
      Regression API:
          regres_algo = [
              linear_model.LinearRegression(),
              linear_model.ElasticNet(),
              linear_model.BayesianRidge(),
              linear_model.RANSACRegressor(),
          ]
          for algo in regres_algo:
              algo.fit(X, y)
              res = algo.predict(X_new)
  • 42. API Improvements. Goal is ease of use. OpenEEmeter 3.0: ● Most steps copied from tutorials (user feedback) ● Different processes for daily and hourly modeling ● User sets options in function calls ● Intermediate information passed between function calls. OpenEEmeter 4.0: ● Simple function calls: initialize/fit/predict ● Same function calls regardless of model ● Sensible defaults ● Intermediate information within class
  • 43. Simplified Daily and Billing Models
      OpenEEmeter 3.0, Daily:
          baseline_design_matrix = create_caltrack_daily_design_matrix(
              baseline_meter_data, temperature_data, degc
          )
          baseline_model = fit_caltrack_usage_per_day_model(baseline_design_matrix)
          reporting_meter_data, warnings = get_reporting_data(
              meter_data, start=blackout_end_date, max_days=365
          )
          metered_savings_dataframe, error_bands = metered_savings(
              baseline_model,
              reporting_meter_data,
              temperature_data,
              with_disaggregated=True,
              degc=degc,
          )
      OpenEEmeter 3.0, Billing:
          baseline_design_matrix = create_caltrack_billing_design_matrix(
              baseline_meter_data, temperature_data, degc
          )
          baseline_model = fit_caltrack_usage_per_day_model(
              baseline_design_matrix,
              use_billing_presets=True,
              weights_col='n_days_kept',
          )
          reporting_meter_data, warnings = get_reporting_data(
              meter_data, start=blackout_end_date, max_days=365
          )
          metered_savings_dataframe, error_bands = metered_savings(
              baseline_model,
              reporting_meter_data,
              temperature_data,
              with_disaggregated=True,
              degc=degc,
          )
      OpenEEmeter 4.0, Daily:
          baseline_data = DailyBaselineData(baseline_df)
          reporting_data = DailyReportingData(reporting_df)
          model = DailyModel(settings=None).fit(baseline_data)
          result = model.predict(reporting_data)
      OpenEEmeter 4.0, Billing:
          baseline_data = BillingBaselineData(baseline_df)
          reporting_data = BillingReportingData(reporting_df)
          model = BillingModel(settings=None).fit(baseline_data)
          result = model.predict(reporting_data)
  • 44. Simplified Hourly Model
      OpenEEmeter 3.0:
          # create a design matrix for occupancy and segmentation
          preliminary_design_matrix = create_caltrack_hourly_preliminary_design_matrix(
              baseline_meter_data, temperature_data, degc
          )
          # build 12 monthly models - each step from now on operates on each segment
          segmentation = segment_time_series(
              preliminary_design_matrix.index, "three_month_weighted"
          )
          # assign an occupancy status to each hour of the week (0-167)
          occupancy_lookup = estimate_hour_of_week_occupancy(
              preliminary_design_matrix, segmentation=segmentation
          )
          # assign temperatures to bins
          (
              occupied_temperature_bins,
              unoccupied_temperature_bins,
          ) = fit_temperature_bins(
              preliminary_design_matrix,
              segmentation=segmentation,
              occupancy_lookup=occupancy_lookup,
          )
          # build a design matrix for each monthly segment
          segmented_design_matrices = create_caltrack_hourly_segmented_design_matrices(
              preliminary_design_matrix,
              segmentation,
              occupancy_lookup,
              occupied_temperature_bins,
              unoccupied_temperature_bins,
          )
          # build a CalTRACK hourly model
          baseline_model = fit_caltrack_hourly_model(
              segmented_design_matrices,
              occupancy_lookup,
              occupied_temperature_bins,
              unoccupied_temperature_bins,
          )
          # compute metered savings for the year of the reporting period we've selected
          result, error_bands = metered_savings(
              baseline_model,
              reporting_meter_data,
              temperature_data,
              with_disaggregated=True,
              degc=degc,
          )
      OpenEEmeter 4.0:
          baseline_data = HourlyBaselineData(baseline_df)
          reporting_data = HourlyReportingData(reporting_df)
          model = HourlyModel(settings=None).fit(baseline_data)
          result = model.predict(reporting_data)
  • 45. Data Class: Tracks disqualification and formats data for Model class ● Track all data sufficiency ● Unique for each model type ● Must be run to pass to Model (Can bypass in model) ● Formats data for Model class ● Violations are propagated to Model class
      baseline_data = BaselineData(baseline_df)
      baseline_data.disqualification
      baseline_data.warnings
      Disqualification:
          {
              'qualified_name': 'eemeter.sufficiency_criteria.too_many_days_with_missing_data',
              'description': 'Too many days in data have missing meter data or temperature data.',
              'data': {'n_valid_days': 251, 'n_days_total': 365},
          }
      Warnings:
          {
              'qualified_name': 'eemeter.sufficiency_criteria.missing_high_frequency_meter_data',
              'description': 'More than 50% of the high frequency Meter data is missing.',
              'data': [Timestamp('2020-02-29 00:00:00+0000', tz='UTC')],
          }
  • 47. Conclusion: Model ● 84% less seasonal bias ● 95% less weekday/weekend bias ● Daily model is 2 - 10x faster ● Billing model is 100x faster ● Hyperparameters are broadly applicable. Install: pip install eemeter
  • 48. Conclusion: API ● Standard calls for all models (fit/predict) ● Data class ○ Formats data for models ○ Checks sufficiency ○ Provides disqualification reasons. Install: pip install eemeter
  • 49. People. Technical Steering Committee: ● Adam Scheer, Recurve ● McGee Young, WattCarbon ● Phil Ngo, Recurve ● Travis Sikes, Recurve ● Steve Suffian, WattCarbon. Key Contributors: ● Armin Aligholian, Recurve ● Jason Chulock, Recurve ● Joydeep Nag, Recurve ● Ethan Goldman, Resilient Edge ● Matt Fawcett, Carbon Co-op ● James Fenna, Carbon Co-op
  • 50. Ongoing Work: Hourly Model! ● 10x faster ● Huge improvement for solar PV customers ● More flexible ● Data class. (Figure: Solar PV Customers; axes: Error Improvement %, Percent Daily Cloudiness) Join the working group! https://www.caltrack.org/technical-working-group.html
  • 53. Why is there Seasonal Bias? Seasonal error profiles: sharp features around balance point temperature. Space heating initiated at warmer outside temps in winter.
  • 54. Identified Issues: Ordinary Least Squares. Solution: Adaptive, robust loss function to down-weight outliers. (Figure axes: Standard Deviations from Mean, Loss Response)
  • 55. Identified Issues: Computational Efficiency. Solution: Only fit components once. Secret ingredient #3: Reuse component fits ● ~40-50 possible combinations of components ● Save component fits and reuse
  • 56. Identified Issues: Computational Efficiency. Solution: Eliminate potential splits through overlapping clusters. Secret ingredient #4
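
The split-selection idea from slides 36 and 37 (approximate cross validation with a criterion of the form SSE + penalty) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not OpenEEmeter's implementation: the slides do not give the exact modified-BIC formula, so the sketch assumes the classic Gaussian BIC form n·ln(SSE/n) + w·k·ln(n), it fits a constant (mean-only) model in place of the temperature-dependent daily model, and the names `sse_of_mean_model`, `selection_criterion`, `choose_split`, and `penalty_weight` are hypothetical.

```python
import math


def sse_of_mean_model(values):
    """Sum of squared errors when a constant (the mean) is fit to values."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def selection_criterion(sse, n_obs, n_params, penalty_weight=1.0):
    """BIC-style score: fit term plus complexity penalty (lower is better).

    Assumed form (classic Gaussian BIC), not the tuned OpenEEmeter criterion.
    """
    sse = max(sse, 1e-12)  # guard against log(0) on a perfect fit
    fit_term = n_obs * math.log(sse / n_obs)
    penalty = penalty_weight * n_params * math.log(n_obs)
    return fit_term + penalty


def choose_split(weekday, weekend, penalty_weight=1.0):
    """Return 'split' if separate weekday/weekend models score better."""
    n_obs = len(weekday) + len(weekend)
    # One model for all days: a single fitted parameter.
    unsplit = selection_criterion(
        sse_of_mean_model(weekday + weekend), n_obs, 1, penalty_weight
    )
    # One model per day type: two fitted parameters.
    split = selection_criterion(
        sse_of_mean_model(weekday) + sse_of_mean_model(weekend),
        n_obs,
        2,
        penalty_weight,
    )
    return "split" if split < unsplit else "no split"


# Usage differs strongly by day type, so the split pays for its extra parameter.
biased = choose_split([10.0, 10.5, 9.5, 10.2, 9.8] * 4, [20.0, 19.5, 20.5, 20.2] * 2)
# Usage is the same on all days, so the penalty rejects the split.
unbiased = choose_split([10.0, 10.5, 9.5, 10.2, 9.8] * 4, [10.1, 9.9, 10.3, 9.7] * 2)
print(biased, unbiased)
```

Raising `penalty_weight` makes splits rarer. Per the slides, the actual penalty was chosen against 10-fold cross validation RMSE on about 6000 meters, which lets the criterion stand in for cross validation at prediction time.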