OpenEEmeter 4.0:
Release Webinar
Antitrust Policy Notice
Linux Foundation meetings involve participation by industry competitors, and it is the
intention of the Linux Foundation to conduct all of its activities in accordance with
applicable antitrust and competition laws. It is therefore extremely important that
attendees adhere to meeting agendas, and be aware of, and not participate in, any
activities that are prohibited under applicable US state, federal or foreign antitrust and
competition laws.
Examples of types of actions that are prohibited at Linux Foundation meetings and in
connection with Linux Foundation activities are described in the Linux Foundation
Antitrust Policy available at linuxfoundation.org/antitrust-policy. If you have
questions about these matters, please contact your company counsel, or if you are a
member of the Linux Foundation, feel free to contact Andrew Updegrove of the firm of
Gesmer Updegrove LLP, which provides legal counsel to the Linux Foundation.
Agenda
● Purpose and Brief History
● Methods Review
● Issues and Key Results
● Methods Advancements (The How)
○ Accuracy
○ Speed
● API Improvements
OpenEEmeter:
Purpose and Brief History
Purpose
● Establish “weights and measures” for demand side programs
● Enable our industry to compete at scale, including against supply side options
● Remove measurement barriers to integrated programs
… And Many Others
The Work of …
OpenEEmeter Timeline
● 2012/2013: “CalTRACK” methods development initiated to calibrate building software tools
● 2016: OpenEEmeter 1.0
○ Monthly Methods
○ Daily Methods
○ OpenEEmeter
● 2017: OpenEEmeter 3.0
○ Daily Improvements
○ Hourly Methods
● 2019: OpenEEmeter joined LF Energy as open source project
● 2024: OpenEEmeter 4.0
○ New Daily Model
○ Vastly Improved API
Times They Are A-Changin’
The industry is changing fast
- Energy
- Utility
- Demand Side Program
We need measurement capabilities that enable, not inhibit, modern programs
OpenEEmeter 3.0:
Methods Review and Issues
OpenEEmeter Savings Calculation
[Figure: energy vs. time. Labels: baseline period, intervention, blackout period, reporting period, baseline model, counterfactual, savings.]
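The savings calculation on this slide can be sketched in a few lines. This is only an illustration, not the OpenEEmeter implementation: `baseline_model` and `compute_savings` are hypothetical names, the model coefficients are invented, and the temperatures and meter readings are made-up values. The idea shown is that the baseline model is evaluated on reporting-period weather to produce the counterfactual, and savings are the counterfactual minus the observed usage.

```python
def baseline_model(temp_f):
    """Hypothetical fitted baseline: daily usage as a function of temperature."""
    base_load = 10.0       # temperature-independent usage per day (assumed)
    heating_slope = 0.5    # extra usage per degree below the balance point (assumed)
    balance_point = 60.0   # assumed balance point temperature
    return base_load + heating_slope * max(balance_point - temp_f, 0.0)


def compute_savings(reporting_temps, reporting_usage):
    """Return the counterfactual and the daily savings (counterfactual - observed)."""
    counterfactual = [baseline_model(t) for t in reporting_temps]
    daily = [c - u for c, u in zip(counterfactual, reporting_usage)]
    return counterfactual, daily


# Made-up reporting-period data (after the intervention and blackout period).
reporting_temps = [40.0, 50.0, 70.0]
reporting_usage = [18.0, 13.0, 9.0]

counterfactual, daily_savings = compute_savings(reporting_temps, reporting_usage)
total_savings = sum(daily_savings)
```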
OpenEEmeter Daily
[Figure: daily model with a temperature-independent segment and a linear (HDD) segment that meet at the balance point temperature.]
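A heating-only version of the daily model in the figure can be written as a flat segment plus a linear heating-degree-day (HDD) term. The sketch below is an assumption-laden illustration: `daily_usage` and its parameter names are hypothetical, the numbers are invented, and the real model also handles cooling.

```python
def daily_usage(temp_f, base_load, heating_slope, balance_point):
    """Piecewise-linear daily usage: flat above the balance point, linear in HDD below."""
    hdd = max(balance_point - temp_f, 0.0)  # heating degree days for one day
    return base_load + heating_slope * hdd


# Invented parameters for illustration only.
params = dict(base_load=8.0, heating_slope=0.25, balance_point=65.0)

warm_day = daily_usage(70.0, **params)   # above the balance point: base load only
cold_day = daily_usage(45.0, **params)   # 20 HDD below the balance point
```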
Know Thy Enemy: OEEM 3.0 Modeling Issues
We’ve addressed 3 main issues:
● Seasonal Bias
● Weekday/Weekend Bias
● Computational Efficiency
Seasonal Bias Distribution
3.0 vs. 4.0: Residential Gas Winter Bias
● 3.0: - 7.53% Bias
● 4.0: - 0.05% Bias
3.0 vs. 4.0 Seasonal Bias: Individual Meter
3.0 vs. 4.0 Weekend Bias: Individual Meter
Computational Efficiency
OpenEEmeter 3.0 = Sloooowww
● Exhaustive grid search
● 1,891 models for every meter
● 20-60 seconds per meter
OpenEEmeter 4.0 = Efficient
● Can replicate OpenEEmeter 3.0 at ~0.5 seconds per meter (~100x faster)
OpenEEmeter 4.0:
How is it more accurate?
Identified Issues: Seasonal Bias
Occurs when behavior changes with seasons
Solution: Split seasons and fit models based on splits
Identified Issues: Weekday/Weekend Bias
Occurs when behavior changes based on type of day
Solution: Split on weekday/weekend
[Figure panels labeled Weekday and Weekend.]
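Why splitting removes this kind of bias can be shown with a toy example. The sketch below uses made-up usage values and a deliberately trivial "model" (a mean level), so it illustrates the idea only and says nothing about how OpenEEmeter 4.0 implements its splits. The same reasoning applies to seasonal splits.

```python
from statistics import mean

# Two weeks of made-up daily usage: weekdays use 10, weekends use 14.
is_weekend = [day % 7 in (5, 6) for day in range(14)]
usage = [14.0 if weekend else 10.0 for weekend in is_weekend]


def bias(predictions, observations):
    """Mean of (predicted - observed); zero means unbiased."""
    return mean(p - o for p, o in zip(predictions, observations))


weekend_obs = [u for u, w in zip(usage, is_weekend) if w]

# One model for all days: a single mean level under-predicts weekends.
single_level = mean(usage)
single_weekend_bias = bias([single_level] * len(weekend_obs), weekend_obs)

# Split model: one mean level per day type removes the weekend bias.
levels = {
    key: mean(u for u, w in zip(usage, is_weekend) if w == key)
    for key in (True, False)
}
split_weekend_bias = bias([levels[True]] * len(weekend_obs), weekend_obs)
```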
Identified Issues: Linear Model
Solution: Add smoothing between linear components
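One way to picture smoothing between linear components is to replace the sharp hinge at the balance point with a rounded one. The slides do not give the functional form used in OpenEEmeter 4.0, so the softplus below is purely an assumed stand-in, and `smooth_hinge` and `width` are hypothetical names.

```python
import math


def hinge(temp_f, balance_point):
    """Sharp heating term: zero above the balance point, linear below."""
    return max(balance_point - temp_f, 0.0)


def smooth_hinge(temp_f, balance_point, width):
    """Softplus-smoothed heating term; `width` sets how gradual the transition is."""
    z = (balance_point - temp_f) / width
    # Numerically stable softplus: log(1 + e^z) = max(z, 0) + log1p(e^-|z|)
    return width * (max(z, 0.0) + math.log1p(math.exp(-abs(z))))


far_below = smooth_hinge(30.0, 60.0, width=2.0)   # approaches the sharp hinge (30)
at_balance = smooth_hinge(60.0, 60.0, width=2.0)  # rounded corner instead of a kink
far_above = smooth_hinge(90.0, 60.0, width=2.0)   # approaches zero
```

Away from the balance point the smoothed term matches the sharp hinge closely; only the corner is rounded.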
Identified Issues: Ordinary Least Squares
OLS fits can lead to non-predictive models
Solution: Adaptive, robust loss function to down-weight outliers
[Figure: loss response vs. standard deviations from mean.]
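The effect of down-weighting outliers can be seen on a single location estimate. The sketch below uses a fixed-threshold Huber weight with iteratively reweighted least squares; this is a standard robust technique chosen for illustration, not the adaptive loss used in OpenEEmeter 4.0, whose exact form the slides do not give. All data values are made up.

```python
from statistics import mean, median


def huber_weight(residual, scale, threshold=1.345):
    """Weight 1 for small residuals; weight shrinks as 1/|r| beyond the threshold."""
    r = abs(residual) / scale
    return 1.0 if r <= threshold else threshold / r


def robust_level(values, iterations=50):
    """Huber M-estimate of a constant level via iteratively reweighted least squares."""
    center = median(values)
    # Robust scale from the median absolute deviation (fall back to 1.0 if it is 0).
    scale = 1.4826 * median(abs(v - center) for v in values) or 1.0
    for _ in range(iterations):
        weights = [huber_weight(v - center, scale) for v in values]
        center = sum(w * v for w, v in zip(weights, values)) / sum(weights)
    return center


usage = [10.0, 11.0, 9.0, 10.5, 9.5, 40.0]  # one outlier day
ols_level = mean(usage)            # least squares: pulled toward the outlier
robust = robust_level(usage)       # outlier is down-weighted
```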
OpenEEmeter 4.0:
How is it faster?
Identified Issues: Computational Efficiency
Grid search is slow compared to global optimization
● Grid search: 9 evaluations; optimization: 9 evaluations
● Grid search: 25 evaluations; optimization: 9 evaluations
● Grid search: 1891 models created; optimization: 1.5 models created
Identified Issues: Computational Efficiency
Solution: Use optimization to find balance points
Secret ingredient #1: Balance Point Optimization
● Initial guess: BP at 10% and 90% of data
● Use DIRECT global optimization method
[Figure label: Initial guesses]
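The gap in evaluation counts between grid search and an optimizer can be reproduced on synthetic data. The sketch below is not the OpenEEmeter code: it swaps in a plain golden-section search as a simple stand-in for DIRECT, reads "10% and 90% of data" as the 10th and 90th temperature percentiles (an assumption), and uses invented data with a true balance point of 62.

```python
import math

temps = [float(t) for t in range(30, 91)]
usage = [5.0 + 0.4 * max(62.0 - t, 0.0) for t in temps]  # synthetic, true BP = 62


def fit_sse(balance_point):
    """SSE of a least-squares fit of usage on HDD for one candidate balance point."""
    hdd = [max(balance_point - t, 0.0) for t in temps]
    n = len(temps)
    mx, my = sum(hdd) / n, sum(usage) / n
    sxx = sum((x - mx) ** 2 for x in hdd)
    sxy = sum((x - mx) * (y - my) for x, y in zip(hdd, usage))
    slope = sxy / sxx if sxx > 0 else 0.0
    intercept = my - slope * mx
    return sum((y - intercept - slope * x) ** 2 for x, y in zip(hdd, usage))


calls = {"grid": 0, "opt": 0}


def counted(name):
    def objective(bp):
        calls[name] += 1
        return fit_sse(bp)
    return objective


def golden_section(f, lo, hi, tol=0.1):
    """Golden-section search on [lo, hi]; one new evaluation per iteration."""
    inv_phi = (math.sqrt(5) - 1) / 2
    c, d = hi - inv_phi * (hi - lo), lo + inv_phi * (hi - lo)
    fc, fd = f(c), f(d)
    while hi - lo > tol:
        if fc < fd:
            hi, d, fd = d, c, fc
            c = hi - inv_phi * (hi - lo)
            fc = f(c)
        else:
            lo, c, fc = c, d, fd
            d = lo + inv_phi * (hi - lo)
            fd = f(d)
    return (lo + hi) / 2


# Exhaustive grid: one fit per integer balance point.
grid_best = min(range(30, 91), key=counted("grid"))

# Optimizer: bracket from the 10th and 90th temperature percentiles.
ordered = sorted(temps)
lo = ordered[round(0.10 * (len(ordered) - 1))]
hi = ordered[round(0.90 * (len(ordered) - 1))]
opt_best = golden_section(counted("opt"), lo, hi)
```

Golden-section search assumes a single minimum inside the bracket, which holds for this synthetic data; a global method such as DIRECT does not need that assumption.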
Identified Issues: Computational Efficiency
Solution: Use Elastic Net to only fit one model and penalize coefficients
Secret ingredient #2
Ordinary Least Squares goal
● Minimize residuals
Elastic Net goal
● Minimize residuals + coefficients
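The two goals on this slide can be written as objective functions. The sketch below uses a generic textbook form of the elastic net penalty (a mix of L1 and L2 terms); scaling conventions differ between libraries, and `alpha` and `l1_ratio` are illustrative parameter names rather than OpenEEmeter settings.

```python
def ols_objective(residuals):
    """Ordinary least squares: sum of squared residuals only."""
    return sum(r * r for r in residuals)


def elastic_net_objective(residuals, coefficients, alpha=1.0, l1_ratio=0.5):
    """Squared residuals plus a penalty on coefficient size (L1 and L2 mix)."""
    l1 = sum(abs(c) for c in coefficients)
    l2 = sum(c * c for c in coefficients)
    return ols_objective(residuals) + alpha * (l1_ratio * l1 + (1 - l1_ratio) * l2)


# Two hypothetical fits with the same residuals; the second carries a large,
# unneeded coefficient and is penalized for it.
residuals = [1.0, -1.0]
lean = elastic_net_objective(residuals, coefficients=[2.0, 0.0])
bloated = elastic_net_objective(residuals, coefficients=[2.0, 3.0])
```

Because unneeded coefficients are pushed toward zero, a single penalized fit can stand in for comparing many separately fitted candidate models.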
OpenEEmeter 4.0:
How do we know when to split?
How do we choose to split or not?
Strive for optimal fitting using test error
[Figure: range of models from no splitting to all possible splits.]
Experimental Design Considerations
Cannot assess test error from reporting period
Bad Assumption
● Reporting period = Baseline period
Wouldn’t it be nice if…
● We could exclusively use baseline data and achieve predictive testing
We need some tools!
Experimental Design Considerations
Can assess predictive error using cross validation
Why do we need CV?
● Best model parameters
Where? → baseline period!
Goal? → predictive testing
[Figure: average performance of all folds.]
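Cross validation on baseline data alone can be sketched as follows. This is a generic k-fold loop with a simple linear fit on made-up data, included only to show how predictive error is averaged over folds; the fold assignment and model are assumptions, not the procedure used in the OpenEEmeter study.

```python
import math


def fit_line(xs, ys):
    """Least-squares intercept and slope for y = a + b * x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return my - slope * mx, slope


def k_fold_rmse(xs, ys, k):
    """Average held-out RMSE over k interleaved folds."""
    n = len(xs)
    fold_rmse = []
    for fold in range(k):
        test_idx = set(range(fold, n, k))
        train = [(x, y) for i, (x, y) in enumerate(zip(xs, ys)) if i not in test_idx]
        test = [(x, y) for i, (x, y) in enumerate(zip(xs, ys)) if i in test_idx]
        intercept, slope = fit_line([x for x, _ in train], [y for _, y in train])
        errors = [(y - (intercept + slope * x)) ** 2 for x, y in test]
        fold_rmse.append(math.sqrt(sum(errors) / len(errors)))
    return sum(fold_rmse) / k


# Made-up baseline data: linear in HDD with a deterministic +/-0.5 wobble.
hdd = [float(i) for i in range(20)]
usage = [5.0 + 0.4 * x + (0.5 if i % 2 else -0.5) for i, x in enumerate(hdd)]

cv_rmse = k_fold_rmse(hdd, usage, k=5)
```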
Final Model
Cross validation would be too slow for 1M buildings
Cross validation
● Useful in development
● Untenable in final product → computational time
Can we approximate CV?
● Yes! Selection criterion
● Selection Criterion = SSE + penalty
● But it's meant to reduce master model
Final Model
Create a selection criterion function to select splits
What is the best penalization for model complexity?
● Selection Criterion = SSE + penalty (Modified Bayesian Information Criterion)
What are best parameters?
● Based on 10-fold cross validation RMSE (predictive)
● 6000 meters (4000 res gas, 1000 res elec, 1000 comm elec)
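A selection criterion of the "fit error plus complexity penalty" kind can be sketched with the textbook Bayesian Information Criterion for Gaussian errors. The slides do not spell out the modified form used in OpenEEmeter 4.0, so the function below is the standard BIC with a hypothetical `penalty_scale` knob, and the SSE values are invented.

```python
import math


def selection_criterion(sse, n_obs, n_params, penalty_scale=1.0):
    """Textbook BIC for Gaussian errors: n*ln(SSE/n) + k*ln(n). Lower is better."""
    return n_obs * math.log(sse / n_obs) + penalty_scale * n_params * math.log(n_obs)


def prefer_split(sse_single, sse_split, n_obs, k_single=3, k_split=6):
    """Accept the split only if its better fit outweighs the extra parameters."""
    single = selection_criterion(sse_single, n_obs, k_single)
    split = selection_criterion(sse_split, n_obs, k_split)
    return split < single


# One year of daily data; SSE values are made up for illustration.
big_gain = prefer_split(sse_single=120.0, sse_split=110.0, n_obs=365)
small_gain = prefer_split(sse_single=120.0, sse_split=118.0, n_obs=365)
```

Unlike cross validation, the criterion needs only one fit per candidate, which is what makes it cheap enough to run across very large numbers of meters.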
Final Model
Improved accuracy and speed (no free lunch theorem)
Final Model
● Model Specification and Results (google: OpenEEmeter 4.0)
OpenEEmeter 4.0:
API Improvements
API Improvements
Inspired by Sklearn’s simplicity
● Sklearn manages many complex models with a simple interface
● We should do the same
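Clustering API

from sklearn import cluster

# X and X_new are user-supplied feature arrays.
# Every estimator in this list implements predict(); AgglomerativeClustering
# and DBSCAN expose fit_predict() only, so they are not used in this loop.
cluster_algo = [
    cluster.MiniBatchKMeans(),
    cluster.KMeans(),
    cluster.Birch(),
    cluster.MeanShift(),
]
for algo in cluster_algo:
    algo.fit(X)
    res = algo.predict(X_new)

Regression API

from sklearn import linear_model

# X, y, and X_new are user-supplied arrays.
regres_algo = [
    linear_model.LinearRegression(),
    linear_model.ElasticNet(),
    linear_model.BayesianRidge(),
    linear_model.RANSACRegressor(),
]
for algo in regres_algo:
    algo.fit(X, y)
    res = algo.predict(X_new)

Completely different, but almost same API?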
API Improvements
Goal is ease of use
OpenEEmeter 3.0
● Most steps copied from tutorials (user feedback)
● Different processes for daily and hourly modeling
● User sets options in function calls
● Intermediate information passed between function calls
OpenEEmeter 4.0
● Simple function calls: initialize/fit/predict
● Same function calls regardless of model
● Sensible defaults
● Intermediate information within class
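The 4.0 design goals (initialize/fit/predict, sensible defaults, intermediate information kept inside the class) can be illustrated with a toy model. `MeanUsageModel` and its `min_days` setting are hypothetical names invented for this sketch; it is not part of eemeter.

```python
class MeanUsageModel:
    """Toy model with an initialize/fit/predict interface."""

    def __init__(self, settings=None):
        # Sensible defaults: the user only passes settings to override them.
        self.settings = {"min_days": 3}
        if settings:
            self.settings.update(settings)
        self.is_fitted = False

    def fit(self, daily_usage):
        if len(daily_usage) < self.settings["min_days"]:
            raise ValueError("not enough baseline days to fit")
        # Intermediate information lives on the instance, not in return values.
        self.level_ = sum(daily_usage) / len(daily_usage)
        self.is_fitted = True
        return self  # returning self allows Model().fit(...).predict(...)

    def predict(self, n_days):
        if not self.is_fitted:
            raise RuntimeError("call fit() before predict()")
        return [self.level_] * n_days


prediction = MeanUsageModel(settings=None).fit([1.0, 2.0, 3.0]).predict(2)
```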
Simplified Daily and Billing Models

OpenEEmeter 3.0, Daily

# build the baseline design matrix and fit the daily model
baseline_design_matrix = create_caltrack_daily_design_matrix(
    baseline_meter_data, temperature_data, degc
)
baseline_model = fit_caltrack_usage_per_day_model(baseline_design_matrix)
# select the reporting period, then compute savings against the baseline
reporting_meter_data, warnings = get_reporting_data(
    meter_data, start=blackout_end_date, max_days=365
)
metered_savings_dataframe, error_bands = metered_savings(
    baseline_model,
    reporting_meter_data,
    temperature_data,
    with_disaggregated=True,
    degc=degc,
)

OpenEEmeter 3.0, Billing

# billing data needs its own design matrix and extra fit options
baseline_design_matrix = create_caltrack_billing_design_matrix(
    baseline_meter_data, temperature_data, degc
)
baseline_model = fit_caltrack_usage_per_day_model(
    baseline_design_matrix,
    use_billing_presets=True,
    weights_col='n_days_kept',
)
reporting_meter_data, warnings = get_reporting_data(
    meter_data, start=blackout_end_date, max_days=365
)
metered_savings_dataframe, error_bands = metered_savings(
    baseline_model,
    reporting_meter_data,
    temperature_data,
    with_disaggregated=True,
    degc=degc,
)

OpenEEmeter 4.0, Daily

baseline_data = DailyBaselineData(baseline_df)
reporting_data = DailyReportingData(reporting_df)
model = DailyModel(settings=None).fit(baseline_data)
result = model.predict(reporting_data)

OpenEEmeter 4.0, Billing

baseline_data = BillingBaselineData(baseline_df)
reporting_data = BillingReportingData(reporting_df)
model = BillingModel(settings=None).fit(baseline_data)
result = model.predict(reporting_data)
Simplified Hourly Model

OpenEEmeter 3.0

# create a design matrix for occupancy and segmentation
preliminary_design_matrix = create_caltrack_hourly_preliminary_design_matrix(
    baseline_meter_data, temperature_data, degc
)
# build 12 monthly models - each step from now on operates on each segment
segmentation = segment_time_series(
    preliminary_design_matrix.index, "three_month_weighted"
)
# assign an occupancy status to each hour of the week (0-167)
occupancy_lookup = estimate_hour_of_week_occupancy(
    preliminary_design_matrix, segmentation=segmentation
)
# assign temperatures to bins
(
    occupied_temperature_bins,
    unoccupied_temperature_bins,
) = fit_temperature_bins(
    preliminary_design_matrix,
    segmentation=segmentation,
    occupancy_lookup=occupancy_lookup,
)
# build a design matrix for each monthly segment
segmented_design_matrices = create_caltrack_hourly_segmented_design_matrices(
    preliminary_design_matrix,
    segmentation,
    occupancy_lookup,
    occupied_temperature_bins,
    unoccupied_temperature_bins,
)
# build a CalTRACK hourly model
baseline_model = fit_caltrack_hourly_model(
    segmented_design_matrices,
    occupancy_lookup,
    occupied_temperature_bins,
    unoccupied_temperature_bins,
)
# compute metered savings for the year of the reporting period we've selected
result, error_bands = metered_savings(
    baseline_model,
    reporting_meter_data,
    temperature_data,
    with_disaggregated=True,
    degc=degc,
)

OpenEEmeter 4.0

baseline_data = HourlyBaselineData(baseline_df)
reporting_data = HourlyReportingData(reporting_df)
model = HourlyModel(settings=None).fit(baseline_data)
result = model.predict(reporting_data)
Data Class
Tracks disqualification and formats data for Model class
● Track all data sufficiency
● Unique for each model type
● Must be run to pass to Model (Can bypass in model)
● Formats data for Model class
● Violations are propagated to Model class

baseline_data = BaselineData(baseline_df)
baseline_data.disqualification
baseline_data.warnings

Disqualification -
{
    'qualified_name': 'eemeter.sufficiency_criteria.too_many_days_with_missing_data',
    'description': 'Too many days in data have missing meter data or temperature data.',
    'data': {'n_valid_days': 251, 'n_days_total': 365}
}
Warnings -
{
    'qualified_name': 'eemeter.sufficiency_criteria.missing_high_frequency_meter_data',
    'description': 'More than 50% of the high frequency Meter data is missing.',
    'data': [Timestamp('2020-02-29 00:00:00+0000', tz='UTC')]
}
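How a data class might check sufficiency at construction time and carry the outcome along can be sketched as below. Everything here is hypothetical: `BaselineDataSketch`, the 10% missing-data threshold, and the `sketch.` qualified name are invented for illustration and do not reflect eemeter's actual rules or class layout.

```python
class BaselineDataSketch:
    """Toy data class: formats input and records sufficiency violations."""

    def __init__(self, daily_usage, max_missing_fraction=0.1):
        n_total = len(daily_usage)
        n_valid = sum(value is not None for value in daily_usage)
        self.disqualification = []
        self.warnings = []
        # Hypothetical rule: too many missing days disqualifies the meter.
        if n_total and (n_total - n_valid) / n_total > max_missing_fraction:
            self.disqualification.append(
                {
                    "qualified_name": "sketch.too_many_days_with_missing_data",
                    "description": "Too many days have missing meter data.",
                    "data": {"n_valid_days": n_valid, "n_days_total": n_total},
                }
            )
        # Formatted data handed on to the model: valid readings only.
        self.values = [value for value in daily_usage if value is not None]

    @property
    def is_qualified(self):
        return not self.disqualification


good = BaselineDataSketch([1.0] * 10)
bad = BaselineDataSketch([1.0] * 7 + [None] * 3)
```

A model's `fit` could then refuse disqualified data by default, which is one way violations found in the data class can reach the model.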
OpenEEmeter 4.0:
Conclusions
Conclusion
Model
● 84% less seasonal bias
● 95% less weekday/weekend bias
● Daily model is 2 - 10x faster
● Billing model is 100x faster
● Hyperparameters are broadly applicable
pip install eemeter
Conclusion
API
● Standard calls for all models (fit/predict)
● Data class
○ Formats data for models
○ Checks sufficiency
○ Provides disqualification reasons
People
Technical Steering Committee:
● Adam Scheer, Recurve
● McGee Young, WattCarbon
● Phil Ngo, Recurve
● Travis Sikes, Recurve
● Steve Suffian, WattCarbon
Key Contributors:
● Armin Aligholian, Recurve
● Jason Chulock, Recurve
● Joydeep Nag, Recurve
● Ethan Goldman, Resilient Edge
● Matt Fawcett, Carbon Co-op
● James Fenna, Carbon Co-op
Ongoing Work: Hourly Model!
Hourly Model
● 10x faster
● Huge improvement for solar PV customers
● More flexible
● Data class
[Figure: Solar PV Customers. Error improvement (%) vs. percent daily cloudiness.]
https://www.caltrack.org/technical-working-group.html
Join the working group!
Questions?
Appendix
Why is there Seasonal Bias?
● Seasonal error profiles: sharp features around balance point temperature
● Space heating initiated at warmer outside temps in winter
Identified Issues: Ordinary Least Squares
Solution: Adaptive, robust loss function to down-weight outliers
[Figure: loss response vs. standard deviations from mean.]
Identified Issues: Computational Efficiency
Solution: Only fit components once
Secret ingredient #3: Reuse component fits
● ~40-50 possible combinations of components
● Save component fits and reuse
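Reusing component fits is a caching problem: many combinations share the same components, so each component only needs to be fitted once. The sketch below uses `functools.lru_cache` with invented component names and a placeholder "fit"; it illustrates the idea and is not the OpenEEmeter implementation.

```python
from functools import lru_cache
from itertools import combinations

fit_calls = 0  # counts how many real fits are performed


@lru_cache(maxsize=None)
def fit_component(component):
    """Placeholder for an expensive fit; cached so each component is fitted once."""
    global fit_calls
    fit_calls += 1
    return len(component)  # stand-in for fitted coefficients


# Hypothetical components and every combination of one to three of them.
components = ("winter", "shoulder", "summer", "weekday", "weekend")
combos = [c for size in (1, 2, 3) for c in combinations(components, size)]

requests = 0
for combo in combos:
    for component in combo:
        fit_component(component)  # served from the cache after the first call
        requests += 1
```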
Identified Issues: Computational Efficiency
Solution: Eliminate potential splits through overlapping clusters
Secret ingredient #4

LF Energy Webinar - Unveiling OpenEEMeter 4.0

  • 2. Antitrust Policy Notice Linux Foundation meetings involve participation by industry competitors, and it is the intention of the Linux Foundation to conduct all of its activities in accordance with applicable antitrust and competition laws. It is therefore extremely important that attendees adhere to meeting agendas, and be aware of, and not participate in, any activities that are prohibited under applicable US state, federal or foreign antitrust and competition laws. Examples of types of actions that are prohibited at Linux Foundation meetings and in connection with Linux Foundation activities are described in the Linux Foundation Antitrust Policy available at linuxfoundation.org/antitrust-policy. If you have questions about these matters, please contact your company counsel, or if you are a member of the Linux Foundation, feel free to contact Andrew Updegrove of the firm of Gesmer Updegrove LLP, which provides legal counsel to the Linux Foundation.
  • 3. ● Purpose and Brief History ● Methods Review ● Issues and Key Results ● Methods Advancements (The How) ○ Accuracy ○ Speed ● API Improvements Agenda
  • 5. Establish “weights and measures” for demand side programs Enable our industry to compete at scale, including against supply side options Remove measurement barriers to integrated programs Purpose
  • 6. … And Many Others The Work of …
  • 7. OpenEEmeter Timeline ● 2012/2013: “CalTRACK” methods development initiated to calibrate building software tools ● 2016: OpenEEmeter 1.0 (Monthly Methods, Daily Methods, OpenEEmeter) ● 2017: OpenEEmeter 3.0 (Daily Improvements, Hourly Methods) ● 2019: OpenEEmeter joined LF Energy as open source project ● 2024: OpenEEmeter 4.0 (New Daily Model, Vastly Improved API)
  • 8. The industry is changing fast - Energy - Utility - Demand Side Program We need measurement capabilities that enable, not inhibit, modern programs Times They Are A-Changin’ 8
  • 11. Balance Point Temp Temp-Independent Linear (HDD) OpenEEmeter Daily 11
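As a rough illustration of the three components named on this slide, the sketch below evaluates a heating-only daily model: a temperature-independent baseload plus a term that is linear in heating degree days (HDD) below a balance point. The function name, parameter names, and numbers are illustrative assumptions, not OpenEEmeter code.

```python
def predict_daily_usage(temp, balance_point, baseload, heating_slope):
    """Heating-only degree-day model: baseload + slope * HDD."""
    # Heating degree days: how far the daily mean temperature sits below
    # the balance point (zero when the day is warmer than the balance point).
    hdd = max(balance_point - temp, 0.0)
    return baseload + heating_slope * hdd


# Hypothetical parameters: balance point 60, baseload 1.0 units/day,
# 0.1 units per heating degree day.
mild_day = predict_daily_usage(70.0, 60.0, 1.0, 0.1)  # only baseload applies
cold_day = predict_daily_usage(40.0, 60.0, 1.0, 0.1)  # baseload + 20 HDD
```

Above the balance point the prediction stays flat at the baseload; below it, usage grows linearly with how cold the day is.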
  • 12. We’ve addressed 3 main issues: ● Seasonal Bias ● Weekday/Weekend Bias ● Computational Efficiency Know Thy Enemy: OEEM 3.0 Modeling Issues 12
  • 14. 3.0 vs. 4.0: Residential Gas Winter Bias ● 3.0: - 7.53% Bias ● 4.0: - 0.05% Bias
  • 15. 3.0 vs. 4.0 Seasonal Bias: Individual Meter 15
  • 16. 3.0 vs. 4.0 Weekend Bias: Individual Meter 16
  • 17. OpenEEmeter 3.0 = Sloooowww ● Exhaustive grid search ● 1,891 models for every meter ● 20 - 60 seconds per meter OpenEEmeter 4.0 = Efficient ● Can replicate OpenEEmeter 3.0 at ~0.5 seconds per meter (~100x faster) Computational Efficiency 17
  • 18. OpenEEmeter 4.0: How is it more accurate?
  • 19. Identified Issues: Seasonal Bias Occurs when behavior changes with seasons 19
  • 20. Identified Issues: Seasonal Bias Solution: Split seasons and fit models based on splits 20
  • 21. Identified Issues: Weekday/Weekend Bias Occurs when behavior changes based on type of day Weekend Weekday 21
  • 22. Weekend Weekday Identified Issues: Weekday/Weekend Bias Solution: Split on weekday/weekend 22
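To make the splitting idea concrete, here is a minimal sketch with made-up data in which weekend usage is higher than weekday usage. A single pooled average under-predicts weekends, while one average per day type removes that bias. The same pattern would apply to seasonal splits. The data and helper names are assumptions for illustration only.

```python
from datetime import date, timedelta
from statistics import mean

# Synthetic two weeks of daily usage: higher on weekends (made-up numbers).
start = date(2024, 1, 1)  # a Monday
days = [start + timedelta(days=i) for i in range(14)]


def is_weekend(d):
    return d.weekday() >= 5  # Saturday=5, Sunday=6


usage = [12.0 if is_weekend(d) else 10.0 for d in days]

# Pooled model: one mean for every day.
pooled = mean(usage)

# Split model: one mean per day type.
split = {}
for flag in (True, False):
    split[flag] = mean(u for d, u in zip(days, usage) if is_weekend(d) == flag)

# Mean error (prediction - observation) on weekend days only.
weekend_obs = [u for d, u in zip(days, usage) if is_weekend(d)]
pooled_weekend_bias = mean(pooled - u for u in weekend_obs)
split_weekend_bias = mean(split[True] - u for u in weekend_obs)
```

Here the pooled model is biased low on weekends, and the split model is unbiased by construction.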
  • 23. Identified Issues: Linear Model Solution: Add smoothing between linear components 23
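The slide does not say which smoothing function is used. As one assumed possibility, the sketch below replaces the sharp kink of a degree-day term with a softplus curve, which matches the linear pieces far from the balance point and bends gradually near it. The `width` parameter is hypothetical.

```python
import math


def hard_hdd(temp, balance_point):
    """Piecewise-linear heating term with a sharp kink at the balance point."""
    return max(balance_point - temp, 0.0)


def smooth_hdd(temp, balance_point, width=2.0):
    """Softplus version: approaches hard_hdd far from the balance point,
    but bends gradually over roughly `width` degrees around it."""
    z = (balance_point - temp) / width
    # Numerically stable softplus: log(1 + exp(z))
    return width * (max(z, 0.0) + math.log1p(math.exp(-abs(z))))


at_kink = smooth_hdd(60.0, 60.0)   # nonzero: the kink is rounded off
far_cold = smooth_hdd(20.0, 60.0)  # very close to hard_hdd(20.0, 60.0)
```

A smooth transition like this lets a model represent heating that ramps up gradually instead of switching on at a single temperature.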
  • 24. Identified Issues: Ordinary Least Squares OLS fits can lead to non-predictive models 24
  • 25. Identified Issues: Ordinary Least Squares Solution: Adaptive, robust loss function to down-weight outliers Standard Deviations from Mean Loss Response 25
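The slide does not specify the adaptive loss itself. The sketch below swaps in a well-known, non-adaptive alternative (Huber weights applied through iteratively reweighted least squares) purely to show what down-weighting outliers does to a line fit. All data and names are made up.

```python
def weighted_line_fit(x, y, w):
    """Closed-form weighted least squares for y = a + b*x."""
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    b = sxy / sxx
    return my - b * mx, b


def huber_line_fit(x, y, delta=1.0, iterations=20):
    """Iteratively reweighted least squares with Huber weights."""
    w = [1.0] * len(x)
    for _ in range(iterations):
        a, b = weighted_line_fit(x, y, w)
        resid = [yi - (a + b * xi) for xi, yi in zip(x, y)]
        # Full weight for small residuals, shrinking weight for large ones.
        w = [1.0 if abs(r) <= delta else delta / abs(r) for r in resid]
    return weighted_line_fit(x, y, w)


# Made-up data on the line y = 2 + 0.5*x, with one gross outlier.
x = [float(i) for i in range(10)]
y = [2.0 + 0.5 * xi for xi in x]
y[9] = 30.0  # outlier (the value on the line would be 6.5)

ols_intercept, ols_slope = weighted_line_fit(x, y, [1.0] * len(x))
robust_intercept, robust_slope = huber_line_fit(x, y)
```

The ordinary least squares slope is pulled strongly toward the outlier, while the reweighted fit stays much closer to the underlying slope of 0.5.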
  • 27. Identified Issues: Computational Efficiency Grid search is slow compared to global optimization Grid Search 9 evaluations Optimization 9 evaluations 27
  • 28. Identified Issues: Computational Efficiency Grid search is slow compared to global optimization ● Grid Search: 25 evaluations, 1891 models created ● Optimization: 9 evaluations, 1.5 models created
  • 29. Identified Issues: Computational Efficiency Solution: Use optimization to find balance points Secret ingredient #1: Balance Point Optimization ● Initial guess: BP at 10% and 90% of data ● Use DIRECT global optimization method
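The evaluation-count argument can be sketched with a toy single-balance-point problem. To stay dependency-free, this example swaps in a golden-section search for the DIRECT method named on the slide (SciPy 1.9+ provides DIRECT as `scipy.optimize.direct`), and it assumes the error curve has a single minimum. The synthetic meter, bounds, and function names are all assumptions.

```python
def sse_for_balance_point(bp, temps, usage):
    """Fit usage = a + b*HDD(bp) by least squares and return the SSE."""
    hdd = [max(bp - t, 0.0) for t in temps]
    n = len(hdd)
    mx, my = sum(hdd) / n, sum(usage) / n
    sxx = sum((h - mx) ** 2 for h in hdd)
    if sxx == 0.0:  # no heating signal at this balance point
        return sum((u - my) ** 2 for u in usage)
    b = sum((h - mx) * (u - my) for h, u in zip(hdd, usage)) / sxx
    a = my - b * mx
    return sum((u - (a + b * h)) ** 2 for h, u in zip(hdd, usage))


def golden_section_min(f, lo, hi, tol=0.01):
    """Minimize a unimodal 1-D function; returns (argmin, n_evaluations)."""
    inv_phi = (5 ** 0.5 - 1) / 2
    c, d = hi - inv_phi * (hi - lo), lo + inv_phi * (hi - lo)
    fc, fd, evals = f(c), f(d), 2
    while hi - lo > tol:
        if fc < fd:  # minimum lies in [lo, d]
            hi, d, fd = d, c, fc
            c = hi - inv_phi * (hi - lo)
            fc = f(c)
        else:        # minimum lies in [c, hi]
            lo, c, fc = c, d, fd
            d = lo + inv_phi * (hi - lo)
            fd = f(d)
        evals += 1
    return (lo + hi) / 2, evals


# Synthetic meter: true balance point 62, baseload 5, slope 0.3 (made up).
temps = [float(t) for t in range(30, 91)]
usage = [5.0 + 0.3 * max(62.0 - t, 0.0) for t in temps]


def objective(bp):
    return sse_for_balance_point(bp, temps, usage)


# Grid search at 0.01-degree resolution needs one model fit per grid point.
grid = [30.0 + 0.01 * i for i in range(6001)]
grid_best = min(grid, key=objective)

# The optimizer reaches the same resolution with far fewer fits.
opt_best, opt_evals = golden_section_min(objective, 30.0, 90.0)
```

Both approaches land near the true balance point, but the grid requires thousands of fits where the optimizer needs only a few dozen evaluations.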
  • 30. Identified Issues: Computational Efficiency Solution: Use Elastic Net to only fit one model and penalize coefficients Ordinary Least Squares goal ● Minimize residuals Elastic Net goal ● Minimize residuals + coefficients 30
  • 31. Identified Issues: Computational Efficiency Solution: Use Elastic Net to only fit one model and penalize coefficients Ordinary Least Squares goal ● Minimize residuals Elastic Net goal ● Minimize residuals + coefficients Secret ingredient #2 31
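The difference between the two goals on this slide can be written out directly. The sketch below uses the common elastic net form, a residual term plus a blend of L1 and L2 penalties on the coefficients. Scaling conventions vary between libraries, and `alpha` and `l1_ratio` here are illustrative values, not OpenEEmeter settings.

```python
def ols_objective(residuals):
    """Ordinary least squares goal: sum of squared residuals."""
    return sum(r * r for r in residuals)


def elastic_net_objective(residuals, coefs, alpha=1.0, l1_ratio=0.5):
    """Residual term plus a blend of L1 and L2 penalties on coefficients."""
    l1 = sum(abs(c) for c in coefs)
    l2 = sum(c * c for c in coefs)
    penalty = alpha * (l1_ratio * l1 + 0.5 * (1.0 - l1_ratio) * l2)
    return ols_objective(residuals) + penalty


# Two hypothetical candidate fits with identical residuals: one uses a single
# nonzero coefficient, the other adds an unneeded second coefficient.
residuals = [0.5, -0.5, 0.25, -0.25]
simple = elastic_net_objective(residuals, coefs=[2.0, 0.0])
complex_ = elastic_net_objective(residuals, coefs=[3.0, -1.0])
```

Ordinary least squares cannot tell these two fits apart, whereas the penalized objective prefers the one with smaller coefficients. That is what allows a single fit to suppress unneeded components instead of comparing many separate models.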
  • 32. OpenEEmeter 4.0: How do we know when to split?
  • 33. How do we choose to split or not? Strive for optimal fitting using test error No splitting All possible splits 33
  • 34. Experimental Design Considerations Cannot assess test error from reporting period Bad Assumption Reporting period = Baseline period Wouldn’t it be nice if… We could exclusively use baseline data and achieve predictive testing We need some tools! 34
  • 35. Average performance of all folds Why do we need CV? ● Best model parameters Where? → baseline period! Goal? → predictive testing Experimental Design Considerations Can assess predictive error using cross validation 35
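A minimal sketch of estimating predictive error from baseline data alone: hold out one fold at a time, fit on the rest, and average the out-of-fold RMSE. The two candidate "models" (pooled mean versus weekday/weekend means) and the data are made up, and the interleaved fold assignment is a simplification.

```python
from statistics import mean


def k_fold_rmse(xs, ys, fit, k=5):
    """Average out-of-fold RMSE of `fit` over k folds."""
    n = len(xs)
    fold_rmses = []
    for fold in range(k):
        test_idx = set(range(fold, n, k))  # every k-th point is held out
        train = [(xs[i], ys[i]) for i in range(n) if i not in test_idx]
        model = fit(train)  # returns a prediction function
        sq_errs = [(model(xs[i]) - ys[i]) ** 2 for i in test_idx]
        fold_rmses.append(mean(sq_errs) ** 0.5)
    return mean(fold_rmses)


def fit_pooled(train):
    """No split: one mean for all days."""
    m = mean(y for _, y in train)
    return lambda x: m


def fit_split(train):
    """Split: separate means for weekdays (x=0) and weekends (x=1)."""
    means = {}
    for g in (0, 1):
        means[g] = mean(y for x, y in train if x == g)
    return lambda x: means[x]


# Made-up baseline: x is a weekend flag, y is daily usage with a small wiggle.
xs = [1 if i % 7 >= 5 else 0 for i in range(70)]
ys = [(12.0 if x else 10.0) + 0.1 * ((i % 3) - 1) for i, x in enumerate(xs)]

pooled_cv = k_fold_rmse(xs, ys, fit_pooled)
split_cv = k_fold_rmse(xs, ys, fit_split)
```

Because every prediction is scored on data the model did not see, the comparison reflects predictive rather than in-sample error.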
  • 36. Final Model: Cross validation would be too slow for 1M buildings Cross validation ● Useful in development ● Untenable in final product → computational time Can we approximate CV? ● Yes! Selection criterion ● Selection Criterion = SSE + penalty ● But it's meant to reduce master model
  • 37. What is the best penalization for model complexity ● Selection Criterion = SSE + penalty (Modified Bayesian Information Criterion) What are best parameters? ● Based on 10-fold cross validation RMSE (predictive) ● 6000 meters (4000 res gas, 1000 res elec, 1000 comm elec) Final Model Create a selection criterion function to select splits 37
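The slides give the criterion only as "SSE + penalty" and call it a modified Bayesian Information Criterion. As a hedged sketch, the code below uses the textbook BIC for Gaussian errors with a hypothetical `penalty_multiplier` standing in for whatever modification was tuned against cross validation. The candidate SSE values and parameter counts are invented.

```python
import math


def selection_criterion(sse, n_obs, n_params, penalty_multiplier=1.0):
    """BIC-style score: goodness-of-fit term plus a complexity penalty.

    penalty_multiplier is a hypothetical tuning knob; 1.0 gives the
    textbook BIC for Gaussian errors.
    """
    fit_term = n_obs * math.log(sse / n_obs)
    penalty = penalty_multiplier * n_params * math.log(n_obs)
    return fit_term + penalty


# Hypothetical candidates for one meter with 365 daily observations.
candidates = {
    "no split": {"sse": 500.0, "n_params": 3},
    "weekday/weekend split": {"sse": 380.0, "n_params": 6},
    "all possible splits": {"sse": 370.0, "n_params": 24},
}
scores = {}
for name, c in candidates.items():
    scores[name] = selection_criterion(c["sse"], 365, c["n_params"])

best = min(scores, key=scores.get)
```

With these invented numbers the middle option wins: the extra split pays for itself, while splitting everything buys too little error reduction for its added parameters.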
  • 38. Final Model Improved accuracy and speed (no free lunch theorem) 38
  • 39. Final Model 39 ● Model Specification and Results (google: OpenEEmeter 4.0)
  • 41. API Improvements: Inspired by Sklearn’s simplicity ● Sklearn manages many complex models with a simple interface ● We should do the same ● Completely different, but almost same API?
      Clustering API:
          from sklearn import cluster

          cluster_algo = [
              cluster.MiniBatchKMeans(),
              cluster.AgglomerativeClustering(),
              cluster.Birch(),
              cluster.DBSCAN(),
          ]
          for algo in cluster_algo:
              algo.fit(X)
              # AgglomerativeClustering and DBSCAN have no predict();
              # they expose fit_predict(X) instead
              res = algo.predict(X_new)
      Regression API:
          from sklearn import linear_model

          regres_algo = [
              linear_model.LinearRegression(),
              linear_model.ElasticNet(),
              linear_model.BayesianRidge(),
              linear_model.RANSACRegressor(),
          ]
          for algo in regres_algo:
              algo.fit(X, y)
              res = algo.predict(X_new)
  • 42. OpenEEmeter 3.0 OpenEEmeter 4.0 ● Most steps copied from tutorials (user feedback) ● Different processes for daily and hourly modeling ● User sets options in function calls ● Intermediate information passed between function calls ● Simple function calls: initialize/fit/predict ● Same function calls regardless of model ● Sensible defaults ● Intermediate information within class API Improvements Goal is ease of use 42
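The design goals listed for 4.0 (initialize/fit/predict, sensible defaults, intermediate information kept within the class) can be illustrated with a toy class. `ToyDailyModel` is a hypothetical stand-in written for this sketch, not part of eemeter, and its fixed balance point is an assumption.

```python
class ToyDailyModel:
    """Hypothetical stand-in showing the initialize/fit/predict pattern."""

    def __init__(self, settings=None):
        # Sensible defaults: callers only pass settings they want to change.
        self.settings = {"balance_point": 60.0, **(settings or {})}
        self.params_ = None  # intermediate state lives on the instance

    def fit(self, temps, usage):
        bp = self.settings["balance_point"]
        hdd = [max(bp - t, 0.0) for t in temps]
        n = len(hdd)
        mx, my = sum(hdd) / n, sum(usage) / n
        slope = (sum((h - mx) * (u - my) for h, u in zip(hdd, usage))
                 / sum((h - mx) ** 2 for h in hdd))
        self.params_ = {"baseload": my - slope * mx, "slope": slope}
        return self  # enables ToyDailyModel().fit(...).predict(...)

    def predict(self, temps):
        if self.params_ is None:
            raise RuntimeError("call fit() before predict()")
        bp = self.settings["balance_point"]
        p = self.params_
        return [p["baseload"] + p["slope"] * max(bp - t, 0.0) for t in temps]


baseline_temps = [30.0, 40.0, 50.0, 60.0, 70.0]
baseline_usage = [8.0, 6.0, 4.0, 2.0, 2.0]  # made up: exactly 2 + 0.2*HDD
prediction = ToyDailyModel().fit(baseline_temps, baseline_usage).predict([45.0])
```

The caller never handles design matrices or fitted coefficients directly. Everything between `fit` and `predict` stays inside the object.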
  • 43. Simplified Daily and Billing Models
      OpenEEmeter 3.0 (Daily):
          baseline_design_matrix = create_caltrack_daily_design_matrix(
              baseline_meter_data, temperature_data, degc
          )
          baseline_model = fit_caltrack_usage_per_day_model(baseline_design_matrix)
          reporting_meter_data, warnings = get_reporting_data(
              meter_data, start=blackout_end_date, max_days=365
          )
          metered_savings_dataframe, error_bands = metered_savings(
              baseline_model,
              reporting_meter_data,
              temperature_data,
              with_disaggregated=True,
              degc=degc,
          )
      OpenEEmeter 3.0 (Billing):
          baseline_design_matrix = create_caltrack_billing_design_matrix(
              baseline_meter_data, temperature_data, degc
          )
          baseline_model = fit_caltrack_usage_per_day_model(
              baseline_design_matrix,
              use_billing_presets=True,
              weights_col='n_days_kept',
          )
          reporting_meter_data, warnings = get_reporting_data(
              meter_data, start=blackout_end_date, max_days=365
          )
          metered_savings_dataframe, error_bands = metered_savings(
              baseline_model,
              reporting_meter_data,
              temperature_data,
              with_disaggregated=True,
              degc=degc,
          )
      OpenEEmeter 4.0 (Daily):
          baseline_data = DailyBaselineData(baseline_df)
          reporting_data = DailyReportingData(reporting_df)
          model = DailyModel(settings=None).fit(baseline_data)
          result = model.predict(reporting_data)
      OpenEEmeter 4.0 (Billing):
          baseline_data = BillingBaselineData(baseline_df)
          reporting_data = BillingReportingData(reporting_df)
          model = BillingModel(settings=None).fit(baseline_data)
          result = model.predict(reporting_data)
  • 44. Simplified Hourly Model
      OpenEEmeter 3.0:
          # create a design matrix for occupancy and segmentation
          preliminary_design_matrix = create_caltrack_hourly_preliminary_design_matrix(
              baseline_meter_data, temperature_data, degc
          )
          # build 12 monthly models - each step from now on operates on each segment
          segmentation = segment_time_series(
              preliminary_design_matrix.index, "three_month_weighted"
          )
          # assign an occupancy status to each hour of the week (0-167)
          occupancy_lookup = estimate_hour_of_week_occupancy(
              preliminary_design_matrix, segmentation=segmentation
          )
          # assign temperatures to bins
          (
              occupied_temperature_bins,
              unoccupied_temperature_bins,
          ) = fit_temperature_bins(
              preliminary_design_matrix,
              segmentation=segmentation,
              occupancy_lookup=occupancy_lookup,
          )
          # build a design matrix for each monthly segment
          segmented_design_matrices = create_caltrack_hourly_segmented_design_matrices(
              preliminary_design_matrix,
              segmentation,
              occupancy_lookup,
              occupied_temperature_bins,
              unoccupied_temperature_bins,
          )
          # build a CalTRACK hourly model
          baseline_model = fit_caltrack_hourly_model(
              segmented_design_matrices,
              occupancy_lookup,
              occupied_temperature_bins,
              unoccupied_temperature_bins,
          )
          # compute metered savings for the year of the reporting period we've selected
          result, error_bands = metered_savings(
              baseline_model,
              reporting_meter_data,
              temperature_data,
              with_disaggregated=True,
              degc=degc,
          )
      OpenEEmeter 4.0:
          baseline_data = HourlyBaselineData(baseline_df)
          reporting_data = HourlyReportingData(reporting_df)
          model = HourlyModel(settings=None).fit(baseline_data)
          result = model.predict(reporting_data)
  • 45. Data Class: Tracks disqualification and formats data for Model class ● Track all data sufficiency ● Unique for each model type ● Must be run to pass to Model (Can bypass in model) ● Formats data for Model class ● Violations are propagated to Model class
      baseline_data = BaselineData(baseline_df)
      baseline_data.disqualification
      baseline_data.warnings
      Disqualification:
          {'qualified_name': 'eemeter.sufficiency_criteria.too_many_days_with_missing_data',
           'description': 'Too many days in data have missing meter data or temperature data.',
           'data': {'n_valid_days': 251, 'n_days_total': 365}}
      Warnings:
          {'qualified_name': 'eemeter.sufficiency_criteria.missing_high_frequency_meter_data',
           'description': 'More than 50% of the high frequency Meter data is missing.',
           'data': [Timestamp('2020-02-29 00:00:00+0000', tz='UTC')]}
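A rough sketch of how a sufficiency check might produce a disqualification record shaped like the one on this slide. The function name and the 90% threshold are assumptions for illustration. Only the record's keys and text are taken from the slide.

```python
def check_daily_sufficiency(n_valid_days, n_days_total, min_fraction=0.9):
    """Return a list of disqualification records (empty if data is usable).

    min_fraction is an assumed threshold for illustration only.
    """
    disqualification = []
    if n_valid_days < min_fraction * n_days_total:
        disqualification.append({
            "qualified_name":
                "eemeter.sufficiency_criteria.too_many_days_with_missing_data",
            "description": "Too many days in data have missing meter data "
                           "or temperature data.",
            "data": {"n_valid_days": n_valid_days,
                     "n_days_total": n_days_total},
        })
    return disqualification


bad = check_daily_sufficiency(n_valid_days=251, n_days_total=365)
good = check_daily_sufficiency(n_valid_days=360, n_days_total=365)
```

Returning structured records instead of raising immediately lets the violations travel with the data object, so a model can decide whether to refuse or proceed.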
  • 47. Conclusion Model ● 84% less seasonal bias ● 95% less weekday/weekend bias ● Daily model is 2 - 10x faster ● Billing model is 100x faster ● Hyperparameters are broadly applicable 47 pip install eemeter
  • 48. Conclusion API ● Standard calls for all models (fit/predict) ● Data class ○ Formats data for models ○ Checks sufficiency ○ Provides disqualification reasons 48 pip install eemeter
  • 49. Technical Steering Committee: ● Adam Scheer, Recurve ● McGee Young, WattCarbon ● Phil Ngo, Recurve ● Travis Sikes, Recurve ● Steve Suffian, WattCarbon Key Contributors ● Armin Aligholian, Recurve ● Jason Chulock, Recurve ● Joydeep Nag, Recurve ● Ethan Goldman, Resilient Edge ● Matt Fawcett, Carbon Co-op ● James Fenna, Carbon Co-op 49 People
  • 50. Ongoing Work: Hourly Model! ● 10x faster ● Huge improvement for solar PV customers ● More flexible ● Data class [Figure: Error Improvement % vs. Percent Daily Cloudiness, Solar PV Customers] Join the working group! https://www.caltrack.org/technical-working-group.html
  • 53. Seasonal error profiles: sharp features around balance point temperature Space heating initiated at warmer outside temps in winter Why is there Seasonal Bias?
  • 54. Identified Issues: Ordinary Least Squares Solution: Adaptive, robust loss function to down-weight outliers 54 Standard Deviations from Mean Loss Response
  • 55. Identified Issues: Computational Efficiency Solution: Only fit components once 55 Secret ingredient #3: Reuse component fits ● ~40-50 possible combinations of components ● Save component fits and reuse
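Reusing component fits is essentially memoization. In the sketch below, candidate models are built from pairs of hypothetical components, and a cache ensures each component is fitted once no matter how many candidates include it. The component names and the placeholder "fit" are invented for illustration.

```python
from functools import lru_cache
from itertools import combinations


@lru_cache(maxsize=None)
def fit_component(name):
    """Pretend to fit one model component; the result is cached by name."""
    # Placeholder for an expensive fit; returns a dummy fit summary.
    return (name, float(len(name)))


# Hypothetical components; every candidate model combines two of them.
components = ("summer_weekday", "summer_weekend",
              "winter_weekday", "winter_weekend", "shoulder_all")
candidate_models = list(combinations(components, 2))

for model in candidate_models:
    fits = [fit_component(name) for name in model]

info = fit_component.cache_info()  # misses = real fits, hits = reused fits
```

Ten candidate models request twenty component fits in total, but only five are actually computed. The remaining requests are served from the cache.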
  • 56. Identified Issues: Computational Efficiency Solution: Eliminate potential splits through overlapping clusters 56 Secret ingredient #4