Real-time Ranking of Electrical Feeders using Expert Advice
Hila Becker, Marta Arias, "Real-time Ranking of Electrical Feeders using Expert Advice", European Workshop on Data Stream Analysis 2007

  • Generation: a prime mover, typically the force of water, steam, or hot gases on a turbine, spins an electromagnet, generating large amounts of electrical current at a generating station. Transmission: the current is sent at very high voltage (hundreds of thousands of volts) from the generating station to substations closer to the customers. Primary Distribution: electricity is sent at mid-level voltage (tens of thousands of volts) from substations to local transformers, over cables called feeders, usually 10-20 km long and with a few tens of transformers per feeder; feeders are composed of many feeder sections connected by joints and splices. Secondary Distribution: sends electricity at normal household voltages from local transformers to individual customers.
  • Many failures occur in the summer, during heat waves, when the load on the system increases. When an O/A occurs, the load that had been carried by the failed feeder must shift to adjacent feeders, further stressing them. This can lead to a failure cascade in a distribution network.
  • Static data (e.g., the age and composition of each feeder section) and dynamic data (e.g., electrical load data for a feeder and its transformers, accumulating at a rate of several hundred megabytes per day) create software engineering challenges in managing the data.
  • The goal is to predict failures with sufficient accuracy that timely preventive maintenance can be performed on the right feeders at the right time.
  • A further challenge: how to deal with impending failures that are corrected on the basis of our ranking.
  • Use data within a window and aggregate dynamic data within that period in various ways (quantiles, counts, sums, averages, etc.). Re-train daily, weekly, every 2 weeks, monthly, or…
  • No human intervention is needed; changes in the system are learned as it runs.
  • Every 7 days: this seems to strike a balance between generating new models to adapt to changing conditions and not overflowing the system.
  • If a model was successful in the previous summer but was retired during the winter, it should be rescued in the upcoming summer if similar environmental conditions reappear.

Real-time Ranking of Electrical Feeders using Expert Advice: Presentation Transcript

  • Real-time Ranking of Electrical Feeders using Expert Advice. Hila Becker (1,2) and Marta Arias (1). (1) Center for Computational Learning Systems, (2) Computer Science Department, Columbia University
  • Overview
    • The problem
      • The electrical system
      • Available data
    • Approach
    • Challenges
    • Our solution using Online learning
    • Experimental results
  • The Electrical System 1. Generation 2. Transmission 3. Primary Distribution 4. Secondary Distribution
  • The Problem
    • Distribution feeder failures result in automatic feeder shutdown
      • called “Open Autos” or O/As
    • O/As stress networks, control centers, and field crews
    • O/As are expensive ($ millions annually)
    • Proactive replacement is much cheaper and safer than reactive repair
    • How do we know which feeders to fix?
  • Some facts about feeders and failures
    • mostly 0-5 failures per day
    • more in the summer
    • strong seasonality effects
  • Feeder data
    • Static data
      • Compositional/structural
      • Electrical
    • Dynamic data
      • Outage history (updated daily)
      • Load measurements (updated every 15 minutes)
    • Roughly 200 attributes for each feeder
      • New ones are still being added.
  • Overview
    • The problem
      • The electrical system
      • Available data
    • Approach
    • Challenges
    • Our solution using Online learning
    • Experimental results
  • Machine Learning Approach
    • Leverage Con Edison’s domain knowledge and resources
    • Learn to rank feeders based on failure susceptibility
    • How?
      • Assemble data
      • Train ranking model based on past data
      • Re-rank frequently using model on current data
  • Feeder Ranking Application
    • Goal: rank feeders according to failure susceptibility
      • High risk placed near the top
    • Integrate different types of data
    • Interface that reflects the latest state of the system
      • Update feeder ranking every 15 min.
  • Application Structure [architecture diagram: static data, outage data, transformer (Xfmr) stress data, and feeder load data flow into a SQL Server DB; the ML Engine maintains ML Models and produces Rankings, which feed the Decision Support App (Decision Support GUI, Action Driver, Action Tracker)]
  • Decision Support GUI
  • Overview
    • The problem
      • The electrical system
      • Available data
    • Approach
    • Challenges
    • Our solution using Online learning
    • Experimental results
  • Simple Solution
    • Supervised batch-learning algorithms
      • Use past data to train a model
      • Re-rank frequently using this model on current data
    • Use the best performing learning algorithm
      • How do we measure performance?
      • MartiRank, a boosting algorithm by [Long & Servedio, 2005]
        • Use MartiRank for dynamic feeder ranking
  • Performance Metric
    • Normalized average rank of failed feeders
  • Performance Metric Example [figure: an example ranking of 8 feeders with an outage flag per position; outages (flag = 1) occurred at ranks 5, 3, and 2]
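A minimal sketch of how this metric might be computed, assuming it is the mean (0-based) position of the failed feeders divided by the worst possible position, so that 0 means every failure sat at the top of the ranking; the slides name the metric but not its exact normalization:

```python
def normalized_average_rank(ranking, failed):
    """Normalized average rank of failed feeders, in [0, 1].

    ranking: feeder ids ordered from highest to lowest predicted risk
    failed:  ids of the feeders that actually failed
    0 is best (all failures ranked at the top); the exact normalization
    used in the talk is an assumption here.
    """
    positions = [ranking.index(f) for f in failed]  # 0-based ranks
    worst = len(ranking) - 1
    return sum(positions) / (len(positions) * worst) if positions else 0.0

# The slide's example: 8 feeders, outages at ranks 5, 3, and 2.
ranking = ["F1", "F2", "F3", "F4", "F5", "F6", "F7", "F8"]
print(normalized_average_rank(ranking, {"F5", "F3", "F2"}))  # ~0.333
```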
  • Real-time ranking with MartiRank
    • MartiRank is a “batch” learning algorithm
    • Deal with the changing system by:
      • Generating new datasets with the latest data
      • Re-training a new model, replacing the old model
      • Using the newest model to generate rankings
    • Must implement “training strategies” (see the sketch below)
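One way such a training strategy could look in code. This is a sketch: the window length, column names (feeder, timestamp, load), and aggregation set are illustrative, since the slides specify only the window-plus-aggregation pattern.

```python
import pandas as pd

def build_training_set(dynamic, static, end, days=14):
    """Aggregate dynamic data over a window ending at `end`.

    dynamic: DataFrame with hypothetical columns [feeder, timestamp, load]
    static:  DataFrame of static attributes, indexed by feeder
    """
    start = end - pd.Timedelta(days=days)
    window = dynamic[(dynamic["timestamp"] > start) &
                     (dynamic["timestamp"] <= end)]
    # quantiles, counts, sums, averages, etc., per feeder
    feats = window.groupby("feeder")["load"].agg(
        count="count", total="sum", mean="mean",
        q90=lambda s: s.quantile(0.9))
    return feats.join(static)
```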
  • Real-time ranking with MartiRank [figure: timeline diagram of repeated train/deploy cycles as new data arrives over time]
  • How to measure performance over time
    • Every ~15 minutes, generate new ranking based on current model and latest data
    • Whenever there is a failure, look up its rank in the latest ranking before the failure
    • After a whole day, compute the normalized average rank (sketched below)
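A sketch of this evaluation loop; the data structures are illustrative, and it assumes at least one ranking precedes each failure:

```python
def daily_performance(failures, rankings_log):
    """Normalized average rank over one day of operation.

    failures:     (time, feeder) pairs observed during the day
    rankings_log: (time, ranking) pairs, one ranking every ~15 minutes
    """
    positions = []
    for t, feeder in failures:
        # latest ranking generated before this failure
        _, ranking = max((entry for entry in rankings_log if entry[0] <= t),
                         key=lambda entry: entry[0])
        positions.append(ranking.index(feeder))
    worst = len(rankings_log[0][1]) - 1
    return sum(positions) / (len(positions) * worst) if positions else None
```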
  • Using MartiRank for real-time ranking of feeders
    • MartiRank seems to work well, but..
      • User decides when to re-train
      • User decides how much data to use for re-training
      • Performance degrades over time
    • Want to make system automatic
      • Do not discard all old models
      • Let the system decide which models to use
  • Overview
    • The problem
      • The electrical system
      • Available data
    • Approach
    • Challenges
    • Our solution using Online learning
    • Experimental results
  • Learning from expert advice
    • Consider each model as an expert
    • Each expert has associated weight
      • Reward/penalize experts with good/bad predictions
      • Weight is a measure of confidence in expert’s prediction
    • Predict using weighted average of top-scoring experts
  • Learning from expert advice
    • Advantages
      • Fully automatic
      • Adaptive
      • Can use many types of underlying learning algorithms
      • Good performance guarantees from learning theory: performance never too far off from best expert in hindsight
    • Disadvantages
      • Computational cost: need to track many models “in parallel”
  • Weighted Majority Algorithm [Littlestone & Warmuth ‘88]
    • Introduced for binary classification
      • Experts make predictions in [0,1]
      • Obtain losses in [0,1]
    • Pseudocode:
      • Learning rate β in (0, 1] as the main parameter
      • There are N “experts”; initially each has weight 1
      • For t = 1, 2, 3, …
        • Predict using the weighted average of the experts’ predictions
        • Obtain the “true” label; each expert incurs a loss l_i
        • Update the experts’ weights: w_{i,t+1} = w_{i,t} · β^{l_i}
  • Weighted Majority Algorithm [figure: N experts e_1 … e_N with weights w_1 … w_N each predict 0 or 1; the master outputs 1 if (w_1·x_1 + … + w_N·x_N) / (w_1 + … + w_N) > 0.5, else 0]
    • Calculate l_i = loss(i) for i = 1, …, N
    • Update: w_{i,t+1} = w_{i,t} · β^{l_i} for learning rate β, time t, and i = 1, …, N
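A direct Python transcription of the pseudocode above; the absolute loss |p − y| is an assumption, since the slides only say losses lie in [0, 1]:

```python
def weighted_majority(experts, stream, beta=0.5):
    """Weighted Majority [Littlestone & Warmuth '88].

    experts: list of functions x -> prediction in [0, 1]
    stream:  iterable of (x, y) pairs with true labels y in {0, 1}
    beta:    learning rate in (0, 1]
    """
    weights = [1.0] * len(experts)  # initially weight is 1 for all
    for x, y in stream:
        preds = [e(x) for e in experts]
        avg = sum(w * p for w, p in zip(weights, preds)) / sum(weights)
        y_hat = 1 if avg > 0.5 else 0             # master prediction
        losses = [abs(p - y) for p in preds]      # each loss in [0, 1]
        weights = [w * beta ** l for w, l in zip(weights, losses)]
        yield y_hat
```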
  • In our case, can’t use WM directly
    • Use ranking as opposed to binary classification
    • More importantly, do not have a fixed set of experts
  • Dealing with ranking vs. binary classification
    • Ranking loss: the normalized average rank of failures, as seen before; loss in [0, 1]
    • To combine rankings, use a weighted average of feeders’ ranks
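A sketch of that combination step; it assumes every expert ranks the same set of feeders:

```python
def combine_rankings(rankings, weights):
    """Combine experts' rankings by a weighted average of feeders' ranks.

    rankings: one ranking per expert, each a list of feeder ids (index = rank)
    weights:  one weight per expert
    """
    total = sum(weights)
    avg_rank = {f: sum(w * r.index(f) for w, r in zip(weights, rankings)) / total
                for f in rankings[0]}
    return sorted(avg_rank, key=avg_rank.get)  # lowest average rank first
```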
  • Dealing with a moving set of experts
    • Introduce new parameters
      • B: “budget” (max number of models) set to 100
      • p: the weight percentile at which new models enter, in [0, 100]
      • α: age penalty, in (0, 1]
    • If there are too many models (more than B), drop the models with the poorest q-scores, where
      • q_i = w_i · α^{age_i}
      • I.e., α is the rate of exponential decay
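A sketch of this budget step; models are represented as simple records, and the default α value is illustrative:

```python
def prune(models, B=100, alpha=0.9):
    """Keep at most B models, ranked by q_i = w_i * alpha**age_i.

    models: list of dicts with keys "weight" and "age"
    alpha:  age penalty in (0, 1]; older models decay exponentially
    """
    if len(models) <= B:
        return models
    q = lambda m: m["weight"] * alpha ** m["age"]
    return sorted(models, key=q, reverse=True)[:B]
```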
  • Online Ranking Algorithm [figure: B experts e_1 … e_B with weights w_1 … w_B each output a ranking of feeders F1–F5; the weighted combination yields the final ranking, and newly trained experts e_{B+1}, e_{B+2} join with weights w_{B+1}, w_{B+2}]
  • Other parameters
    • How often do we train and add new models?
      • Hand-tuned over the course of the summer
      • Alternatively, one could train when observed performance drops (not used yet)
    • How much data do we use to train models?
      • Based on observed performance and early experiments
        • 1 week’s worth of data, and
        • 2 weeks’ worth of data
  • Overview
    • The problem
      • The electrical system
      • Available data
    • Approach
    • Challenges
    • Our solution using Online learning
    • Experimental results
  • Experimental Comparison
    • Compare our approach to
      • Using “batch”-trained models
      • Other online learning methods
    • Ranking Perceptron
      • Online version
    • Hand Picked Model
      • Tuned by humans with domain knowledge
  • Performance – Summer 2005
  • Performance – Winter 2006
  • Parameter Variation - Budget
  • Future Work
    • Concept drift detection
      • Add new models only when change is detected
    • Ensemble diversity control
    • Exploit re-occurring contexts