Predicting Electricity Distribution Feeder Failures using Machine Learning
1
Predicting Electricity Distribution Feeder
Failures using Machine Learning
  • Marta Arias (1), Hila Becker (1,2)
  • (1) Center for Computational Learning Systems
  • (2) Computer Science
  • Columbia University
  • LEARNING 06

2
Overview of the Talk
  • Introduction to the Electricity Distribution
    Network of New York City
  • What are we doing and why?
  • Early solution using MartiRank, a boosting-like
    algorithm for ranking
  • Current solution using Online learning
  • Related projects

3
Overview of the Talk
  • Introduction to the Electricity Distribution
    Network of New York City
  • What are we doing and why?
  • Early solution using MartiRank, a boosting-like
    algorithm for ranking
  • Current solution using Online learning
  • Related projects

4
The Electrical System
5
Electricity Distribution Feeders
6
Problem
  • Distribution feeder failures result in automatic
    feeder shutdown
  • called "Open Autos" or O/As
  • O/As stress networks, control centers, and field
    crews
  • O/As are expensive ($ millions annually)
  • Proactive replacement is much cheaper and safer
    than reactive repair

7
Our Solution: Machine Learning
  • Leverage Con Edison's domain knowledge and
    resources
  • Learn to rank feeders based on susceptibility to
    failure
  • How?
  • Assemble data
  • Train model based on past data
  • Re-rank frequently using model on current data

8
New York City
9
Some facts about feeders and failures
  • About 950 feeders
  • 568 in Manhattan
  • 164 in Brooklyn
  • 115 in Queens
  • 94 in the Bronx

10
Some facts about feeders and failures
  • About 60% of feeders failed at least once
  • On average, feeders failed 4.4 times
  • (between June 2005 and August 2006)

11
Some facts about feeders and failures
  • mostly 0-5 failures per day
  • more in the summer
  • strong seasonality effects

12
Feeder data
  • Static data
  • Compositional/structural
  • Electrical
  • Dynamic data
  • Outage history (updated daily)
  • Load measurements (updated every 5 minutes)
  • Roughly 200 attributes for each feeder
  • New ones are still being added.

13
Feeder Ranking Application
  • Goal: rank feeders according to likelihood of
    failure (high-risk feeders placed near the top)
  • Application needs to integrate all types of data
  • Application needs to react and adapt to incoming
    dynamic data
  • Hence, update feeder ranking every 15 min.

14
Application Structure
15
Goal: rank feeders according to likelihood of
failure
16
Overview of the Talk
  • Introduction to the Electricity Distribution
    Network of New York City
  • What are we doing and why?
  • Early solution using MartiRank, a boosting-like
    algorithm for ranking
  • Pseudo ROC and pseudo AUC
  • MartiRank
  • Performance metric
  • Early results
  • Current solution using Online learning
  • Related projects

17
(pseudo) ROC
[figure: feeders sorted by score along the x-axis; outages along the y-axis]
18
(pseudo) ROC
[figure: pseudo-ROC over 941 feeders (x-axis) and 210 outages (y-axis)]
19
(pseudo) ROC
[figure: normalized pseudo-ROC; x-axis: fraction of feeders (0 to 1), y-axis: fraction of outages (0 to 1); shaded region: area under the ROC curve]
20
Some observations about the (p)ROC
  • Adapted to positive labels (not just 0/1)
  • Best pAUC is not always 1 (actually, it almost
    never is)
  • E.g., pAUC = 11/15 ≈ 0.73
  • Best pAUC with this data is 14/15 ≈ 0.93,
    corresponding to the ranking 2, 1, 0, 0, 0 (see the
    sketch below)

[figure: example ranking with per-feeder outage counts]
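The pAUC values above can be reproduced with a simple staircase computation: walk down the ranking, accumulate the fraction of outages captured at each position, and average over positions. The sketch below is an assumption that matches the 11/15 and 14/15 figures, not necessarily the exact implementation used in the deployed system.

    # Plausible staircase pseudo-AUC (assumption): labels are non-negative
    # outage counts, feeders are listed best-ranked first.
    def pseudo_auc(outages_in_ranked_order):
        total = sum(outages_in_ranked_order)
        n = len(outages_in_ranked_order)
        if total == 0 or n == 0:
            return 0.0
        captured, area = 0, 0.0
        for count in outages_in_ranked_order:
            captured += count                 # outages seen so far
            area += captured / total          # fraction of outages captured
        return area / n                       # averaged over feeder positions

    print(pseudo_auc([1, 0, 2, 0, 0]))  # 11/15 = 0.733...
    print(pseudo_auc([2, 1, 0, 0, 0]))  # 14/15 = 0.933... (best for these labels)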
21
MartiRank
  • Boosting-like algorithm by Long & Servedio,
    2005
  • Greedy, maximizes pAUC at each round
  • Adapted to ranking
  • Weak learners are sorting rules
  • Each attribute is a sorting rule
  • Attributes are numerical only
  • If categorical, then convert to indicator vector
    of 0/1

22
MartiRank
[figure: schematic of MartiRank rounds]
  • Round 1: the feeder list begins in random order;
    sort the whole list by the best variable
  • Round 2: divide the list in two, splitting the
    outages evenly; choose a separate best variable for
    each part and sort it
  • Round 3: divide the list in three, splitting the
    outages evenly; choose a separate best variable for
    each part and sort it
  • Continue in the same way for later rounds (a
    simplified sketch follows below)
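A greatly simplified sketch of the rounds pictured above, reusing pseudo_auc from the earlier sketch: in round k the current list is split into k contiguous segments holding roughly equal outage mass, and each segment is re-sorted by whichever single attribute maximizes that segment's pAUC. Sort direction, ties, and missing values, which the real algorithm handles, are ignored here; the names and data layout are illustrative assumptions.

    def split_by_outages(order, outages, k):
        """Split `order` (feeder indices) into k contiguous segments of ~equal outage mass."""
        total = sum(outages[i] for i in order) or 1
        segments, seg, mass = [], [], 0.0
        for i in order:
            seg.append(i)
            mass += outages[i]
            if len(segments) < k - 1 and mass >= total * (len(segments) + 1) / k:
                segments.append(seg)
                seg = []
        segments.append(seg)
        return segments

    def martirank(features, outages, n_rounds=4):
        """features: {attribute_name: list of values per feeder}; outages: list of counts."""
        order = list(range(len(outages)))          # initial order (arbitrary here)
        model = []                                 # chosen attribute per (round, segment)
        for k in range(1, n_rounds + 1):
            new_order, chosen = [], []
            for seg in split_by_outages(order, outages, k):
                best_attr, best_seg, best_score = None, seg, -1.0
                for attr, values in features.items():
                    candidate = sorted(seg, key=lambda i: values[i], reverse=True)
                    score = pseudo_auc([outages[i] for i in candidate])
                    if score > best_score:
                        best_attr, best_seg, best_score = attr, candidate, score
                chosen.append(best_attr)
                new_order.extend(best_seg)
            model.append(chosen)
            order = new_order
        return order, model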
23
MartiRank
  • Advantages
  • Fast, easy to implement
  • Interpretable
  • Only 1 tuning parameter: the number of rounds
  • Disadvantages
  • 1 tuning parameter: the number of rounds
  • Was set to 4 manually

24
Using MartiRank for real-time ranking of feeders
  • MartiRank is a batch algorithm, hence it must deal
    with the changing system by:
  • Continually generate new datasets with latest
    data
  • Use data within a window, aggregate dynamic data
    within that period in various ways (quantiles,
    counts, sums, averages, etc.)
  • Re-train new model, throw out old model
  • Seasonality effects not taken into account
  • Use newest model to generate ranking
  • Must implement training strategies
  • Re-train daily, or weekly, or every 2 weeks, or
    monthly, ... (a windowed-aggregation sketch follows
    below)
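A minimal sketch of the windowed aggregation step mentioned above, assuming load readings arrive as per-feeder lists of values within the training window; the helper and feature names are illustrative, not the production pipeline.

    # Aggregate dynamic measurements within a window into per-feeder features
    # (counts, sums, averages, quantiles); assumes >= 2 readings per feeder.
    from statistics import mean, quantiles

    def window_features(load_readings):
        """load_readings: dict feeder_id -> list of load values in the window."""
        features = {}
        for feeder, values in load_readings.items():
            q1, median, q3 = quantiles(values, n=4)   # quartile cut points
            features[feeder] = {
                "count": len(values),
                "sum": sum(values),
                "mean": mean(values),
                "q1": q1, "median": median, "q3": q3,
            }
        return features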

25
Performance Metric
  • Normalized average rank of failed feeders
  • Closely related to (pseudo) Area-Under-ROC-Curve
    when labels are 0/1
  • avgRank ≈ pAUC + 1/#examples
  • Essentially, the difference comes from 0-based pAUC
    vs. 1-based ranks

26
Performance Metric Example
[figure: example ranking of feeders with their outage counts; pAUC = 17/24 ≈ 0.7]
27
How to measure performance over time
  • Every 15 minutes, generate new ranking based on
    current model and latest data
  • Whenever there is a failure, look up its rank in
    the latest ranking before the failure
  • After a whole day, compute the normalized average
    rank (a sketch follows below)
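A minimal sketch of the daily computation described above, under one plausible reading of the metric: each failure contributes the 1-based position of its feeder in the latest ranking before the failure, normalized by the number of feeders, and the day's value is the average over failures (lower is better).

    def normalized_avg_rank(ranking, failed_feeders):
        """ranking: feeder ids, most at risk first; failed_feeders: ids that failed."""
        n = len(ranking)
        position = {feeder: idx + 1 for idx, feeder in enumerate(ranking)}  # 1-based
        ranks = [position[f] / n for f in failed_feeders]
        return sum(ranks) / len(ranks) if ranks else 0.0

    # e.g. 3 failures found at positions 5, 12 and 40 of a 100-feeder ranking
    ranking = [f"feeder{i}" for i in range(1, 101)]
    print(normalized_avg_rank(ranking, ["feeder5", "feeder12", "feeder40"]))  # 0.19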

28
MartiRank Comparison: training every 2 weeks
29
Using MartiRank for real-time ranking of feeders
  • MartiRank seems to work well, but..
  • User decides when to re-train
  • User decides how much data to use for re-training
  • ... and other things, like setting parameters,
    selecting algorithms, etc.
  • Want to make the system 100% automatic!
  • Idea
  • Still use MartiRank since it works well with this
    data, but keep/re-use all models

30
Overview of the Talk
  • Introduction to the Electricity Distribution
    Network of New York City
  • What are we doing and why?
  • Early solution using MartiRank, a boosting-like
    algorithm for ranking
  • Current solution using Online learning
  • Overview of learning from expert advice and the
    Weighted Majority Algorithm
  • New challenges in our setting and our solution
  • Results
  • Related projects

31
Learning from expert advice
  • Consider each model as an expert
  • Each expert has associated weight (or score)
  • Reward/penalize experts with good/bad predictions
  • Weight is a measure of confidence in the expert's
    prediction
  • Predict using weighted average of top-scoring
    experts

32
Learning from expert advice
  • Advantages
  • Fully automatic
  • No human intervention needed
  • Adaptive
  • Changes in system are learned as it runs
  • Can use many types of underlying learning
    algorithms
  • Good performance guarantees from learning theory:
    performance is never too far off from the best
    expert in hindsight
  • Disadvantages
  • Computational cost: need to track many models in
    parallel
  • Models are harder to interpret

33
Weighted Majority Algorithm [Littlestone &
Warmuth '88]
  • Introduced for binary classification
  • Experts make predictions in {0,1}
  • Obtain losses in [0,1]
  • Pseudocode (a sketch in Python follows below):
  • Learning rate ß in (0,1] as the main parameter
  • There are N experts; initially the weight is 1 for
    all
  • For t = 1, 2, 3, ...
  • Predict using a weighted average of each expert's
    prediction
  • Obtain the true label; each expert i incurs loss li
  • Update the experts' weights using
    wi,t+1 = wi,t · ß^li

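A minimal Weighted Majority sketch matching the pseudocode above, for the binary-classification setting it was introduced in; the data layout (one prediction per expert per round) is an assumption for illustration.

    def weighted_majority(expert_predictions, true_labels, beta=0.5):
        """expert_predictions[i][t] in {0, 1}; true_labels[t] in {0, 1}; beta in (0, 1]."""
        n_experts = len(expert_predictions)
        weights = [1.0] * n_experts                      # initially 1 for every expert
        predictions = []
        for t, y in enumerate(true_labels):
            total = sum(weights)
            avg = sum(w * preds[t] for w, preds in zip(weights, expert_predictions)) / total
            predictions.append(1 if avg >= 0.5 else 0)   # weighted-average prediction
            for i, preds in enumerate(expert_predictions):
                loss = abs(preds[t] - y)                 # loss in [0, 1]
                weights[i] *= beta ** loss               # w_i <- w_i * beta^loss
        return predictions, weights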
34
In our case, we can't use WM directly
  • Use ranking as opposed to binary classification
  • More importantly, do not have a fixed set of
    experts

35
Dealing with ranking vs. binary classification
  • Ranking loss is the normalized average rank of
    failures, as seen before; the loss is in [0,1]
  • To combine rankings, use a weighted average of the
    feeders' ranks (see the sketch below)

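A small sketch of the combination step, assuming each expert outputs a full ranking of feeder ids (best first): a feeder's combined score is the weighted average of its rank across experts, and the final ranking sorts feeders by that score.

    def combine_rankings(rankings, weights):
        """rankings: list of feeder-id lists (best first); weights: one per expert."""
        total_weight = sum(weights)
        scores = {}
        for ranking, w in zip(rankings, weights):
            for pos, feeder in enumerate(ranking, start=1):
                scores[feeder] = scores.get(feeder, 0.0) + w * pos
        return sorted(scores, key=lambda f: scores[f] / total_weight)  # low avg rank first

    print(combine_rankings([["A", "B", "C"], ["B", "A", "C"]], [2.0, 1.0]))
    # ['A', 'B', 'C']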
36
Dealing with a moving set of experts
  • Introduce new parameters:
  • B: budget (max number of models), set to 100
  • p: the new model's weight percentile, in [0,100]
  • α: age penalty, in (0,1]
  • When training new models, add them to the set of
    models with weight corresponding to the p-th
    percentile (among current weights)
  • If there are too many models (more than B), drop
    the models with a poor q-score, where
  • qi = wi · α^agei
  • I.e., α is the rate of exponential decay (a sketch
    of the pool update follows below)

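A sketch of the pool update described above. The parameter names B and p follow the slide; the slide's symbol for the age penalty did not survive extraction and is written here as alpha; the dict-based pool layout is an illustrative assumption.

    def add_model(pool, model, p=50, B=100, alpha=0.99):
        """pool: list of dicts with keys 'model', 'weight', 'age' (age in training periods)."""
        if pool:
            current = sorted(m["weight"] for m in pool)
            idx = min(len(current) - 1, round(p / 100 * (len(current) - 1)))
            start_weight = current[idx]          # p-th percentile of current weights
        else:
            start_weight = 1.0
        pool.append({"model": model, "weight": start_weight, "age": 0})
        if len(pool) > B:                        # keep the B best age-decayed scores
            pool.sort(key=lambda m: m["weight"] * alpha ** m["age"], reverse=True)
            del pool[B:]
        return pool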
37
Other parameters
  • How often do we train and add new models?
  • Hand-tuned over the course of the summer
  • Every 7 days
  • Seems to achieve a balance: generating new models
    to adapt to changing conditions without
    overflowing the system
  • Alternatively, one could train when observed
    performance drops ... not used yet
  • How much data do we use to train models?
  • Based on observed performance and early
    experiments
  • 1 week's worth of data, and
  • 2 weeks' worth of data

38
Performance
39
Failures rank distribution
[figure: per-season distributions for Summer 2005, Autumn 2005, Winter 2005-06, Spring 2006, and Summer 2006]
40
Daily average rank of failures
[figure: daily values from Summer 2005 through Summer 2006, with seasons marked]
41
Other things that I have not talked about but
took a significant amount of time
  • DATA
  • Data is spread over many repositories.
  • Difficult to identify useful data
  • Difficult to arrange access to data
  • Volume of data.
  • Gigabytes of data accumulated on a daily basis.
  • Required optimized database layout and the
    addition of a preprocessing stage
  • Had to gain understanding of data semantics
  • Software Engineering (this is a deployed
    application)

42
Current Status
  • Summer 2006: the system has been debugged,
    fine-tuned, tested, and deployed
  • Now fully operational
  • Ready to be used next summer (in test mode)
  • After this summer, we're going to do systematic
    studies of:
  • Parameter sensitivity
  • Comparisons to other approaches

43
Related work-in-progress
  • Online learning
  • Fancier weight updates with better guaranteed
    performance in changing environments
  • Explore direct online ranking strategies (e.g.
    the ranking perceptron)
  • Datamining project
  • Aims to exploit seasonality
  • Learn a mapping from environmental conditions to
    the characteristics of well-performing experts
  • When same conditions arise in the future,
    increase weights of experts that have those
    characteristics
  • Hope to learn it as system runs, continually
    updating mappings
  • MartiRank
  • In the presence of repeated/missing values, sorting
    is non-deterministic and the pAUC takes different
    values depending on the permutation of the data
  • Use statistics of the pAUC to improve the basic
    learning algorithm
  • Instead of a fixed input number of rounds, stop
    when the pAUC increase is not significant
  • Use better estimators of pAUC that are not
    sensitive to permutations of the data

44
Other related projects within the collaboration with
Con Edison
  • Finer-grained component analysis
  • Ranking of transformers
  • Ranking of cable sections
  • Ranking of cable joints
  • Merging of all systems into one
  • Mixing ML and Survival Analysis

45
Acknowledgments
  • Columbia
  • CCLS
  • Wei Chu
  • Martin Jansche
  • Ansaf Salleb
  • Albert Boulanger
  • David Waltz
  • Philip M. Long (now at Google)
  • Roger Anderson
  • Computer Science
  • Philip Gross
  • Rocco Servedio
  • Gail Kaiser
  • Samit Jain
  • John Ioannidis
  • Sergey Sigelman
  • Luis Alonso
  • Joey Fortuna
  • Chris Murphy
  • Con Edison
  • Matthew Koenig
  • Mark Mastrocinque
  • William Fairechio
  • John A. Johnson
  • Serena Lee
  • Charles Lawson
  • Frank Doherty
  • Arthur Kressner
  • Matt Sniffen
  • Elie Chebli
  • George Murray
  • Bill McGarrigle
  • Van Nest team