Title: A Methodology Using Support Vector Machines for Short-term Load Forecasting
1. A Methodology Using Support Vector Machines for Short-term Load Forecasting
2. Outline
- Introduction: the load forecasting problem
- Existing models and approaches
- Support Vector Machines
- Implementation of Support Vector Methodology
- Comparison with other approaches
- Results and conclusions
3. Objectives of this research
- To investigate the applicability of the Support Vector Machine (SVM) methodology for short-term load forecasting
- To compare the SVM with existing approaches
- To implement advances in model selection in order to improve short-term load forecasts
4. Electric Power Grid
- Complex interactive network
- Vulnerable to cascading failures
- Extremely complex behavior
- Multi-scale time hierarchy
5. Load Forecasting Problem
How can power grid operation be improved?
- Agent-based anticipatory distributed control
- Robust adaptive and reconfigurable management
- Load forecasting for scheduling of generating
capacities, system security assessments, and
planning
High-quality short-term hourly load forecasts can improve the operating efficiency of many electric utilities.
6. Forecasting Methods
- Expert Judgments
- Linear Models
  - Linear Regression
  - Ridge Regression
- Nonlinear Models
  - Artificial Neural Networks
  - Nonlinear Regression
  - Support Vector Machines
7. Existing Models
Used by power utilities (e.g., ComEd, an Exelon Company)
- Classical forecasting scheme (expert judgments)
- ANNSTLF (Artificial Neural Network Short-Term Load Forecaster)
- Others
8. Empirical Risk Minimization (ERM)
[Diagram: a generator supplies samples x, the system produces responses, and the learning machine learns to imitate the system]
- The idea is to minimize the error on the training sample, with the expectation that this will also give the best result on future data
- The empirical risk can be driven to 0 if the set of functions L(z, α) has sufficient capacity
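In Vapnik's notation, for a training sample $z_1, \dots, z_\ell$ drawn from a distribution $F(z)$, ERM minimizes the empirical risk functional in place of the true risk:

$$R_{\mathrm{emp}}(\alpha) = \frac{1}{\ell} \sum_{i=1}^{\ell} L(z_i, \alpha) \quad \text{vs.} \quad R(\alpha) = \int L(z, \alpha)\, dF(z).$$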
9. Consistency of ERM principle
- The empirical risk uniformly converges to the actual (true) risk functional as the sample size ℓ → ∞
- This is the necessary and sufficient condition for the consistency of the ERM principle
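Formally, the condition can be written as one-sided uniform convergence:

$$\lim_{\ell \to \infty} P\!\left\{ \sup_{\alpha} \bigl( R(\alpha) - R_{\mathrm{emp}}(\alpha) \bigr) > \varepsilon \right\} = 0 \quad \text{for all } \varepsilon > 0.$$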
10. Linear Models
- Gives the ordinary least squares solution
- Completely developed theory
- Fails when the data are substantially nonlinear
- When the data have severe collinearity, regularization is required (ridge regression)
- Performs very well in many cases
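As a minimal sketch of this regularized linear baseline (scikit-learn here is an illustrative stand-in, and the synthetic data are hypothetical):

```python
import numpy as np
from sklearn.linear_model import Ridge

# Hypothetical design matrix: 24 lagged hourly loads as predictors.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 24))
y = X @ rng.normal(size=24) + rng.normal(scale=0.1, size=500)

# alpha is the regularization strength; alpha=0 recovers ordinary least squares.
model = Ridge(alpha=1.0).fit(X, y)
y_hat = model.predict(X)
```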
11. Neural Models
- Nonlinear regression/classification
- Supervised training
- Back-propagation
- Multiple minima problem
- Slow rate of convergence
- Variety of heuristic approaches tested
[Figure: activation function]
12. Neural Networks
- Perform nonlinear optimization
- Inherently ill-posed problem
- The set of approximating functions is limited by back-propagation training
- The final solution lacks interpretation
- No unifying theory
- They work!
- Hardware implementation is possible
- Modular structure
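For concreteness, a minimal back-propagation regressor of the kind discussed here, sketched with scikit-learn's MLPRegressor on hypothetical synthetic data (the 32-neuron hidden layer mirrors the ANN forecast slide later in the deck):

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 24))                  # hypothetical lagged-load features
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=500)

# One hidden layer trained by back-propagation; different random
# initializations can converge to different local minima.
net = MLPRegressor(hidden_layer_sizes=(32,), activation="tanh",
                   max_iter=2000, random_state=0).fit(X, y)
```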
13. Vapnik-Chervonenkis dimension (VC-dimension)
- The VC-dimension of a set of indicator functions Q(z, α) is the largest number h of vectors z_1, ..., z_h that can be separated into all 2^h possible ways using this set of functions (i.e., shattered)
- The VC-dimension is a scalar value that measures the capacity of a set of functions
- For certain sets, an upper bound on the VC-dimension can be calculated analytically; for example, the set of hyperplanes in R^n has VC-dimension n + 1
- A bounded VC-dimension is a necessary condition for the ERM principle to be consistent
14. Model Selection
[Figure: data plotted as y vs. x]
How to choose a set of functions properly?
15. Structural Risk Minimization (SRM)
- Selecting a subset of a structure with optimal complexity
- Estimating the parameters of the model from this subset
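Concretely, SRM arranges the admissible functions into a nested structure of subsets of increasing capacity,

$$S_1 \subset S_2 \subset \dots \subset S_n, \qquad h_1 \le h_2 \le \dots \le h_n,$$

and selects the element for which the bound on the true risk (empirical risk plus a confidence term that grows with the VC-dimension h) is smallest.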
16. Separating Hyperplane
- Linear separation is performed by a hyperplane
17. Optimal Separating Hyperplane
- Assuming that a margin exists between the two classes
- The optimal hyperplane maximizes the margin
- Maximizing the margin is equivalent to minimizing the norm of w
18. Optimization problem
- The constrained optimization problem (written out below)
- Reformulated as an unconstrained problem with Lagrange multipliers
- Conditions of the Kuhn-Tucker theorem are applied
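The equations referred to on this slide (lost in extraction) correspond to the standard hard-margin formulation:

$$\min_{w,\,b} \ \tfrac{1}{2}\|w\|^2 \quad \text{s.t.} \quad y_i\,(w \cdot x_i + b) \ge 1, \quad i = 1, \dots, \ell,$$

with Lagrangian

$$L(w, b, \alpha) = \tfrac{1}{2}\|w\|^2 - \sum_{i=1}^{\ell} \alpha_i \bigl[ y_i (w \cdot x_i + b) - 1 \bigr], \qquad \alpha_i \ge 0.$$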
19. Dual problem
Optimization problem
Separating hyperplane
- This optimization problem can be solved using
standard quadratic programming methods
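Written out in standard form, the two objects named on this slide are:

$$\max_{\alpha} \ \sum_{i=1}^{\ell} \alpha_i - \frac{1}{2} \sum_{i,j=1}^{\ell} \alpha_i \alpha_j y_i y_j (x_i \cdot x_j) \quad \text{s.t.} \quad \sum_{i=1}^{\ell} \alpha_i y_i = 0, \quad \alpha_i \ge 0,$$

$$f(x) = \operatorname{sign}\!\left( \sum_{i=1}^{\ell} \alpha_i y_i (x_i \cdot x) + b \right),$$

where only the support vectors enter with α_i > 0.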
20. Nonseparable case
- It is desirable to separate the data with a minimal number of errors
- Positive slack variables ξ_i can be introduced in the defining conditions of the hyperplane
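With the slack variables, the soft-margin problem becomes:

$$\min_{w,\,b,\,\xi} \ \tfrac{1}{2}\|w\|^2 + C \sum_{i=1}^{\ell} \xi_i \quad \text{s.t.} \quad y_i (w \cdot x_i + b) \ge 1 - \xi_i, \quad \xi_i \ge 0,$$

where C trades off margin width against training errors.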
21. Support Vector Machine
- The input vector x is mapped into a high-dimensional feature space
- After the transformation, the optimal separating hyperplane is built in the high-dimensional feature space Φ
22. Support Vector Machine (continued)
(dual form)
- If the Mercer condition is satisfied, then the inner product in the Hilbert space has a kernel representation (see below)
Optimization problem
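The representation in question, and the resulting kernelized objective, are:

$$K(x, x') = \langle \Phi(x), \Phi(x') \rangle,$$

$$\max_{\alpha} \ \sum_{i=1}^{\ell} \alpha_i - \frac{1}{2} \sum_{i,j=1}^{\ell} \alpha_i \alpha_j y_i y_j K(x_i, x_j).$$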
23. Kernels
- The mapping into Φ can be represented by a kernel function
- With a proper selection of K, the dot product can be calculated in the low-dimensional input space
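Two standard choices (the radial kernel is the one used in the SVM model later in the deck):

$$K(x, x') = (x \cdot x' + 1)^d \ \ \text{(polynomial)}, \qquad K(x, x') = \exp\!\left(-\gamma \|x - x'\|^2\right) \ \ \text{(radial)}.$$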
24. ε-insensitive Loss Function
- Provides the best approximation for the worst possible noise density
- Gives robust regression under the more relaxed assumption of a symmetric convex noise density
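The loss function itself is

$$L_\varepsilon\bigl(y, f(x)\bigr) = \max\bigl(0,\ |y - f(x)| - \varepsilon\bigr),$$

so residuals smaller than ε incur no penalty.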
25. Support Vector Regression
Primal form
Dual form
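Written out, these are the standard ε-SVR formulations. Primal:

$$\min_{w,\,b,\,\xi,\,\xi^*} \ \tfrac{1}{2}\|w\|^2 + C \sum_{i=1}^{\ell} (\xi_i + \xi_i^*) \quad \text{s.t.} \quad
\begin{cases}
y_i - w \cdot \Phi(x_i) - b \le \varepsilon + \xi_i,\\
w \cdot \Phi(x_i) + b - y_i \le \varepsilon + \xi_i^*,\\
\xi_i,\ \xi_i^* \ge 0.
\end{cases}$$

Dual:

$$\max_{\alpha,\,\alpha^*} \ -\varepsilon \sum_{i=1}^{\ell} (\alpha_i + \alpha_i^*) + \sum_{i=1}^{\ell} y_i (\alpha_i - \alpha_i^*) - \frac{1}{2} \sum_{i,j=1}^{\ell} (\alpha_i - \alpha_i^*)(\alpha_j - \alpha_j^*) K(x_i, x_j)$$

subject to $\sum_i (\alpha_i - \alpha_i^*) = 0$ and $0 \le \alpha_i, \alpha_i^* \le C$.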
26. Data Description
- Real-world data were provided by ComEd (an Exelon Company)
- Hourly loads from January 1, 1999 through September 10, 2000
- Training data set: actual loads from January 1999 to January 2000
27. Software
- SVMTorch: Collobert and Bengio, IDIAP (Dalle Molle Institute for Perceptual Artificial Intelligence), Switzerland; Unix, C
- mySVM version 2.1.1: Stefan Rüping, University of Dortmund; Unix/Windows, C
- MATLAB SVM Toolbox: Steve Gunn, Department of Electronics and Computer Science, University of Southampton, United Kingdom; Unix, Matlab
28. ANN forecast
- An ANN with 32 neurons in the hidden layer was designed and tested
29. SVM model
- Radial Kernel
- Historical load data
- Information about the day of the week
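A minimal modern sketch of this model, assuming scikit-learn in place of the SVM packages listed earlier, with a hypothetical input file hourly_loads.csv and an illustrative feature construction:

```python
import numpy as np
import pandas as pd
from sklearn.svm import SVR

# Hypothetical input file: one row per hour with "timestamp" and "load" columns.
df = pd.read_csv("hourly_loads.csv", parse_dates=["timestamp"])

# Features: the previous 24 hourly loads plus a day-of-week indicator.
lags = np.column_stack([df["load"].shift(k) for k in range(1, 25)])
dow = df["timestamp"].dt.dayofweek.to_numpy()[:, None]
X = np.hstack([lags, dow])[24:]
y = df["load"].to_numpy()[24:]

# Radial (RBF) kernel, as on this slide; C and epsilon are discussed next.
model = SVR(kernel="rbf", C=10.0, epsilon=0.1).fit(X, y)
next_hours = model.predict(X[-24:])   # rough sketch of a next-day forecast
```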
30. SVM forecast
31. Parameter ε
- A rigorous choice of ε is still an open issue
32. Parameter C
- The parameter C controls the VC-dimension of the learning machine
- SRM can be employed on the set of functions defined by the parameter C
33. SRM principle implementation
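A hedged sketch of how such a capacity search over C (and ε) can be implemented; the grid values, the synthetic data, and the use of scikit-learn's GridSearchCV are assumptions standing in for the procedure used in the original work:

```python
import numpy as np
from sklearn.svm import SVR
from sklearn.model_selection import GridSearchCV, TimeSeriesSplit

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 25))                  # hypothetical load features
y = X[:, 0] + 0.1 * rng.normal(size=500)

# Larger C allows higher effective capacity; the search keeps the value
# that generalizes best on held-out (later-in-time) data.
search = GridSearchCV(
    SVR(kernel="rbf"),
    param_grid={"C": [0.1, 1.0, 10.0, 100.0], "epsilon": [0.01, 0.1, 0.5]},
    cv=TimeSeriesSplit(n_splits=5),             # respects temporal order
)
search.fit(X, y)
print(search.best_params_)
```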
34. Comparison of model performance
35. Comparison of model performance (continued)
36. Results
- Quadratic optimization is at the core of the method: the final result is unique
- Tractable solution: the support vectors are the crucial training points
- All useful information contained in the data set is summarized by the support vectors
- Solid and general theoretical foundation
- Built-in capabilities for model selection
37. SVM shortcomings
- Computationally slower than neural networks
- Require the choice of regularization parameters
- Uncertainty in the choice of kernel
38. Findings
- The SVM load forecasting model was designed and tested
- Structural risk minimization was used in model selection
- Approaches to the choice of regularization parameters were proposed and tested
39. Challenging Issues
- Design of an SVM model for a higher embedding dimension
- The choice of a kernel function
- Computational expense
- Design of an SVM load forecasting model for the full set of input parameters
40. Conclusion
- The method of structural risk minimization provides a powerful procedure for learning machine design
- The SVM is a promising nonlinear regression technique
- The notion of VC-dimension is elegant, theoretically solid, and constructive
- Application of the SV method gives promising results