Predictive Horse Race Handicapping Using Neural Networks: Yet Another Progress Report

Slides: 20

Provided by: starbase

1
Predictive Horse Race Handicapping Using Neural
Networks: Yet Another Progress Report

Andrew Schurr
Advisor: Ralph Morelli

2
Overall Goal

Use a neural network, trained with past
performance data, to predict future horse races.

3
Picking Horses

The outcome of a horse race is determined by many
factors.
If we know which factors are important, we can
predict which horse is favored to win.
This raw information is available as past
performance data.

Factors
Previous race times
Post position
Jockey win/loss record
Previous stakes
Type of runner
Breeding
etc.

4
... with Neural Networks

By using a computerized neural network, we can
sift through a large amount of computerized
past-performance data, and identify which factors
influence a winning horse
Neural network
Interconnected nodes that fire when stimulated
Can learn through training, gradually adjusting
the level at which they fire
Good at figuring out how different variables
influence each other
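
To make the idea of a node that "fires" and learns through training concrete, here is a toy sketch of a single sigmoid node. It is not the network used in this project (which was built with a Java library); the training data (logical OR), the learning rate, and all function names are assumptions chosen purely for illustration.

```python
import math

def sigmoid(x):
    """Squash any input into the range (0, 1): the node's firing level."""
    return 1.0 / (1.0 + math.exp(-x))

def fire(weights, bias, inputs):
    """Output of one node for one set of inputs."""
    return sigmoid(sum(w * x for w, x in zip(weights, inputs)) + bias)

def train_neuron(samples, epochs=2000, rate=0.5):
    """Train one sigmoid node; returns (weights, bias)."""
    weights = [0.0] * len(samples[0][0])
    bias = 0.0
    for _ in range(epochs):
        for inputs, target in samples:
            error = target - fire(weights, bias, inputs)
            # Nudge each weight, and the bias that sets the firing
            # threshold, a little toward the desired output.
            weights = [w + rate * error * x for w, x in zip(weights, inputs)]
            bias += rate * error
    return weights, bias

# Hypothetical toy data: the node should fire if either input is on.
OR_SAMPLES = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]
weights, bias = train_neuron(OR_SAMPLES)
predictions = [round(fire(weights, bias, x)) for x, _ in OR_SAMPLES]
print(predictions)
```

A full network chains many such nodes together, which is what lets it pick up interactions between variables that a single node cannot represent.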

5
Topology

Inputs: Previous race times, post position, jockey win/loss
record, previous stakes, type of runner, breeding, etc.
Output: Relative fitness of the horse

6
Current Progress
- Completed -
- Pending -

Find a flexible Java library for building neural
networks
Build prototype that mimics small-scale
handicapping problem
Locate a large set of digital past performance
data
Add small set of actual data to prototype
Clean and format full set of horse data, add to
prototype
Use genetic algorithms to determine optimal
neural network configuration
Too time-consuming to be effective! Select blocks
of data based on accepted handicapping knowledge,
and combine the results
Build user interface and file loading
capabilities on top of the fully-trained network

7
Initial Trial with Limited Real Data

Input data
12 horses per race
Positions and fractional lengths for each horse's
two previous races
Speed rating for each horse
Output data
Finish position and lengths for each horse
Training
21 training races
4 test races
20,000 training cycles

8
Initial Trial Performance

Training
Training error: 17-18
Represents numerical error, not error in picking
winners
Prediction
0 correct winners picked
42% of all horses (1st, 2nd, 3rd place) correctly
predicted to show,
but none in their proper positions

9
Target Accuracy

Odds of randomly selecting a winning horse in a
12-horse race: 8%
But each trial may have as many as 12 horses or
as few as 5, padded out with null horses that
should always lose.
Randomly selecting the winner from 5 horses: 20%
A reasonable random accuracy should reflect the
variation in race size.
About 10% to 13%
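
The baseline above can be sanity-checked with a few lines of arithmetic. This sketch averages the 1-in-n chance of a random pick over a set of field sizes. The uniform spread of field sizes from 5 to 12 is an assumption made for illustration; the real distribution of race sizes in the data set is not stated here.

```python
def random_win_rate(field_sizes):
    """Average chance of picking the winner at random across races."""
    return sum(1.0 / n for n in field_sizes) / len(field_sizes)

# Assumption: one race at each field size from 5 to 12 horses.
uniform_sizes = list(range(5, 13))
baseline = random_win_rate(uniform_sizes)

print(f"12-horse race: {random_win_rate([12]):.1%}")
print(f"5-horse race:  {random_win_rate([5]):.1%}")
print(f"mixed fields:  {baseline:.1%}")
```

Under that assumption the mixed-field figure lands inside the 10% to 13% range quoted above; a data set dominated by large fields would sit nearer the low end.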

10
Full Data Set Trial

Input data
12 horses per race
All available data in Race, Entry, Horse Past
Performance data files
2,749 total inputs, 228 variables per horse, 13
race-specific variables
Output data
Finish position and lengths for each horse
Training
400 training races
50 test races
2,000 training cycles (about 15 hours)
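
The input count follows from the layout: 12 horse slots at 228 variables each, plus 13 race-specific variables, gives 2,749 inputs. The sketch below builds such a fixed-length vector and pads short fields with null horses. Encoding a null horse as all zeros and placing the race variables first are assumptions for illustration, as are the function and constant names.

```python
HORSE_SLOTS = 12       # every race is presented as 12 horses
VARS_PER_HORSE = 228   # variables per horse
RACE_VARS = 13         # race-specific variables

def build_input(race_vars, horses):
    """Flatten one race into a fixed-length input vector."""
    if len(race_vars) != RACE_VARS:
        raise ValueError("expected 13 race-specific variables")
    if len(horses) > HORSE_SLOTS:
        raise ValueError("at most 12 horses per race")
    vector = list(race_vars)
    for horse in horses:
        if len(horse) != VARS_PER_HORSE:
            raise ValueError("expected 228 variables per horse")
        vector.extend(horse)
    # Assumption: a null horse is encoded as all zeros.
    null_horse = [0.0] * VARS_PER_HORSE
    for _ in range(HORSE_SLOTS - len(horses)):
        vector.extend(null_horse)
    return vector

# A hypothetical 5-horse race with placeholder values.
example = build_input([0.0] * RACE_VARS,
                      [[1.0] * VARS_PER_HORSE for _ in range(5)])
print(len(example))
```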

11
Full Data Set Trial Performance

Training
Training error: 1,700
Unable to find any kind of solution approaching a
real answer
Prediction
Completely random
Useless!

12
Pared-down Data Set Trial

Input data
12 horses per race
2 separate neural networks
Network 1: Horse Past Performance times, lengths,
finish positions, odds, and speed ratings for
past 3 races
Network 2: Horse win/loss record on various
surfaces, jockey win/loss, trainer win/loss,
various other pieces of data
Output data
Finish position and lengths for each horse,
compared between the two networks' results
Training
400 training races
50 test races
200 training cycles (about 1 hour)

13
Pared-down Data Set Performance

Training
Training error: Net 1: 21, Net 2: 18
Again, relates to mathematical error, not winners
picked
Prediction
Net 1
Winners picked: 11/50, 22%
1st Horse Win/Place/Show: 29/50, 58%
Clear winners predicted: 7/18, 39%
Net 2
Winners picked: 11/50, 22%
1st Horse Win/Place/Show: 27/50, 54%
Clear winners predicted: 6/19, 32%

14
What's a clear winner?

The network outputs a numerical estimate of where
each horse will finish, not an actual place number.
So a horse could be predicted to finish in place
0.45765, not 1st place.
Take three horses:
Horse 1: predicted finish 0.2155 (1st place)
Horse 2: predicted finish 1.6020 (2nd place)
Horse 3: predicted finish 2.1985 (3rd place)
There is a large gap between first and second
place, and a much smaller gap between second and
third. We interpret this to mean that the
network believes Horse 1 will win by a large
margin, followed by Horses 2 and 3 finishing very
close to each other.
Hence, Horse 1 is a clear winner.
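
One way to turn this interpretation into a rule is to compare the gap between the two lowest predicted finishes against a threshold. The sketch below does that; the threshold value `min_gap=1.0` and the function name are assumptions for illustration, since the slides do not state how large a gap has to be to count as clear.

```python
def clear_winner(predicted, min_gap=1.0):
    """Return the predicted winner if it leads the runner-up by at
    least min_gap (a hypothetical threshold), otherwise None."""
    ranked = sorted(predicted.items(), key=lambda item: item[1])
    if len(ranked) < 2:
        return None
    (winner, best), (_, runner_up) = ranked[0], ranked[1]
    return winner if runner_up - best >= min_gap else None

example = {"Horse 1": 0.2155, "Horse 2": 1.6020, "Horse 3": 2.1985}
print(clear_winner(example))

# Without Horse 1, the remaining gap (about 0.60) is too small.
close_race = {"Horse 2": 1.6020, "Horse 3": 2.1985}
print(clear_winner(close_race))
```

With the three horses above, the first-to-second gap is about 1.39 and the second-to-third gap is about 0.60, so only the first counts as clear under this threshold.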

15
Combined Performance

Results taken together
Both networks agreed on correct winner:
4/13, 31%
Both networks agreed on correct winner, and both
predict it to be a clear winner:
2/4, 50%
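
Taking the results together amounts to keeping a pick only when both networks name the same horse. A minimal sketch of that filter follows; the function names and the predicted values are hypothetical, and the lowest predicted finish is treated as each network's pick.

```python
def pick_winner(predicted):
    """The horse with the lowest predicted finish is the network's pick."""
    return min(predicted, key=predicted.get)

def consensus_pick(net1, net2):
    """Return the winner only if both networks choose the same horse."""
    first, second = pick_winner(net1), pick_winner(net2)
    return first if first == second else None

# Hypothetical outputs from the two networks for one race.
net1_out = {"A": 0.4, "B": 1.9, "C": 2.6}
net2_out = {"A": 0.9, "B": 1.2, "C": 3.1}
net2_split = {"A": 1.5, "B": 1.2, "C": 3.1}

print(consensus_pick(net1_out, net2_out))
print(consensus_pick(net1_out, net2_split))
```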

16
Analysis

50% is pretty good, right?
But wait: the combined networks produced only 4
such predictions.
4 trials is far too small a sample size to base
any kind of long-term projection on.
What if the random sequence of correct
predictions for a larger sample went like this:
Right, Wrong, Wrong, Right, Wrong, Wrong, Wrong,
Wrong, Right, Wrong, Wrong, Wrong
If we looked at only the first four, our
prediction rate is 50%, but if we look at the
larger sample, it falls to 25%.
Conclusion: It's encouraging, but we need more
sample trials!
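
The point about sample size can be checked numerically. This sketch computes the hit rate for the first four and for all twelve outcomes in the sequence above, then asks how often a predictor whose true rate is 25% would still get at least 2 of 4 right by luck. The 25% true rate is taken from the hypothetical sequence, not from measured results.

```python
from math import comb

# The hypothetical sequence from the slide: True = right, False = wrong.
outcomes = [True, False, False, True, False, False,
            False, False, True, False, False, False]

def hit_rate(results):
    """Fraction of correct predictions."""
    return sum(results) / len(results)

def prob_at_least(k, n, p):
    """Binomial probability of k or more successes in n trials."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

first_four = hit_rate(outcomes[:4])
all_twelve = hit_rate(outcomes)
lucky = prob_at_least(2, 4, 0.25)

print(first_four, all_twelve, round(lucky, 3))
```

A 25% predictor reaches 2 of 4 a little over a quarter of the time, so the 50% figure on its own says little.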

17
To Do

Incorporate more of the data, possibly in more
separate networks
Over half of the original data was culled between
the second and third trials. Some of it could
still be valuable.
Run networks for more learning epochs
Currently at 200; see if 2,000 or 20,000 training
epochs increase accuracy
Find a way to test the 50% combined figure on a larger
sample set.
Buy more data
Cycle the data, so that different trials are in
training and test sets
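
Cycling the data in this way resembles k-fold cross-validation. A minimal sketch, assuming the 450 races (400 training plus 50 test) are held in one list and rotated in blocks of 50; the function name and the integer race IDs are placeholders.

```python
def rotate_folds(races, test_size=50):
    """Yield (train, test) splits so each block of races is tested once."""
    for start in range(0, len(races), test_size):
        test = races[start:start + test_size]
        train = races[:start] + races[start + test_size:]
        yield train, test

# Placeholder IDs standing in for the 450 races.
race_ids = list(range(450))
folds = list(rotate_folds(race_ids))

for train, test in folds:
    print(len(train), len(test), test[0])
```

This gives 9 train/test splits from the same data, so the combined-network figure could be measured on all 450 races rather than on a single block of 50.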

18
Questions?
19
Sources

Neural Networks for Fun and Profit, Bret Halford,
http://csel.cs.colorado.edu/cs3202/papers/Bret_Halford.html
An introduction to neural networks, Andrew Blais
and David Mertz, IBM developerWorks,
http://www-106.ibm.com/developerworks/library/l-neural/
Expert Prediction, Symbolic Learning, and Neural
Networks: An Experiment on Greyhound Racing, H.
Chen, P. Buntin, L. She, S. Sutjahjo, C. Sommer,
D. Neely, http://ai.bpa.arizona.edu/papers/dog93/dog93.html
Using Machine Learning To Predict the results Of
sporting matches, Michael Baulch,
http://innovexpo.itee.uq.edu.au/2001/projects/s348234/thesis.pdf
Albrecht, http://www.teco.uni-karlsruhe.de/albrecht/neuro/html/
Handicappers Daily, ITS Inc., http://www.itsdata.com/
Betting Thoroughbreds, Steven Davidowitz, First
Plume Printing, 1997
