Title: The Research on Algorithms of Estimating Photometric Redshifts Using SDSS Galaxy Data
1Chinese Virtual Observatory
The Research on Algorithms of Estimating
Photometric Redshifts Using SDSS Galaxy Data
Wang Dan China-VO group
2Outline
- Background
- Various algorithms
- Comparison
- Summary
3Background
- The redshift of a galaxy is measured spectroscopically
- For large samples of faint galaxies, spectra are neither quick nor easy to obtain
- The photometric redshift technique relies on medium- or broad-band color features
- Photometric redshifts are regarded as an efficient and effective tool for studying the statistical properties of galaxies and their evolution.
4Methods
- Template fitting approach
- Real observation (CWW)
- Population synthesis models (Bruzual Charlot)
- Training set approach
- Artificial Neural Networks (ANNs)
- Support Vector Machines ( SVMs)
- Multivariable Polynomial Regression (MPR)
- Color-Magnitude-Redshift Relation (CMR)
- Nonparametric Regression
5Hyperz
\chi^2(z) = \sum_i \left[ \frac{F_{obs,i} - b \, F_{temp,i}(z)}{\sigma_i} \right]^2
where F_{obs,i}, F_{temp,i} and \sigma_i are the observed and template fluxes and their uncertainty in filter i, respectively, and b is a normalization constant.
- Does not rely on having any spectroscopic redshifts; needs only a few templates.
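The fit between observed and template fluxes, with b as a free normalization, can be sketched as a chi-square minimization over a grid of trial redshifts. This is a minimal illustration, not Hyperz itself: the chi-square form is the standard weighted sum of squared flux residuals, and `template_grid`, the toy fluxes and the function names are all hypothetical.

```python
def best_scale(f_obs, f_temp, sigma):
    """Normalization b that minimizes chi^2 for one template (closed form)."""
    num = sum(fo * ft / s ** 2 for fo, ft, s in zip(f_obs, f_temp, sigma))
    den = sum(ft * ft / s ** 2 for ft, s in zip(f_temp, sigma))
    return num / den


def chi2(f_obs, f_temp, sigma, b):
    """Sum over filters of ((F_obs - b * F_temp) / sigma)^2."""
    return sum(((fo - b * ft) / s) ** 2 for fo, ft, s in zip(f_obs, f_temp, sigma))


def fit_redshift(f_obs, sigma, template_grid):
    """Return (z, chi2, b) for the trial redshift with the smallest chi^2.

    template_grid is a hypothetical dict: trial redshift -> template fluxes
    per filter, already redshifted and integrated through the filters.
    """
    best = None
    for z, f_temp in template_grid.items():
        b = best_scale(f_obs, f_temp, sigma)
        c = chi2(f_obs, f_temp, sigma, b)
        if best is None or c < best[1]:
            best = (z, c, b)
    return best


# Toy data (invented): one template evaluated at three trial redshifts.
template_grid = {
    0.1: [1.0, 2.0, 3.0, 3.5, 3.8],
    0.2: [0.8, 1.5, 3.0, 4.0, 4.5],
    0.3: [0.5, 1.0, 2.5, 4.0, 5.0],
}
f_obs = [1.6, 3.0, 6.0, 8.0, 9.0]   # exactly twice the z = 0.2 template
sigma = [0.1] * 5
z_best, chi2_min, b_best = fit_redshift(f_obs, sigma, template_grid)
```

Solving for b analytically at each trial redshift keeps the search one-dimensional; a real code would also loop over several templates and a much finer redshift grid.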
6ANNs
[Figure: ANN topology, showing the input layer, hidden layer and output layer]
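The three-layer topology named on this slide can be sketched as a single forward pass. This is only an illustration under assumptions: the five inputs (taken here to be SDSS ugriz magnitudes), the four hidden units, the tanh activation and the random, untrained weights are not given in the slides.

```python
import math
import random


def forward(x, w_hidden, b_hidden, w_out, b_out):
    """One forward pass: input layer -> tanh hidden layer -> linear output."""
    hidden = [
        math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(w_hidden, b_hidden)
    ]
    return sum(w * h for w, h in zip(w_out, hidden)) + b_out


# Hypothetical sizes and untrained random weights, for illustration only.
random.seed(0)
n_in, n_hidden = 5, 4
w_hidden = [[random.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
b_hidden = [0.0] * n_hidden
w_out = [random.uniform(-0.5, 0.5) for _ in range(n_hidden)]
b_out = 0.1

magnitudes = [19.2, 18.1, 17.5, 17.1, 16.9]   # invented u, g, r, i, z values
z_pred = forward(magnitudes, w_hidden, b_hidden, w_out, b_out)
```

In practice the weights would be fitted on a training set with spectroscopic redshifts, and the inputs would normally be rescaled before entering the network.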
7SVMs
Principle: [text not recoverable from the source encoding]
9MPR
- Generates logical relationships between several independent variables and a dependent variable
- Requires a training set containing the values of the independent and dependent variables
- MPR performs the regression and presents the result as a mathematical expression
- The more complete and representative the training data we provide, the more accurate the redshift estimates will be
- The result is easy to communicate to astronomers.
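A regression of this kind can be sketched as an ordinary least-squares fit of a polynomial in several variables. The sketch below is an assumption-laden toy, not the presenter's implementation: the degree (2), the two colour-like inputs, the synthetic redshift relation and all function names are hypothetical.

```python
from itertools import combinations_with_replacement


def poly_features(x, degree):
    """All monomials of the inputs up to `degree`, starting with the constant 1."""
    feats = [1.0]
    for d in range(1, degree + 1):
        for idx in combinations_with_replacement(range(len(x)), d):
            term = 1.0
            for i in idx:
                term *= x[i]
            feats.append(term)
    return feats


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def fit_mpr(xs, zs, degree=2):
    """Least-squares polynomial coefficients via the normal equations."""
    rows = [poly_features(x, degree) for x in xs]
    k = len(rows[0])
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    atz = [sum(r[i] * z for r, z in zip(rows, zs)) for i in range(k)]
    return solve(ata, atz)


def predict_mpr(coeffs, x, degree=2):
    """Evaluate the fitted polynomial at x."""
    return sum(c * f for c, f in zip(coeffs, poly_features(x, degree)))


# Synthetic training set (invented): z = 0.1 + 0.2*c1 + 0.05*c2^2.
grid = [0.0, 0.5, 1.0, 1.5]
xs = [[c1, c2] for c1 in grid for c2 in grid]
zs = [0.1 + 0.2 * c1 + 0.05 * c2 ** 2 for c1, c2 in xs]
coeffs = fit_mpr(xs, zs)
z_est = predict_mpr(coeffs, [0.7, 1.2])
```

The fitted `coeffs` are the "mathematical expression" in this toy: one coefficient per monomial, which can be written down and shared directly.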
11CMR
- The r-band magnitude range is divided into 7 subsections
- CMRI and CMRII are built for each sub-sample: CMRI uses the u-g-r color matrices, and CMRII uses the g-r-i color matrices
- CMRI and CMRII are each divided into 400 × 400 bins.
- The median redshift of a bin is computed if the number of galaxies in it exceeds 25
- This yields a color-redshift matrix, from which the redshifts are then computed
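The binning steps listed here can be sketched as a lookup table of median redshifts over a two-colour grid. This is a toy under assumptions: the colour range (-1 to 3), the function names and the data are invented, and only one matrix is built, whereas the slide describes one per magnitude subsection.

```python
from collections import defaultdict
from statistics import median


def bin_index(value, lo, hi, n_bins):
    """Map a colour value to a bin number, clamping values outside [lo, hi)."""
    i = int((value - lo) / (hi - lo) * n_bins)
    return min(max(i, 0), n_bins - 1)


def build_matrix(colors_a, colors_b, redshifts, lo, hi, n_bins=400, min_count=25):
    """Median redshift per (colour, colour) cell, kept only where count > min_count."""
    cells = defaultdict(list)
    for a, b, z in zip(colors_a, colors_b, redshifts):
        key = (bin_index(a, lo, hi, n_bins), bin_index(b, lo, hi, n_bins))
        cells[key].append(z)
    return {key: median(zs) for key, zs in cells.items() if len(zs) > min_count}


def lookup(matrix, a, b, lo, hi, n_bins=400):
    """Redshift estimate for a galaxy, or None if its cell is too sparsely sampled."""
    key = (bin_index(a, lo, hi, n_bins), bin_index(b, lo, hi, n_bins))
    return matrix.get(key)


# Toy training sample (invented): 30 galaxies in one cell, 10 in another.
colors_a = [0.501] * 30 + [1.2] * 10
colors_b = [0.302] * 30 + [0.9] * 10
redshifts = [0.10 + 0.001 * i for i in range(30)] + [0.3] * 10

COLOR_MIN, COLOR_MAX = -1.0, 3.0   # assumed colour range
matrix = build_matrix(colors_a, colors_b, redshifts, COLOR_MIN, COLOR_MAX)
z_dense = lookup(matrix, 0.501, 0.302, COLOR_MIN, COLOR_MAX)
z_sparse = lookup(matrix, 1.2, 0.9, COLOR_MIN, COLOR_MAX)
```

Returning `None` for sparse cells makes the coverage limit of the training set explicit rather than silently extrapolating.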
13Nonparametric Regression
- Requires no (or very little) a priori knowledge
- Selecting an appropriate bandwidth (smoothing parameter) is a key part of nonparametric regression fitting
- In the regression formula, c is the training sample, ci is the test sample, and h is the bandwidth.
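The role of the bandwidth can be illustrated with a kernel-weighted average. The slides do not say which estimator was used, so this sketch assumes a Nadaraya-Watson estimator with a Gaussian kernel; the one-dimensional colour input, the linear toy relation and the function name are hypothetical.

```python
import math


def kernel_regress(x_query, x_train, z_train, h):
    """Gaussian-kernel weighted mean of training redshifts around x_query.

    h is the bandwidth: small h follows the local data closely,
    large h approaches the global mean of the training redshifts.
    """
    weights = [
        math.exp(-0.5 * sum((q - t) ** 2 for q, t in zip(x_query, x)) / h ** 2)
        for x in x_train
    ]
    total = sum(weights)
    return sum(w * z for w, z in zip(weights, z_train)) / total


# Toy training set (invented): one colour, with z = 0.2 * colour.
x_train = [[0.1 * i] for i in range(21)]        # colours from 0.0 to 2.0
z_train = [0.2 * x[0] for x in x_train]

z_narrow = kernel_regress([0.5], x_train, z_train, h=0.05)   # local estimate
z_wide = kernel_regress([0.5], x_train, z_train, h=1e6)      # global mean
```

With the narrow bandwidth the estimate stays near the local value 0.1, while the very wide bandwidth returns roughly the sample mean 0.2, which is why the choice of h matters. The sketch does not guard against all weights underflowing to zero for queries far from the training data.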
14Selection of the Bandwidth
15 Bandwidth versus redshift
16Accuracies of Different Methods
- CWW 0.0666
- Bruzual - Charlot 0.0552
- ANNs 0.0229
- SVMs 0.027
- CMR 0.032
- Nonparametric Regression 0.0236
- MPR 0.0256
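The slide does not state which accuracy metric these numbers represent. A common choice for photometric redshifts is the root-mean-square difference between photometric and spectroscopic redshifts, sketched below as an assumption; the function name and the sample values are invented.

```python
import math


def rms_error(z_phot, z_spec):
    """Root-mean-square of (z_phot - z_spec) over a test sample."""
    n = len(z_spec)
    return math.sqrt(sum((p - s) ** 2 for p, s in zip(z_phot, z_spec)) / n)


# Invented test sample of four galaxies.
z_spec = [0.10, 0.20, 0.30, 0.40]
z_phot = [0.12, 0.18, 0.33, 0.39]
rms = rms_error(z_phot, z_spec)
```

Evaluating every method on the same held-out test sample with the same metric is what makes a side-by-side list like the one above comparable.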
17Summary
- Empirical photometric redshift estimators rely on the existence of a sufficiently large and representative training set
- They have difficulty extrapolating to regions that are not well sampled by the training data.
- They are well suited to problems that require the redshift distribution rather than accurate redshifts of individual galaxies
18 Prospect
- As large and deep sky survey projects are carried out, larger and more representative samples will be obtained.
- The development of new statistical analysis algorithms.
- Feature selection/extraction during data preprocessing
- More ensemble algorithms (e.g. least-squares SVMs).