Statistical Cross-Matching Across Distributed Archives

1
Statistical Cross-Matching Across Distributed
Archives
  • H.-M. Adorf & GAVO Team
  • MPI f. extraterrestrische Physik
  • adorf@mpe.mpg.de

2
Statistical cross-matching
  • Cross-matching of astrometric and photometric
    catalogues
      • core functionality of a virtual observatory
  • Operational modes
      • on an area of the sky
      • using an input catalogue (GAVO matcher)

3
Philosophy
  • Build a cross-matcher application that
      • should be usable by scientists and help produce
        science results
      • uses what's there and what works now
      • doesn't get stopped by a missing standard
  • Support the VO process by
      • helping to generate appropriate VO-standards
      • adopting new VO-standards whenever feasible

4
Querying remote archives
  • Movie

5
Querying remote archives
  • Movie
  • Using up to 10 servers
      • distributed around the world
      • operating in parallel
  • Sneak preview of grid computing
      • Locally specify your tasks
      • Execute them remotely at the data centers
      • Receive results locally for final combination
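
The pattern on this slide (specify tasks locally, run them in parallel, combine the results locally) can be sketched roughly as below. This is only an illustration, not the GAVO matcher's code: `query_archive` is a hypothetical stand-in that returns canned rows instead of contacting a server, and the archive names are placeholder labels.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical canned responses standing in for remote archive servers.
FAKE_ARCHIVES = {
    "archive_a": [("src1", 10.0010, -5.0000)],
    "archive_b": [("src1", 10.0020, -5.0010), ("src2", 11.5000, 2.0000)],
}

def query_archive(name):
    """Hypothetical remote query: returns (archive, source_id, ra_deg, dec_deg) rows."""
    return [(name,) + row for row in FAKE_ARCHIVES[name]]

def query_all(names, max_workers=10):
    """Run one query per archive in parallel and combine the results locally."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        per_archive = list(pool.map(query_archive, names))  # keeps input order
    # Final combination happens on the local machine.
    return [row for rows in per_archive for row in rows]

combined = query_all(["archive_a", "archive_b"])
```

A thread pool is chosen here because waiting on remote servers is I/O-bound; the cap of 10 workers simply mirrors the "up to 10 servers" on the slide.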

6
Software demo (1)
  • Input list
      • 67 galaxies from FIRST radio catalogue
  • Query
      • 2 remote archives: SDSS, VizieR
      • 20 catalogues: radio, infrared, optical, X-ray
  • Task
      • get counterparts for each input coordinate
      • gather counterparts to form reasonable matches
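
The first task step, getting counterparts for an input coordinate, might look like the sketch below. It assumes a plain search radius around each input position and an in-memory catalogue. The slides do not say how the matcher selects counterparts, so the function names, the `ra`/`dec` keys and the 5 arcsec radius are illustrative assumptions.

```python
import math

def angular_separation_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees (haversine formula); inputs in degrees."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    h = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(math.sqrt(h)))

def counterparts(input_pos, catalogue, radius_arcsec):
    """Return catalogue rows lying within radius_arcsec of the input position."""
    ra0, dec0 = input_pos
    radius_deg = radius_arcsec / 3600.0
    return [row for row in catalogue
            if angular_separation_deg(ra0, dec0, row["ra"], row["dec"]) <= radius_deg]

# Made-up catalogue rows for illustration.
catalogue = [
    {"id": "a", "ra": 150.0000, "dec": 2.0000},
    {"id": "b", "ra": 150.0005, "dec": 2.0003},   # roughly 2 arcsec away
    {"id": "c", "ra": 150.1000, "dec": 2.0000},   # roughly 6 arcmin away
]
found = counterparts((150.0, 2.0), catalogue, radius_arcsec=5.0)
```

With a generous radius this step deliberately over-collects; deciding which counterparts form a reasonable match is left to the statistical step.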

7
The matching problem (1)

8
The matching problem (2)

9
Matcher workflow

10
Metadata
  • Querying and cross-matching require metadata
    about catalogues & archives
      • astrometric fields and associated uncertainties
      • photometric fields and associated uncertainties
  • Some metadata
      • are locally generated and stored
      • are retrieved from archives in real-time

11
Software demo (2)
  • Issue: false alarms
      • matching is non-unique
      • input: 67 sources
      • output: almost 500 match candidates
      • many of these match candidates are false alarms

12
Issue: false alarms (3)
  • Two fundamental, independent probabilities
      • Hit probability p(c | C)
      • False alarm probability p(c | not C)
  • Goal
      • keep the hit probability high (completeness)
      • while keeping the false alarm probability low
  • goodness depends on S/N ratio in the data
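
As a rough illustration of the two probabilities, the sketch below estimates them from candidates whose truth is known. It reads the first as "a candidate is accepted, given that it is a true counterpart" and the second as "a candidate is accepted, given that it is not", which is an interpretation, not something the slide spells out. The truth labels are hypothetical (for example from a simulation), since real data do not come with them.

```python
def rates(candidates):
    """Estimate hit and false-alarm rates from labelled candidates.

    Each candidate is a pair (accepted, is_true_counterpart). The truth
    labels are hypothetical here, e.g. taken from a simulation.
    """
    true_total = sum(1 for _, truth in candidates if truth)
    false_total = sum(1 for _, truth in candidates if not truth)
    hits = sum(1 for acc, truth in candidates if acc and truth)
    false_alarms = sum(1 for acc, truth in candidates if acc and not truth)
    hit_rate = hits / true_total if true_total else float("nan")
    fa_rate = false_alarms / false_total if false_total else float("nan")
    return hit_rate, fa_rate

# Made-up labels: 3 true counterparts (2 accepted), 5 spurious (1 accepted).
labelled = [(True, True), (True, True), (False, True), (True, False),
            (False, False), (False, False), (False, False), (False, False)]
hit_rate, fa_rate = rates(labelled)
```

The two rates have different denominators, so one can be raised or lowered without a fixed effect on the other.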

13
Issue: false alarms (4)
  • Solution: use statistics (fuzzy matching)
      • compute statistical (Mahalanobis) distance
        between counterparts and center position
      • compute reliability measure for match candidate
        (reduced chi-squared)
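
One way to make these two quantities concrete is sketched below. The slides do not give formulas, so the details are assumptions made for illustration: positions are tangent-plane offsets in arcsec, each counterpart has a single isotropic positional error `sigma` (so the covariance is diagonal), the center is the inverse-variance weighted mean, and the reduced chi-squared is the summed squared distance divided by 2(n - 1) degrees of freedom.

```python
def weighted_centre(points):
    """Inverse-variance weighted mean position.

    points: list of (x, y, sigma) in arcsec, with one isotropic sigma per
    point (an assumption that keeps the covariance diagonal).
    """
    weights = [1.0 / s ** 2 for _, _, s in points]
    wsum = sum(weights)
    cx = sum(w * x for w, (x, _, _) in zip(weights, points)) / wsum
    cy = sum(w * y for w, (_, y, _) in zip(weights, points)) / wsum
    return cx, cy

def mahalanobis_sq(point, centre):
    """Squared Mahalanobis distance of one counterpart from the centre."""
    x, y, s = point
    cx, cy = centre
    return ((x - cx) ** 2 + (y - cy) ** 2) / s ** 2

def reduced_chi_squared(points):
    """Summed squared distances divided by the degrees of freedom 2(n - 1)."""
    if len(points) < 2:
        raise ValueError("need at least two counterparts")
    centre = weighted_centre(points)
    chi2 = sum(mahalanobis_sq(p, centre) for p in points)
    return chi2 / (2 * (len(points) - 1))

# Made-up candidates: one tight group and one widely scattered group.
tight = [(0.0, 0.0, 1.0), (0.5, 0.0, 1.0), (0.0, 0.5, 1.0)]
loose = [(0.0, 0.0, 1.0), (8.0, 0.0, 1.0), (0.0, 8.0, 1.0)]
r_tight = reduced_chi_squared(tight)
r_loose = reduced_chi_squared(loose)
```

Because distances are scaled by the quoted errors, a counterpart with a large positional uncertainty can sit further from the center without inflating the measure, which is what makes the matching "fuzzy".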

14
Software demo (3)
  • Lower reduced chi-squared limit from 10,000 to 3

15
Software demo (3)
  • Lower reduced chi-squared limit from 10,000 to 3
  • Result
      • Hit-rate is still pretty high
      • False-alarm rate is dramatically reduced
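
The effect of tightening the threshold can be shown with a small filter. The candidate values below are made up for illustration and are not results from the demo; the `red_chi2` key is a hypothetical field name.

```python
def filter_candidates(candidates, chi2_limit):
    """Keep candidates whose reduced chi-squared does not exceed the limit."""
    return [c for c in candidates if c["red_chi2"] <= chi2_limit]

# Made-up values for illustration only; they are not results from the demo.
candidates = [
    {"id": 1, "red_chi2": 0.4},
    {"id": 2, "red_chi2": 2.7},
    {"id": 3, "red_chi2": 45.0},
    {"id": 4, "red_chi2": 5200.0},
]
kept_at_10000 = filter_candidates(candidates, chi2_limit=10_000)
kept_at_3 = filter_candidates(candidates, chi2_limit=3)
```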

16
Issue: server reliability
  • An archive server
      • may be down (easy to detect)
      • may be slow today (more difficult to detect)
      • may deliver wrong results (spoils the science)
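
A simple health check covering the three cases might look like the sketch below. It is an assumption-laden illustration and not part of the presented software: `query` is a hypothetical zero-argument callable that performs a request, `validate` is a hypothetical check of the response against a known answer (for example a reference query whose result is known in advance), and the 0.5 s threshold for "slow" is arbitrary.

```python
import time

def check_server(query, validate=None, slow_s=0.5):
    """Classify a server as 'down', 'wrong', 'slow' or 'ok'.

    query:    hypothetical zero-argument callable performing the request
    validate: optional callable returning True if the response looks right
    slow_s:   response time in seconds above which the server counts as slow
    """
    start = time.monotonic()
    try:
        result = query()
    except Exception:
        return "down"          # the request failed outright
    elapsed = time.monotonic() - start
    if validate is not None and not validate(result):
        return "wrong"         # answered, but not with the expected result
    return "slow" if elapsed > slow_s else "ok"

# Fake servers for illustration.
def healthy():
    return 42

def sluggish():
    time.sleep(0.05)
    return 42

def broken():
    raise ConnectionError("server unreachable")

statuses = [
    check_server(healthy, validate=lambda r: r == 42),
    check_server(sluggish, slow_s=0.01),
    check_server(broken),
    check_server(healthy, validate=lambda r: r == 0),
]
```

Wrong results are the hardest case: they can only be caught when something about the expected answer is known beforehand, which is why `validate` is optional.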

17
VO Standards
  • Status
  • Input
      • CSV files for data
      • XML files for query & match process description
  • Sending plain HTTP/HTML to archive servers
  • Receiving
      • CSV file from SDSS SkyServer
      • VOTable from VizieR (VO-Std)
  • Output
      • VOTable with complete match result (VO-Std) ->
        VOPlot
      • various CSV files

18
Software demo (4)
  • VOPlot

19
Plans & Ideas
  • GUI for newcomers
      • Facilitates selection of catalogues, astrometric
        & photometric columns, etc.
      • Generates configuration file
          • for query, including server selection
          • for core cross-matcher, including chi-squared
            limit
  • Automatic monitoring of server response and
    reliability
  • Improved matching algorithm
  • GUI panel for match candidate visualization

20
Summary
  • Shown a working cross-matcher application
      • Operates with distributed archives queried in
        parallel
  • Demonstrated that
      • fuzzy matching is needed
      • reduced chi-squared is a powerful statistical
        discriminator
      • High hit-probability, low false-alarm probability
  • GAVO cross-matcher currently being used in a
    first science application

21
Thanks
  • Particularly to the folks
      • from SkyServer/SDSS, and
      • from VizieR @ CDS and @ mirror sites,
  • who, with their services, have enabled the
    cross-matcher

22
The end