Title: VO Standards and Protocols
1- VO Standards and Protocols
- XML, VOTable, UCD, Cone Search
- Roy Williams, California Institute of Technology
- NVO co-director
2XML Structured Information
<From>Antonio Stadivarius</From>
<To>Domenico Scarlatti</To>
<Date>
  <Day>13</Day>
  <Month>4</Month>
  <Year>1723</Year>
</Date>
<Body>Io bisogno una appartamento acoglienti a Cremona</Body>
Separation of structure from presentation:
4/13/23, April 13, 1723, 13.iv.1723
The computer can read the document and answer queries like this: "Find all memos from April 1723".
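A minimal sketch of such a query, using Python's standard library. The `<Memo>` wrapper element and the `is_from` helper are assumptions made for illustration; the slide shows only the child elements.

```python
import xml.etree.ElementTree as ET

# Hypothetical <Memo> wrapper: the slide shows only the child elements.
MEMO = """<Memo>
  <From>Antonio Stadivarius</From>
  <To>Domenico Scarlatti</To>
  <Date><Day>13</Day><Month>4</Month><Year>1723</Year></Date>
  <Body>Io bisogno una appartamento acoglienti a Cremona</Body>
</Memo>"""


def is_from(memo, year, month):
    """Return True if the memo's structured Date falls in the given month."""
    date = memo.find("Date")
    return (int(date.findtext("Year")) == year
            and int(date.findtext("Month")) == month)


memos = [ET.fromstring(MEMO)]
# "Find all memos from April 1723": compare fields, not formatted date strings.
april_1723 = [m for m in memos if is_from(m, 1723, 4)]
print([m.findtext("From") for m in april_1723])
```

Because the date is stored as separate Day, Month, and Year elements, the query does not depend on how the date is eventually displayed.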
3XML
- Documents and data
- Human readable, editable, mailable
- Schema constrains structure
- -- can encode data models
- Can be transformed (XSLT)
- -- other xml
- -- html/pdf/excel etc
- Tools
- Parsers in Java, C, C++, Perl, Python, ...
- Browsers and editors
- XML databases
- Binding to make API
- For serialization, mediation, brokers
4XML for science
XML is a comfortable vehicle for our metadata and data models. But the real challenge is:
To define NVO-specific data objects, and how they are used.
We need consensus more than either software or hardware.
VOTable, VOResource, services -- WSDL
5XML example(no schema)
<?xml version="1.0"?>
<BookCatalogue>
  <Book>
    <Title>The Cambridge Star Atlas</Title>
    <Author>Wil Tirion</Author>
    <ISBN>0-52156-098-5</ISBN>
    <Publisher>Cambridge UP</Publisher>
  </Book>
  <Book>
    <Title>Parallel Computing Works!</Title>
    <Author>Geoffrey C. Fox</Author>
    <Author>Roy D. Williams</Author>
    <Author>Paul C. Messina</Author>
    <ISBN>1-55860-253-4</ISBN>
    <Publisher>Morgan Kaufmann</Publisher>
  </Book>
</BookCatalogue>
6XML Parsing
SAX: Event-Based. Handlers: functions for StartElement, Text, EndElement, etc.
Found element BookCatalogue
Found element Book
Found element Title
Found text The Cambridge Star Atlas
Found end element Title
...
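The event trace above can be reproduced with a small handler. This is a sketch using Python's `xml.sax` module; the `Reporter` class name and the shortened catalogue are illustrative, not taken from the slides.

```python
import xml.sax

# Shortened copy of the book catalogue, one Book only.
CATALOGUE = b"""<?xml version="1.0"?>
<BookCatalogue>
  <Book>
    <Title>The Cambridge Star Atlas</Title>
    <Author>Wil Tirion</Author>
  </Book>
</BookCatalogue>"""


class Reporter(xml.sax.ContentHandler):
    """Records one line per SAX event instead of building a tree."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startElement(self, name, attrs):
        self.events.append("Found element " + name)

    def characters(self, content):
        # Skip the whitespace-only text between elements.
        if content.strip():
            self.events.append("Found text " + content.strip())

    def endElement(self, name):
        self.events.append("Found end element " + name)


handler = Reporter()
xml.sax.parseString(CATALOGUE, handler)
print("\n".join(handler.events[:5]))
```

A SAX parser may deliver long text in several `characters` calls, so a production handler would normally buffer text until the end-element event.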
7Parsing
DOM: Document Object Model. Returns a tree-like Document object with data attached.
[Figure: DOM tree with root BookCatalogue and two Book children; the nodes shown are Title, Author, and ISBN, carrying the text "Cambridge Star Atlas", "Wil Tirion", and "Parallel Computing Works!".]
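By contrast with SAX, a DOM parser hands back the whole tree at once. A minimal sketch with Python's `xml.dom.minidom`, using a shortened catalogue as illustrative input:

```python
from xml.dom.minidom import parseString

# Shortened copy of the book catalogue used for illustration.
CATALOGUE = """<BookCatalogue>
  <Book><Title>The Cambridge Star Atlas</Title><Author>Wil Tirion</Author></Book>
  <Book><Title>Parallel Computing Works!</Title><Author>Geoffrey C. Fox</Author></Book>
</BookCatalogue>"""

doc = parseString(CATALOGUE)
root = doc.documentElement

# Child nodes include whitespace text nodes, so keep only the elements.
books = [n for n in root.childNodes if n.nodeType == n.ELEMENT_NODE]

# The text of each Title lives in a text node attached below the element.
titles = [t.firstChild.data for t in doc.getElementsByTagName("Title")]

print(root.tagName, len(books), titles)
```

The tree can be walked in any order and revisited, at the cost of holding the full document in memory.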
8XML Schema
<?xml version="1.0"?>
<schema xmlns="http://www.w3.org/2000/10/XMLSchema"
        xmlns:cat="uri://BookCatalogue">
  <element name="BookCatalogue">
    <complexType>
      <sequence>
        <element ref="cat:Book" minOccurs="0" maxOccurs="unbounded"/>
      </sequence>
    </complexType>
  </element>
  <element name="Book">
    <complexType>
      <sequence>
        <element ref="cat:Title" minOccurs="1" maxOccurs="1"/>
        <element ref="cat:Author" minOccurs="1"/>
        <element ref="cat:Date" minOccurs="0" maxOccurs="1"/>
        <element ref="cat:ISBN" minOccurs="1" maxOccurs="1"/>
        <element ref="cat:Publisher" minOccurs="1" maxOccurs="1"/>
      </sequence>
    </complexType>
  </element>
  <element name="Title" type="string"/>
  <element name="Author" type="string"/>
  <element name="Date" type="string"/>
  <element name="ISBN" type="string"/>
  <element name="Publisher" type="string"/>
</schema>
Book.xsd: XML Schema Definition
9XSchema
<?xml version="1.0"?>
<schema xmlns="http://www.w3.org/2000/10/XMLSchema"
        xmlns:cat="uri://BookCatalogue">
  ...
</schema>
All XML schemas have schema as the root element. (The element declarations elided here are the same as in the full Book.xsd listing on the XML Schema slide.)
Book.xsd: XML Schema Definition
10XSchema
<?xml version="1.0"?>
<schema xmlns="http://www.w3.org/2000/10/XMLSchema"
        xmlns:cat="uri://BookCatalogue">
  <element name="BookCatalogue">
    <complexType>
      <sequence>
        <element ref="cat:Book" minOccurs="0" maxOccurs="unbounded"/>
      </sequence>
      <annotation>Catalog is a sequence of books</annotation>
    </complexType>
  </element>
  ...
</schema>
Default namespace declaration (xmlns="http://www.w3.org/2000/10/XMLSchema"): schema, element, complexType, and sequence all come from this standard namespace. The declarations elided here are unchanged from the full Book.xsd listing.
11XSchema
<?xml version="1.0"?>
<schema xmlns="http://www.w3.org/2000/10/XMLSchema"
        xmlns:cat="uri://BookCatalogue">
  <element name="BookCatalogue">
    <complexType>
      <sequence>
        <element ref="cat:Book" minOccurs="0" maxOccurs="unbounded"/>
      </sequence>
    </complexType>
  </element>
  <element name="Book">
    ...
  </element>
  ...
</schema>
xmlns:cat="uri://BookCatalogue": this namespace is defined here, abbreviated as "cat".
ref="cat:Book": this element comes from the namespace called cat.
<element name="Book">: the Book element is defined here.
Book.xsd: XML Schema Definition
12Namespace Content
Here uri://BookCatalogue can be abbreviated as "cat".
The cat namespace contains:
BookCatalogue, Book, Title, Author, ISBN, Date, Publisher
13XML example(with schema)
Here is the namespace that we are using in this document (xmlns):
<?xml version="1.0"?>
<BookCatalogue xmlns="uri://BookCatalogue"
               xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
               xsi:schemaLocation="uri://BookCatalogue
                                   http://www.mydomain.com/schemas/bookcatalog.xsd">
  <Book>
    <Title>The Cambridge Star Atlas</Title>
    <Author>Wil Tirion</Author>
    <ISBN>0-52156-098-5</ISBN>
    <Publisher>Cambridge UP</Publisher>
  </Book>
  <Book>
    <Title>Parallel Computing Works!</Title>
    <Author>Geoffrey C. Fox</Author>
    <Author>Roy D. Williams</Author>
    <Author>Paul C. Messina</Author>
    <ISBN>1-55860-253-4</ISBN>
    <Publisher>Morgan Kaufmann</Publisher>
  </Book>
</BookCatalogue>
Document is an instance of a W3C schema (xmlns:xsi).
Here is the URL of its schema (xsi:schemaLocation).
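One practical consequence of declaring a default namespace is that every element name becomes namespace-qualified, and queries must say so. A small sketch with Python's `xml.etree.ElementTree`; the `cat` prefix in the `NS` map is a local choice and the document is shortened for illustration.

```python
import xml.etree.ElementTree as ET

# Shortened instance document with the default namespace from the slide.
DOC = """<?xml version="1.0"?>
<BookCatalogue xmlns="uri://BookCatalogue">
  <Book>
    <Title>The Cambridge Star Atlas</Title>
    <Author>Wil Tirion</Author>
  </Book>
</BookCatalogue>"""

# Prefix-to-URI map used only by our queries; the prefix name is arbitrary.
NS = {"cat": "uri://BookCatalogue"}

root = ET.fromstring(DOC)
# ElementTree spells qualified names as {namespace-uri}local-name.
print(root.tag)

titles = [t.text for t in root.findall("cat:Book/cat:Title", NS)]
# An unqualified query finds nothing, because "Book" alone is a different name.
unqualified = root.findall("Book")
print(titles, unqualified)
```

Note that this only resolves names; ElementTree does not fetch the schema or validate the document against it.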
14VOTable
- Full metadata representation
- Hierarchy of RESOURCEs
- containing PARAMs and TABLEs
- UCD (unified content descriptor)
- a has unit meter
- a has UCD ORBIT_SIZE_SMAJ (Semi-major axis of the orbit)
- Can reference remote and/or binary streams
- Table can be
- Pure XML
- "Simple Binary"
- FITS Binary Table
15Sample VOTable
<?xml version="1.0"?>
<!DOCTYPE VOTABLE SYSTEM "http://us-vo.org/xml/VOTable.dtd">
<VOTABLE version="1.0">
  <DEFINITIONS>
    <COOSYS ID="myJ2000" equinox="2000." epoch="2000." system="eq_FK5"/>
  </DEFINITIONS>
  <RESOURCE>
    <PARAM name="Observer" datatype="char" arraysize="*" value="William Herschel">
      <DESCRIPTION>This parameter is designed to store the observer's name</DESCRIPTION>
    </PARAM>
    <TABLE name="Stars">
      <DESCRIPTION>Some bright stars</DESCRIPTION>
      <FIELD name="Star-Name" ucd="ID_MAIN" datatype="char" arraysize="10"/>
      <FIELD name="RA" ucd="POS_EQ_RA" ref="myJ2000" unit="deg"
             datatype="float" precision="F3" width="7"/>
      <FIELD name="Dec" ucd="POS_EQ_DEC" ref="myJ2000" unit="deg"
             datatype="float" precision="F3" width="7"/>
      <FIELD name="Counts" ucd="NUMBER" datatype="int" arraysize="2x3x*"/>
      <DATA>
        <TABLEDATA>
          <TR>
            <TD>Procyon</TD><TD>114.827</TD><TD>5.227</TD>
            <TD>4 5 3 4 3 2 1 2 3 3 5 6</TD>
          </TR>
          <TR>
            <TD>Vega</TD><TD>279.234</TD><TD>38.782</TD>
            <TD>8 7 8 6 8 6</TD>
          </TR>
        </TABLEDATA>
      </DATA>
    </TABLE>
  </RESOURCE>
</VOTABLE>
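To show how the FIELD metadata drives interpretation of the TD cells, here is a minimal reading sketch in plain Python. It assumes a trimmed version of the sample (no DOCTYPE, no array column) and converts only the `float` datatype; it is not a full VOTable reader.

```python
import xml.etree.ElementTree as ET

# Trimmed version of the sample table, kept small for illustration.
VOTABLE = """<VOTABLE version="1.0">
  <RESOURCE>
    <TABLE name="Stars">
      <FIELD name="Star-Name" ucd="ID_MAIN" datatype="char" arraysize="10"/>
      <FIELD name="RA" ucd="POS_EQ_RA" unit="deg" datatype="float"/>
      <FIELD name="Dec" ucd="POS_EQ_DEC" unit="deg" datatype="float"/>
      <DATA><TABLEDATA>
        <TR><TD>Procyon</TD><TD>114.827</TD><TD>5.227</TD></TR>
        <TR><TD>Vega</TD><TD>279.234</TD><TD>38.782</TD></TR>
      </TABLEDATA></DATA>
    </TABLE>
  </RESOURCE>
</VOTABLE>"""

table = ET.fromstring(VOTABLE).find("RESOURCE/TABLE")
fields = table.findall("FIELD")          # metadata: one FIELD per column
names = [f.get("name") for f in fields]

rows = []
for tr in table.iterfind("DATA/TABLEDATA/TR"):
    row = {}
    # The i-th TD of every row is described by the i-th FIELD.
    for field, td in zip(fields, tr.findall("TD")):
        text = td.text
        row[field.get("name")] = float(text) if field.get("datatype") == "float" else text
    rows.append(row)

print(rows[1])
```

The column order is the only link between a TD and its FIELD, which is why all rows must share the same format.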
16Table Cell
Follows FITS binary table; does NOT follow XML Schema.
Primitives: boolean, bit, unsignedByte, short, int, long, char, unicodeChar, float, double, floatComplex, doubleComplex
A cell holds Primitives as: scalar, arrays, variable length arrays, etc.
17VOTable is Flexy
- e.g. Table of images
- UCD="meta.code.mime;image.jpeg" datatype="unsignedByte" arraysize="*"
- e.g. Table of URL links
- UCD="meta.ref.url" datatype="char" arraysize="*"
18VOTable Schema (xsd)
19Table Data Model
- Metadata
- Class definition for Row
- FIELD
- data type
- semantic type
- Data
- Each Row is a list of Cells
- Each Cell is an array of Primitives
- may be variable length
20Table Data Layout
- All metadata first
- small, complex, XML
- Class definition for table record
- params, description, etc etc
- Then data
- (may be) large, remote
- XML, binary, FITS
- Instantiations of table record
- All records MUST have same format
- binary data allows streaming, parallelism
21Param Data Model
- Param is Table with one cell
- Like a FIELD value
- But with a value attribute
22Primitives
- All have fixed binary length
- Same as FITS primitives
- Except Unicode
23Multidimensional Array Cell
- A table cell can have lots of Primitives
- Example: WCS parameters are arrays
- <FIELD name="CRVAL" datatype="double" arraysize="2"/>
- Example: up to 10 images, each 64x64
- <FIELD name="thumbs" datatype="unsignedByte" arraysize="64x64x10"/>
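The arraysize attribute can be turned into a shape with a few lines of code. The sketch below assumes the VOTable convention that dimensions are separated by "x" and that a trailing "*" on the last dimension marks it as variable length; the function names `parse_arraysize` and `cell_count` are hypothetical.

```python
def parse_arraysize(arraysize):
    """Return (shape, variable) for an arraysize string such as "64x64x10".

    A dimension given only as "*" is unbounded and is reported as None.
    """
    dims = arraysize.split("x")
    variable = dims[-1].endswith("*")
    if variable:
        dims[-1] = dims[-1][:-1]          # drop the "*" marker
    shape = tuple(int(d) if d else None for d in dims)
    return shape, variable


def cell_count(shape):
    """Number of Primitives in one cell, or None if a dimension is unbounded."""
    count = 1
    for dim in shape:
        if dim is None:
            return None
        count *= dim
    return count


thumbs_shape, thumbs_variable = parse_arraysize("64x64x10")
print(thumbs_shape, cell_count(thumbs_shape))
print(parse_arraysize("2"))
print(parse_arraysize("2x3x*"))
```

For the thumbs field this gives 64 * 64 * 10 = 40960 unsigned bytes per cell, which is what a binary reader would need to know before streaming rows.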
24Hierarchy
- A VOTable contains RESOURCES
- RESOURCE can contain
- TABLE
- RESOURCE
- etc etc
- Usage example
- Many observations in the file,
- each is a RESOURCE
- Each observation is
- Parameters
- Calibration table
- Raw data table
25Hierarchy
<TABLE name="Nutation and Aberration">
  <GROUP name="Nutation">
    <FIELD name="Longitude"/>
    <FIELD name="Obliquity"/>
  </GROUP>
  <GROUP name="Aberration">
    <GROUP name="Equinox 1950.0">
      <FIELD name="C"/>
      <FIELD name="D"/>
    </GROUP>
    <GROUP name="Equinox 1955.0">
      <FIELD name="C"/>
      <FIELD name="D"/>
    </GROUP>
  </GROUP>
</TABLE>
26Astronomical Data
- Image
- Standard file format: FITS
- Standardized c. 1980
- Keyword-value dictionary plus binary block
- Catalog
- Derived from image
- Connected set of bright pixels
- Table of stars
- Standard format: VOTable
- Standardized 2002
- XML with remote binary
- Spectrum
27XSLT Example
<VOTABLE version="1.0">
  <DESCRIPTION>Output from the messier catalog at VirtualSky.org</DESCRIPTION>
  <RESOURCE type="results">
    <PARAM ID="RA" datatype="E" value="200.0"/>
    <PARAM ID="DE" datatype="E" value="40.0"/>
    <PARAM ID="SR" datatype="E" value="30.0"/>
    <PARAM ID="PositionalError" datatype="E" value="0.1"/>
    <PARAM ID="Credit" datatype="A" arraysize="*"
           value="Charles Messier, Richard Gelderman"/>
    <TABLE>
      <DESCRIPTION>Output from messier Catalog Server</DESCRIPTION>
      <FIELD ID="I" name="Messier Number" datatype="char" arraysize="*" ucd="ID_MAIN">
        <DESCRIPTION>Messier Number</DESCRIPTION>
      </FIELD>
      <FIELD ID="RA" name="Right Ascension" datatype="float" unit="degrees"
             ucd="POS_EQ_RA_MAIN">
        <DESCRIPTION>Right Ascension J2000</DESCRIPTION>
      </FIELD>
      ....
      <DATA>
        <TABLEDATA>
          <TR>
            <TD>3</TD>
            <TD>205.5</TD>
            <TD>28.402</TD>
            <TD/>
            <TD>16.2'</TD>
            <TD>6.4004</TD>
            <TD>Globular Cluster</TD>
            <TD>Canes Venatici</TD>
            <TD>M3 is one of more heavily studied globular clusters due to its position in the galaxy, putting it far from interstellar absorption. More than 200 variable stars have been observed out of a total of near 50,000. Being one of the brightest clusters, M3 is</TD>
          </TR>
28XSLT Result
This table is the result of a cone search.
29XSLT Program
<h2>Data</h2>
<table border="1">
  <xsl:for-each select="FIELD">
    <td><b><xsl:value-of select="@name"/></b></td>
  </xsl:for-each>
  <xsl:for-each select="DATA">
    <xsl:for-each select="TABLEDATA">
      <xsl:for-each select="TR">
        <tr>
          <xsl:for-each select="TD">
            <td width="100"><xsl:value-of select="."/></td>
          </xsl:for-each>
        </tr>
      </xsl:for-each>
    </xsl:for-each>
  </xsl:for-each>
</table>
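For readers without an XSLT processor at hand, the same transformation can be approximated in plain Python. This is not XSLT: it is a hand-written equivalent of the template above, with a hypothetical function name `table_to_html` and a two-column sample table made up for illustration.

```python
import xml.etree.ElementTree as ET
from html import escape

# Two-column excerpt in the style of the Messier table, for illustration only.
TABLE = """<TABLE>
  <FIELD ID="I" name="Messier Number" datatype="char"/>
  <FIELD ID="RA" name="Right Ascension" datatype="float"/>
  <DATA><TABLEDATA>
    <TR><TD>3</TD><TD>205.5</TD></TR>
  </TABLEDATA></DATA>
</TABLE>"""


def table_to_html(table):
    """Render a TABLE element as HTML, mirroring the for-each loops of the template."""
    # Header: one bold cell per FIELD, taken from its name attribute.
    header = "".join("<td><b>%s</b></td>" % escape(f.get("name", ""))
                     for f in table.findall("FIELD"))
    lines = ["<h2>Data</h2>", '<table border="1">', header]
    # Body: one <tr> per TR, one <td> per TD.
    for tr in table.iterfind("DATA/TABLEDATA/TR"):
        cells = "".join('<td width="100">%s</td>' % escape(td.text or "")
                        for td in tr.findall("TD"))
        lines.append("<tr>" + cells + "</tr>")
    lines.append("</table>")
    return "\n".join(lines)


page = table_to_html(ET.fromstring(TABLE))
print(page)
```

The XSLT version has the advantage of being declarative and runnable in a browser; the Python version is easier to extend with per-datatype formatting.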
30Binding to make a Parser
From the Schema, an API and library are generated: JAXB, Breeze, Castor.
This is JAVOT (Caltech):
for (int i = 0; i < table.getFieldCount(); i++) {
    Field field = (Field) table.getFieldAt(i);
    String u = field.getUcd();
    // Report the column whose UCD marks it as the main right ascension.
    if (u != null && u.equals("POS_EQ_RA_MAIN")) {
        System.out.println("Field " + i + " is for RA");
    }
}
31Unified Content Descriptor
- UCD is a semantic type
- phot.mag;em.opt.B: Integrated total blue magnitude
- src.orbital.eccentricity: Orbital eccentricity
- stat.median: Statistics Median Value
- Base Specifiers
- e.g. error in default right ascension
- stat.error;pos.eq.ra;meta.main
- First word is "type"
- "what kind of thing is this?"
- How do we add a stat.error to another?
32Unified Content Descriptor
- UCD has services
- Natural Language Description
- Find best UCD
- Search in NLD
- Matching functions
- if I want pos.eq.ra, is stat.error;pos.eq.ra correct?
- What about Ontology???
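One possible reading of the matching question can be sketched in code. The assumptions here are that UCD words are separated by ";" and that the first word states what kind of quantity the column holds; the function names `describes` and `mentions` are hypothetical, and this is one candidate matching rule, not a UCD service API.

```python
def ucd_words(ucd):
    """Split a composed UCD into lower-case words (assumes ';' as separator)."""
    return [w.strip().lower() for w in ucd.split(";") if w.strip()]


def describes(ucd, wanted):
    """True only when `wanted` is the first word, i.e. the "type" of the column."""
    words = ucd_words(ucd)
    return bool(words) and words[0] == wanted.lower()


def mentions(ucd, wanted):
    """True when `wanted` appears anywhere, including as a qualifier."""
    return wanted.lower() in ucd_words(ucd)


# A column of RA errors mentions pos.eq.ra but is not itself a right ascension.
print(describes("stat.error;pos.eq.ra", "pos.eq.ra"))
print(mentions("stat.error;pos.eq.ra", "pos.eq.ra"))
print(describes("pos.eq.ra;meta.main", "pos.eq.ra"))
```

Under this rule a search for right ascension columns would skip stat.error;pos.eq.ra, while a search for anything related to right ascension would keep it.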
33Some UCD
S  stat                     Statistical parameters
Q  stat.Fourier             Fourier coefficient
Q  stat.Fourier.amplitude   Amplitude Fourier coefficient
P  stat.covariance          Covariance between two parameters
P  stat.error               Statistical error
P  stat.error.sys           Systematic error
Q  stat.fit                 Fit
Q  stat.fit.chi2            Chi2
Q  stat.fit.dof             Degrees of freedom
Q  stat.fit.goodness        Goodness or significance of fit
Q  stat.fit.omc             Observed minus computed
Q  stat.fit.param           Parameter of fit
Q  stat.fit.residual        Residual fit
Q  stat.likelihood          Likelihood
S  stat.max                 Maximum or upper limit
S  stat.mean                Mean, average value
S  stat.median              Median value
S  stat.min                 Minimum or lowest limit
34Some UCD
S  phot                     Photometry
Q  phot.calib               Photometric calibration
Q  phot.color               Color index or magnitude difference
Q  phot.color.Cous          Color index in Cousins system
Q  phot.color.Gen           Color index in Geneva system
Q  phot.color.Gunn          Color index in Gunn system
Q  phot.color.JHN           Color index in Johnson 65 system
S  meta                     Metadata
P  meta.bib                 Bibliographic reference
P  meta.bib.author          Author name
P  meta.bib.bibcode         Bibcode
P  meta.bib.ivo             IVOA identifier ivo://
P  meta.bib.fig             Figure in a paper
P  meta.bib.journal         Journal name
P  meta.bib.page            Page number
P  meta.bib.volume          Volume number
P  meta.code                Code or flag
P  meta.code.class          Classification code
35Cone Search
- First VO standard service
- Input: RA, DEC, SR must be present
- decimal degrees, J2000
- Output: VOTable of sky-located data records
- must have columns with UCDs POS_EQ_RA_MAIN, POS_EQ_DEC_MAIN, ID_MAIN
[Figure: a Request with RA=300, DEC=25, SR=0.1 and the Response it produces.]
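A request like the one in the figure can be assembled as a URL query string. The sketch below only builds the URL and does not contact any service; the base URL `http://example.org/cone`, the function name `cone_search_url`, and the range checks are assumptions made for illustration.

```python
from urllib.parse import urlencode


def cone_search_url(base_url, ra, dec, sr):
    """Append the three required parameters (decimal degrees, J2000) to base_url."""
    # Sanity checks chosen for this sketch, not quoted from the standard.
    if not 0.0 <= ra <= 360.0:
        raise ValueError("RA must be in [0, 360] degrees")
    if not -90.0 <= dec <= 90.0:
        raise ValueError("DEC must be in [-90, 90] degrees")
    if sr < 0.0:
        raise ValueError("SR must be non-negative")

    # Pick the right joiner depending on whether base_url already has a query.
    if base_url.endswith(("?", "&")):
        sep = ""
    elif "?" in base_url:
        sep = "&"
    else:
        sep = "?"
    return base_url + sep + urlencode({"RA": ra, "DEC": dec, "SR": sr})


# Hypothetical service address; a registry would supply the real base URL.
url = cone_search_url("http://example.org/cone", 300, 25, 0.1)
print(url)
```

The response would then be read as a VOTable, looking up the position and identifier columns by their UCDs rather than by column name.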
36Cone Searches in a VO Registry
37Result of Cone Search
[Figure: result table with columns RA, Dec, ID.]
38Cone Search Density Probe
Federation of Multiple Services
[Figure: the Density Probe takes a baseURL, a spacing, and a search radius, and calls a Cone Search service.]
Interoperating NVO-compliant services!