SDSS Data Release 6 Access to DR6 and SEGUE Catalog Data - PowerPoint PPT Presentation

About This Presentation
Title:

SDSS Data Release 6 Access to DR6 and SEGUE Catalog Data

Description:

Title: PowerPoint Presentation Author: Krista Wildt Last modified by: thakar Created Date: 8/8/2003 5:14:50 PM Document presentation format: On-screen Show – PowerPoint PPT presentation

Slides: 29
Provided by: Krista45
Learn more at: http://www.sdss.jhu.edu
Tags: sdss | segue | access | catalog | data | dr6 | joins | release


Transcript and Presenter's Notes


1
SDSS Data Release 6: Access to DR6 and SEGUE
Catalog Data
  • Ani Thakar
  • Alex Szalay, Maria Nieto-Santisteban, Nolan Li,
  • Wil O'Mullane, Adrian Pope, Tamas Budavari,
    George Fekete,
  • Jordan Raddick, Sam Carliles
  • JHU
  • Brian Yanny, Svetlana Lebedeva
  • FermiLab
  • Jim Gray
  • Microsoft Research

2
Outline
  • SDSS and Data Overview
  • SDSS-II and DR6
  • CAS Data Access
  • SkyServer, ImgCutout, CasJobs
  • Help resources and sample queries
  • Restricted collab access
  • SDSS and other datasets
  • VO services
  • EPO content

3
SDSS
  • Digital map in 5 spectral bands covering ¼ of the
    sky
  • 40 TB of raw pixel data
  • Photometric catalog with more than 200 million
    objects
  • Spectra of 1 million objects
  • Data Release 5 (DR5) is the last public release:
    240 M images, 740k spectra

Apache Point Observatory, NM
  • JHU contributions
  • Multi-Fiber Spectrograph
  • 20" Photometric Telescope
  • Catalog Archive Server DBMS
  • All data is served from FermiLab (master archive
    site)
  • SDSS-II is the continuation of SDSS through 2008

4
SDSS Data Overview
  • Data Archive Server (DAS)
  • FITS files (raw data)
  • Images, spectra, corrected frames, atlas images,
    binned images, masks
  • Online form-based access
  • Rsync and wget file retrieval

  • Catalog Archive Server (CAS)
  • Science parameters extracted to catalogs
  • Stuffed into relational DBMS (SQL Server)
  • Heavily indexed, optimized
  • Online access via SkyServer
  • Several levels of access, query tools

[Diagram labels: SDSS Data Release; cas.sdss.org, skyserver.sdss.org,
www.sdss.org, das.sdss.org/DRx-cgi-bin/DAS]

5
SDSS Data Releases
[Figure: timeline of data releases from Jan 2001 to Jan 2007, each
release labeled with a number and its CAS size: EDR (5, 200 GB),
DR1 (20, 1 TB), DR2 (35, 2 TB), DR3 (52, 3 TB), DR4 (66, 3.8 TB),
DR5 (80, 4.5 TB), (DR6)]

Release  Rel. Date  CAS Size  Images  Spectra  Sq Deg  CAS Mirrors
(DR6)    2/09/07    5 TB      300 M   885k     8520    ---
DR5      6/28/06    4.5 TB    240 M   740k     8000    JHU, Portsmouth, STScI, Moscow, UIC
DR4      6/29/05    3.8 TB    180 M   608k     6670    JHU, India, UIC
DR3      9/27/04    3 TB      141 M   478k     5200    JHU, India, Portsmouth, UIC; worldwide distribution
DR2      4/15/04    2 TB      88 M    330k     3324    JHU, UPitt, SDSC, Germany
DR1      6/15/03    1 TB      53 M    186k     2099    JHU, SDSC, CDAC, UPitt, UK, Germany, Japan, India
EDR      6/06/01    200 GB    14 M    54k      462     JHU, SDSC, UK (ROE), Japan

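
As a rough illustration of how the archive grew, the per-release increments can be worked out directly from the release figures above. This is only a sketch: it re-uses the spectra and sky-coverage numbers as listed, and the names `RELEASES` and `increments` are illustrative.

```python
# Figures copied from the release listing above:
# (release, spectra in thousands, sky coverage in square degrees).
RELEASES = [
    ("EDR", 54, 462),
    ("DR1", 186, 2099),
    ("DR2", 330, 3324),
    ("DR3", 478, 5200),
    ("DR4", 608, 6670),
    ("DR5", 740, 8000),
    ("DR6", 885, 8520),
]


def increments(releases):
    """Return (release, added spectra in thousands, added sq. deg.)
    for each release relative to the one before it."""
    out = []
    for prev, cur in zip(releases, releases[1:]):
        out.append((cur[0], cur[1] - prev[1], cur[2] - prev[2]))
    return out


if __name__ == "__main__":
    for name, d_spec, d_area in increments(RELEASES):
        print(f"{name}: +{d_spec}k spectra, +{d_area} sq. deg.")
```
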
6
SDSS-II
  • Legacy
  • Continuation of SDSS-I (fill out 10k sq.deg.)
  • Completeness is same as for SDSS-I
  • Flux limits are the same
  • Target all galaxies with r_petro < 17.77, plus
    LRGs
  • SEGUE
  • Detailed 3-d map of the Galaxy
  • Spectra of 240,000 stars in the disk and spheroid
  • Age, composition and phase space distribution
  • CAS component cataloged in SegueDRx DB
  • Supernova Survey
  • Repeated scans of SDSS Southern Stripe over 3
    months/yr
  • Data not available in CAS yet, will be on DAS soon
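
The Legacy flux limit can be expressed as a simple catalog cut. The sketch below only builds the SQL text and does not submit it anywhere. It assumes a `Galaxy` view with `objID`, `ra`, `dec` and `petroMag_r` columns (worth confirming in the Schema Browser), and it is not the actual target selection, which applies more criteria than this single magnitude cut.

```python
R_PETRO_LIMIT = 17.77  # Legacy galaxy flux limit quoted on the slide


def legacy_galaxy_sql(limit=R_PETRO_LIMIT, top=10):
    """Build a query for a few galaxies brighter than the Petrosian r limit.

    Assumption: the `Galaxy` view and its column names are taken from the
    SkyServer schema; verify them before running the query.
    """
    return (
        f"SELECT TOP {int(top)} objID, ra, dec, petroMag_r\n"
        "FROM Galaxy\n"
        f"WHERE petroMag_r < {limit:.2f}"
    )


if __name__ == "__main__":
    print(legacy_galaxy_sql())
```
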

7
Publication Policy
  • Click on Collaboration link on www.sdss.org
  • Scroll to bottom
  • Click on SDSS Publication Procedures
  • Proprietary data
  • Announce project to SDSS Projects Page
  • Add SDSS credits/acknowledgements to papers
  • Reference SDSS Technical Papers
  • Post manuscript to SDSS Publications Page
  • External Collaborators and Participants
  • Post requests to sdss-coco and sdss-general
    mailing lists
  • Ask your local CoCo rep (he won't bite)

8
CAS Datasets
  • BestDRx
  • Latest, greatest calibration of the data
  • Photometric and spectroscopic objects
  • The default and most accessed (by far) dataset
  • TargDRx
  • The calibration from which spectroscopic targets
    were chosen
  • RUNS
  • All the runs (processings) other than Best,
    Target
  • SegueDRx
  • New with DR6 (SDSS-II)

9
CAS Data Model (Best DB)
10
DR6 CAS
  • BestDR6, TargDR6 and SegueDR6 databases
  • SegueDR6: SEGUE stripes
  • May be rolled into BestDRx in the future
  • SkyServer: http://cas.sdss.org/collabdr6
  • Also pwd access at http://cas.sdss.org/collabdr6pw
    for non-collab IPs
  • See message sdss-archive/2935 for uname/pwd
  • CasJobs
  • DR6 and DR6QA targets point to BestDR6 DB
  • TARGDR6, SEGUEDR6 targets
  • Need to have collab privilege set in user
    profile

11
CAS Data Access
  • SkyServer
  • Web browser-based synchronous access
  • Meant to support several levels of users
  • From casual to moderately advanced queries
  • From simple form-based to direct SQL queries
  • From cone (radial) search to crossid type
    searches
  • Visual tools to browse image and catalog data
  • API access, e.g. emacs interface, sqlcl
    (command-line)
  • Strict limits on execution time and output size
  • Fair use for everyone, robots/crawlers
    discouraged
  • ImgCutout
  • Finding Chart and JPEG image browser
  • Accessible from SkyServer (Visual Tools)
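
A cone (radial) search keeps the objects that lie within a given angular radius of a position. The sketch below shows only the geometric test, applied to a small in-memory list. SkyServer performs the search server-side, so this is not its implementation; the function names and sample positions are made up for illustration.

```python
import math


def angular_separation_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees (haversine form, which stays
    numerically stable for the small angles typical of cone searches)."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    sin_ddec = math.sin((dec2 - dec1) / 2.0)
    sin_dra = math.sin((ra2 - ra1) / 2.0)
    a = sin_ddec ** 2 + math.cos(dec1) * math.cos(dec2) * sin_dra ** 2
    return math.degrees(2.0 * math.asin(min(1.0, math.sqrt(a))))


def cone_search(objects, ra0, dec0, radius_arcmin):
    """Return the (name, ra, dec) tuples within radius_arcmin of (ra0, dec0).
    All coordinates are in decimal degrees."""
    radius_deg = radius_arcmin / 60.0
    return [o for o in objects
            if angular_separation_deg(ra0, dec0, o[1], o[2]) <= radius_deg]


if __name__ == "__main__":
    # Made-up positions, for illustration only.
    sample = [("a", 185.000, 0.000), ("b", 185.010, 0.005), ("c", 186.000, 1.000)]
    print(cone_search(sample, 185.0, 0.0, 1.0))
```
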

12
CasJobs
  • Link in SkyServer (http://cas.sdss.org/casjobs)
  • Batch Query Workbench, personal user DB (MyDB)
  • Quick mode: 1-minute cutoff
  • Submit mode: up to 8 hours in long queue
  • 24-hr queue for collab members
  • Preferred method for serious queries
  • MyDB database to save results of your queries
  • Define your own functions, procedures too
  • Share your tables with collaborators (groups)
  • Job history, plotting, FITS/CSV/VOTable output
  • Table Import (upload) for your own data
  • Groups to share your results with collaborators
  • Command-line access: Java tool also downloadable
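
The queue limits above suggest a simple rule of thumb for choosing how to run a query once you have an estimate of its run time. The sketch below encodes only the limits quoted on this slide. The function name and the returned labels are illustrative and not part of CasJobs.

```python
QUICK_LIMIT_S = 60                # Quick mode: 1-minute cutoff
LONG_QUEUE_LIMIT_S = 8 * 3600     # Submit mode: up to 8 hours in the long queue
COLLAB_QUEUE_LIMIT_S = 24 * 3600  # 24-hr queue for collaboration members


def pick_mode(estimated_seconds, collab=False):
    """Suggest a CasJobs mode for a query with the given estimated run time."""
    if estimated_seconds <= QUICK_LIMIT_S:
        return "quick"
    if estimated_seconds <= LONG_QUEUE_LIMIT_S:
        return "submit (long queue)"
    if collab and estimated_seconds <= COLLAB_QUEUE_LIMIT_S:
        return "submit (24-hr collab queue)"
    # Nothing on the slide covers this case; splitting the job is one option.
    return "too long: split the query"


if __name__ == "__main__":
    for seconds in (30, 3600, 12 * 3600):
        print(seconds, pick_mode(seconds), "|", pick_mode(seconds, collab=True))
```
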

13
Using CasJobs
  • Every query has a default target
  • The database that it will operate on
  • e.g., MyDB, DR4, DR5, CollabDR4, DR5QA, BESTRUNS
    etc.
  • Each target is hosted on a separate server
  • Provides load balancing and performance
  • Some quirks/restrictions due to distributed
    execution
  • Help page and FAQ explain these
  • Ability to do distributed joins between different
    datasets
  • e.g., between DR4 and DR5 or RUNS and DR5

14
Collab-only access
  • collabdrx SkyServer sites
  • IP-restricted access to collabdrx URL
  • Password access to collabdrxpw URL from other IPs
  • Larger query limits (e.g., 1 hour/500k rows)
  • collab privilege in CasJobs
  • Gives you access to restricted data, additional
    longer queues (e.g. DR5QA, DR6QA 24-hr queues)
  • If you have collab priv set, you will see these
    queues
  • If you don't have it, email sdss-helpdesk@fnal.gov

15
Data available only to Collab - RUNS
  • RUNS DB
  • SkyServer: http://cas.sdss.org/runs
  • Also http://cas.sdss.org/runspw for pwd access
  • CasJobs: BESTRUNS context
  • Mostly SEGUE (half) runs and stripe 82 (most)
  • DRx runs still being added over the next few months
  • Imaging only, no spectra
  • May be possible to link to BEST spectra with join
  • Use match tables to match up repeat observations
    in multiple runs

16
SkyServer Help Resources
  • Help menu option on top right of SkyServer
  • Start with Archive Intro
  • Next look at Query Limits and How To pages
  • Then Introduction to SQL and Sample Queries
  • Look at Optimizing Queries page (esp. bookmark
    bug)
  • Try out some of the sample queries
  • Cut and paste to SQL search page
    (Tools → Search → SQL)
  • Browse FAQ and Schema Browser
  • Data release and technical papers

17
CasJobs Help
  • Shares SQL Intro, Schema Browser with SkyServer
  • Has its own FAQ page
  • Lists differences between CasJobs and SkyServer
    due to distributed query execution
  • Advanced CasJobs Queries page
  • Neighbor searches with fixed and variable search
    radii
  • Cursors
  • Compound queries

18
Sample Queries
  • 50 sample queries from simple to complex
  • Available in SkyServer and CasJobs
  • Clean photometry meta-flags sample
  • INNER/OUTER JOIN samples
  • Sector/Region tables usage sample
  • Variability queries from Robert and Zeljko
  • CasJobs Advanced Queries Help page
  • Has examples of neighbor searches, cursors etc.
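
To make the INNER/OUTER JOIN distinction concrete, the sketch below builds the two variants of a photometry-to-spectra join as SQL text. It is not one of the sample queries. It assumes that `SpecObj.bestObjID` links to `PhotoPrimary.objID` and that `z` holds the redshift, so check those names in the Schema Browser before use.

```python
def photo_spec_join_sql(kind="INNER", top=10):
    """Build a join between photometric objects and their spectra.

    INNER keeps only photometric objects that have a matching spectrum.
    LEFT OUTER keeps every photometric object, with NULL for `z` where
    no spectrum exists.
    Assumption: table and column names follow the SkyServer schema.
    """
    kind = kind.upper()
    if kind not in ("INNER", "LEFT OUTER"):
        raise ValueError("kind must be 'INNER' or 'LEFT OUTER'")
    return (
        f"SELECT TOP {int(top)} p.objID, p.ra, p.dec, s.z\n"
        "FROM PhotoPrimary AS p\n"
        f"{kind} JOIN SpecObj AS s ON s.bestObjID = p.objID"
    )


if __name__ == "__main__":
    print(photo_spec_join_sql("INNER"))
    print()
    print(photo_spec_join_sql("LEFT OUTER"))
```
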

19
SkyServer General Tips
  • Use astro or collab sites
  • Less frills, more direct access to tools
  • More generous query limits (timeouts, row limits)
  • See Help → Query Limits page
  • Collab site is restricted access, largest query
    limits
  • Some extra features
  • e.g. Imaging/Spectro form query
  • Each release has separate sites
  • http://cas.sdss.org/collab/ (the public release)
  • http://cas.sdss.org/collabdr6/ (not yet public)
  • Use Contact link when emailing help-desk

20
Find the right tool for the job
  • Visual exploration: Tools → Visual Tools →
  • Browse objects one at a time: Explore page
  • Shows all parameters for object, also its image
    and spectrum
  • Browse and find objects on a frame: Finding Chart
  • Navigate image frames: Navigate
  • View multiple objects with query: Image List
  • Browse images: Tools → Get Images →
  • Frames: Fields browser
  • Spectroscopic plates: Plates browser
  • View individual spectra: Spectra browser

21
Finding the right tool (contd.)
  • SQL search: Tools → Search →
  • Cone (radial) search: Radial search form
  • Region (rectangle) search: Rectangular search
    form
  • Imaging form query: Imaging Query form
  • Spectroscopic form query: Spectro Query form
  • All other searches: SQL search page
  • Cross-matching: Tools → Object Crossid →
  • Imaging crossid: Upload
  • Spectro crossid: SpecList
  • Advanced, unrestricted SQL queries: CasJobs
  • Your own personal DB
  • Retrieve results when you are ready

22
CAS Dos and Don'ts
  • Do not submit a query unless you have some idea
    how long it will take!
  • It could tie up the server for hours (sometimes
    days)!
  • Do a count query first if necessary
  • CasJobs also has a graphical query plan (Plan
    button)
  • Look at samples, query optimization pages
  • If not sure, use form queries at first
  • Use the predefined views for unique/primary
    objects
  • PhotoObj, PhotoPrimary for photometry
  • Consider using PhotoTag table if you only need
    popular fields
  • Makes better use of cache
  • SpecObj for spectra
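
One way to follow the "count query first" advice is to generate the count and the full query from the same table and WHERE clause, so the count reflects exactly what the full query would return. The sketch below builds both as SQL text. The `PhotoTag` column names and the meanings of `type = 3` (galaxy) and `mode = 1` (primary) are assumptions to verify in the Schema Browser.

```python
def select_sql(columns, table, where):
    """Full query: returns the requested columns for every matching row."""
    return f"SELECT {', '.join(columns)}\nFROM {table}\nWHERE {where}"


def count_sql(table, where):
    """Cheap preview: how many rows would the full query return?"""
    return f"SELECT COUNT(*) AS n\nFROM {table}\nWHERE {where}"


# Assumed schema details (not taken from the slides): PhotoTag carries the
# popular columns; mode = 1 selects primary objects; type = 3 selects galaxies.
TABLE = "PhotoTag"
WHERE = "mode = 1 AND type = 3 AND modelMag_r BETWEEN 17.0 AND 17.5"

if __name__ == "__main__":
    print(count_sql(TABLE, WHERE))  # run this one first
    print()
    print(select_sql(["objID", "ra", "dec", "modelMag_r"], TABLE, WHERE))
```

If the count comes back far larger than expected, tighten the WHERE clause or move the job to CasJobs before running the full query.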

23
Dos and Don'ts (contd.)
  • Use the Contact link to contact Help Desk
  • Fill the short form, which gives us necessary
    information
  • In CasJobs, press Contact after logging in
  • Automatically attaches your userid to the message
  • Will speed up response to your request
  • Do not contact Help Desk staff directly
  • Questions are answered by a pool of experts as
    available
  • More likely to get delayed or no response (unless
    you can bug them in person)
  • If you run out of MyDB space, ask for more!!
  • We're pretty liberal about giving more space, but
    you have to ask (to avoid empty/unused MyDBs
    taking up space)

24
SDSS and other datasets
  • GALEX
  • Has its own CasJobs page hosted by MAST
  • SDSS vs GALEX cross-matches
  • DR5 vs GR2 and DR4 vs GR2 available now
  • DR5 vs GR3 coming soon
  • Link table with IDs from both catalogs
  • SDSS parameters for GALEX matches also extracted
  • Some older datasets matched in BEST DB
  • FIRST, USNOB, ROSAT, USNOB proper motions
  • Open SkyQuery site for other datasets
  • Only small-area xmatches possible at the moment

25
Virtual Observatory
  • JHU is one of the main participants
  • SDSS is one of the drivers for NVO
  • Co-PI (Szalay) and Project Manager (Hanisch) here
  • JHU VO services
  • Open SkyQuery (http://openskyquery.net/)
  • VO services (http://voservices.org/)
  • Spectrum services
  • Filter profiles
  • Footprint services (new!)
  • VO registry (with STScI)
  • Standard VO services: Cone Search, SIAP
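
For readers new to the standard VO services, a Cone Search request is an HTTP GET that carries the position and radius in decimal degrees as the `RA`, `DEC` and `SR` parameters. The sketch below only builds such a URL. The base URL is a placeholder and not a real service endpoint; actual endpoints would come from a VO registry.

```python
from urllib.parse import urlencode


def cone_search_url(base_url, ra_deg, dec_deg, radius_deg):
    """Build a Cone Search request URL (RA, DEC, SR in decimal degrees)."""
    if not -90.0 <= dec_deg <= 90.0:
        raise ValueError("declination must be within [-90, 90] degrees")
    if radius_deg < 0:
        raise ValueError("search radius must be non-negative")
    query = urlencode({"RA": ra_deg, "DEC": dec_deg, "SR": radius_deg})
    # Append with '&' if the base URL already carries a query string.
    separator = "&" if "?" in base_url else "?"
    return f"{base_url}{separator}{query}"


if __name__ == "__main__":
    # Placeholder endpoint, for illustration only.
    print(cone_search_url("http://example.org/cone", 185.0, 0.5, 0.1))
```
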

26
Educational Resources
  • Extensive EPO content in SkyServer
  • Use Projects link on the menu at the top
  • K-12 and college level student exercises and
    teacher resources
  • Open SkyQuery / Cross-match EPO project
  • Jordan Raddick's talk on May 1

27
Coming Attractions (or not)
  • FITS cutout service
  • VO-India is developing a service
  • There are technical difficulties with mosaicing
    multiple SDSS frames
  • How to handle different S/N, PSFs across frames?
  • Non-SDSS datasets in CasJobs
  • Merging of CasJobs and Open SkyQuery features
  • Ability to do large-scale cross-matches with
    other datasets within CasJobs environment

28
Thanks!
  • http://www.sdss.org (main site and DAS)
  • http://cas.sdss.org/ (public CAS, redirected to
    latest public release)
  • http://cas.sdss.org/drX (public CAS, release X)
  • http://cas.sdss.org/astro (astronomers, latest
    pub release)
  • http://cas.sdss.org/astrodrX (astronomers,
    release X)
  • http://cas.sdss.org/collab (IP-restricted collab
    site, latest pub release)
  • http://cas.sdss.org/collabdrX (IP-restricted
    collab site, release X)
  • http://cas.sdss.org/collabpw (collab pwd access,
    latest pub release)
  • http://cas.sdss.org/collabdrXpw (collab pwd
    access, release X)
  • http://www.voservices.org/ (VO services @ JHU)
  • http://www.openskyquery.net/ (Open SkyQuery)
  • http://www.skyserver.org (support site)
  • Software downloads, mirror site resources, data
    download info
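
The CAS site URLs listed above follow a regular naming pattern: an audience prefix, an optional `drX` release suffix, and `pw` for password access. The sketch below composes them from those parts. The function name and arguments are illustrative, and the pattern is inferred only from the URLs on this slide.

```python
BASE = "http://cas.sdss.org/"


def cas_site_url(audience="public", release=None, password=False):
    """Compose a CAS site URL from the pattern on the slide.

    audience: 'public', 'astro' or 'collab'
    release:  release number X, or None for the latest public release
    password: True for the password-access variant (listed for collab only)
    """
    if audience not in ("public", "astro", "collab"):
        raise ValueError("audience must be 'public', 'astro' or 'collab'")
    if password and audience != "collab":
        raise ValueError("password access is only listed for collab sites")
    prefix = "" if audience == "public" else audience
    suffix = "" if release is None else f"dr{int(release)}"
    return BASE + prefix + suffix + ("pw" if password else "")


if __name__ == "__main__":
    print(cas_site_url())                            # latest public release
    print(cas_site_url("astro", 5))                  # astronomers, DR5
    print(cas_site_url("collab", 6, password=True))  # collab pwd access, DR6
```
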