Cosmological simulations in a relational database: modelling and storing merger trees PowerPoint PPT Presentation

presentation player overlay
1 / 1
About This Presentation
Transcript and Presenter's Notes

Title: Cosmological simulations in a relational database: modelling and storing merger trees


1
Cosmological simulations in a relational
database modelling and storing merger trees
  • Gerard Lemson, GAVO, Max-Planck-Institut für
    extraterrestrische Physik, Garching, Germany
  • Volker Springel, Max-Planck-Institut für
    Astrophysik, Garching, Germany

AbstractWe present a method for storing
tree-like data structures in a relational
database that allows for fast querying of
children and parents of any node and from and
down to any level. We have used this method in
storing halo merger trees derived from a large
cosmological N-body simulation and the merger
trees of model galaxy catalogues derived from the
halo catalogues using semi-analytical methods. We
give SQL queries corresponding to typical science
questions that can be asked from such a database
and present an online query interface available
through the web portal of the German
Astrophysical Virtual Observatory.
Background and goals This work was done in the
context of the German Astrophysical Virtual
Observatory (GAVO). GAVO pays special attention
to the introduction of theory data (simulations)
into the Virtual Observatory (VO). To test our
ideas we have created various prototype
implementations. Our main goal for the project
presented here was to investigate the use of
relational database technology in the analysis of
results of large scale structure simulations, as
well as in their online publication. The former
may lead to direct scientific benefits to the
owners of the data, the latter leads to benefits
to the larger community that gets access to the
data in a well defined and standardized manner.
  • Simulation and science questions
  • The simulation that was used in this prototype is
    a relatively small, dark matter,
  • cosmological N-body simulation, that was created
    as preparation for the Millennium
  • simulation 1,2. For this project we were
    interested in post-processing products
  • of this simulation density fields, halo
    catalogues including halo merger trees and
  • mock galaxy catalogues. The latter were produced
    using semi-analytical galaxy
  • formation (SAGF) routines that use the merger
    trees as input (see 3,4 for
  • descriptions of the SAGF algorithms).
  • The database was designed to answer a number of
    science questions, similar to the
  • Approach in 5. We polled astrophysicists
    associated to the simulation project, which
  • resulted in the following list which is a subset
    of these questions
  • Return the complete halo merger tree for a halo
    identified at z0
  • Find positions and velocities for all galaxies at
    redshift zero with B-luminosity, colour and
    bulge-to-disk ratio within given intervals.
  • Return B-band luminosity function of galaxies
    residing in halos of mass between 1013 and 1014
    solar masses.
  • Return the formation time of halos, defined as
    the maximum time at which it still has a
    progenitor of greater than half its mass, as
    function of the matter density in its
    environment, defined by the matter density
    smoothed on scale of 10Mpc (inspired by 6,7).

Database Implementation The design of the
information system started with the construction
of an analysis model, shown in Fig. 3a. It
contains the important concepts and their
relationships of the domain under investigation
(see 8 and references therein). Important for
this project are simulator (the code), simulation
(the running of the simulator with particular
input parameters) and its snapshots. The actual
data stored in the database are the results of
post-processing cluster extraction and galaxy
formation. In our model all of these specialize a
common pattern that in 8 is identified with the
basic concepts protocol, experiment and result.
They are especially important for describing the
provenance of the data. The physical database
model is restricted to the data part of the
conceptual model. It is more constrained in that
it must fit the data in a relational model, that
it must enable translation of the science
questions into (relatively easy) SQL and moreover
that it do so efficiently. The science questions
deal with relations between different types of
objects, between object and environment and,
especially, with the formation history of
objects. The history is embodied in the merger
trees of both halos and galaxies. One can store
trees using a single link from progenitor to
descendant, but this requires recursion to
retrieve a complete progenitor tree. This is not
a standard feature of all relational databases
and a more efficient solution is desirable even
where it is supported. Fig. 2 illustrates our
solution. Each object gets an identifier
corresponding to its order in a depth first sort
of the trees rooted in objects at the final
snapshot. Each object furthermore gets a pointer
(foreign key) to the last progenitor in the
ordering of the sub-tree rooted in that object.
The complete progenitor tree rooted in a given
object (at any snapshot !) is now precisely the
set of objects whose identifier has value between
the root objects id and the id of the last
progenitor. In SQL the relevant query is as
follows select prog. from halo des,
halo prog where des.haloId 5000063000000
-- example value and prog.haloId between
des.haloId and des.lastProgId This is the query
corresponding to science question 1 above. In the
database the tables are clustered (ordered)
according to the id columns, which ensures that
merger trees are sequentially stored on the
disks, speeding up the retrieval. One other
feature of the data model is the spatial indexing
based on the Peano-Hilbert space filling curve.
The Millennium simulations files are organized
around this index (see 9), which is a higher
dimensional equivalent to the recursive HTM 10
or HEALPix 11 indexes on the sky. In the
database it will likewise allow efficient spatial
searches, though for now it is used to link the
objects and the density field.
Webportal and example queries The database is
accessible online from a special purpose web
application accessible through the GAVO portal
(http//www.g-vo.org), which follows design ideas
from the SkyServer 12 and GalICS 13 web
applications. The user can type in free-form SQL
queries and retrieve the result in a variety of
formats HTML, CSV, VOTable (Fig 4b). A
particular feature is the ability to visualise
the results directly via a VOPlot 14 applet
(see Fig 4c). A number of example queries are
available. In Fig 5. we show the queries
corresponding to the other three science
questions above. DEMO This GAVO web application
is being demonstrated at this conference.
Fig 4 Snapshots of the GAVO portal web pages
providing access to the simulation database. (a)
shows the query page, with demo queries and links
to the schema and documentation. (b) shows the
result of the query in (a) in VOTable format. (c)
show the same result plotted with VOPlot. The
query implements science question number 1 and
the plot shows the evolution of the merger tree
below a given halo at redshift 0 by plotting the
X-position vs the snapshot number. This gives a
very nice illustration of the orbits of the halos
and their merging behaviour.
References 1 Springel V., White S.D.M., et al,
2005, Nature, 435, 629 2
http//www.mpa-garching.mpg.de/galform/virgo/mille
nnium/index.shtml 3 Croton D.J., Springel V.,
et al, 2005, MNRAS, submitted (astro-ph/0508046)
4 de Lucia G., Kauffmann G. White S.D.M,
MNRAS. 349 (2004) 1101 5 Gray J., Szalay A.,
et al, 2002. http//arxiv.org/abs/cs.DB/0202014 6
Gao L., Springel V., White S.D.M, 2005, MNRAS,
in press (astro-ph/0506510) 7 Lemson G.,
Kauffmann G, 1999, MNRAS 302, 111
8 Lemson, G., Dowler, P., Banday, A.J. 2003, in
ASP Conf. Ser., Vol. 314 Astronomical Data
Analysis Software and Systems XIII, eds. F.
Ochsenbein, M. Allen, D. Egret (San Francisco
ASP), 472 9 Springel, V., 2005, MNRAS,
submitted (astro-ph/0505010) 10
http//skyserver.org/HTM/ 11 http//healpix.jpl.
nasa.gov/ 12 http//cas.sdss.org/dr4/en/tools/se
arch/sql.asp 13 http//galics.cosmologie.fr/main
_frames.php?dirdatabase 14 http//vo.iucaa.erne
t.in/voi/voplot.htm
Write a Comment
User Comments (0)
About PowerShow.com