Title: Carolina Environmental Program UNC Chapel Hill


1
The Analysis Engine: A New Tool for Model
Evaluation, Sensitivity and Uncertainty Analysis,
and More
  • Alison M. Eyth, Prashant P. Pai
  • Carolina Environmental Program
  • University of North Carolina at Chapel Hill
  • October 19, 2004

2
Background
  • Supports data analysis by creating plots and
    tables
  • Analysis Configurations facilitate repeated
    analyses
  • Developed as part of the Multimedia Integrated
    Modeling System (but can be used standalone)
  • Java application that runs on Windows, Linux,
  • Open source, available from
    http://sourceforge.net/projects/mimsfw
  • Three main components
      • Table application
      • Plotting engine
      • Statistics package

3
Table Application
  • Provides the top level user interface
  • File menu accesses import and export functions
  • Currently supported file formats include
    comma separated (.csv), custom and tab delimited,
    fixed column width, SMOKE Report, and ARFF
  • Data files are imported as rows and columns
  • Each file is shown in its own tab with file name,
    header, data table, and footer
  • Toolbar and popup menus provide access to
    functions (e.g. sort, filter, format, plot,
    statistics)
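
The import step for a comma-separated file can be pictured with a short Java sketch. This is not the Analysis Engine's own import code: the class and method names are hypothetical, the first row is assumed to hold column names, and quoted fields containing commas are deliberately not handled.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: names are illustrative, not the Analysis Engine's API.
class CsvImportSketch {

    // Splits comma-separated text into rows of trimmed cells.
    // Assumption: no quoted fields, so a plain split on "," is enough.
    static List<String[]> parse(String text) {
        List<String[]> rows = new ArrayList<>();
        for (String line : text.split("\\r?\\n")) {
            if (line.trim().isEmpty()) {
                continue; // skip blank lines
            }
            String[] cells = line.split(",", -1); // -1 keeps trailing empty cells
            for (int i = 0; i < cells.length; i++) {
                cells[i] = cells[i].trim();
            }
            rows.add(cells);
        }
        return rows;
    }

    public static void main(String[] args) {
        List<String[]> table = parse("State,NOx\nNC,650\nVA,800\n");
        for (String[] row : table) {
            System.out.println(Arrays.toString(row));
        }
    }
}
```

A production importer would need a real CSV parser to cope with quoting and embedded delimiters.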

4
Table Application GUI
5
Toolbar and Pop-up Menu Functions
  • Multi-column sort
  • Show rows with Top N values
  • Show rows with Bottom N values
  • Filter rows based on criteria (e.g. NOx > 500)
  • Show / hide columns
  • Format columns (e.g. number style, color, width)
  • Create plots
  • Compute statistics
  • Edit analysis configuration
  • Reset
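
The Top N and Bottom N functions can be thought of as a sort on one column followed by keeping the first N rows. The Java sketch below illustrates that idea only. The class, method names, and the use of `double[]` rows are assumptions, not the tool's actual implementation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch: names are illustrative, not the Analysis Engine's API.
class TopNSketch {

    // Returns the n rows with the largest values in the given column.
    // The input list is copied, so the caller's row order is unchanged.
    static List<double[]> topN(List<double[]> rows, int column, int n) {
        List<double[]> sorted = new ArrayList<>(rows);
        sorted.sort(Comparator.comparingDouble((double[] r) -> r[column]).reversed());
        return firstN(sorted, n);
    }

    // Returns the n rows with the smallest values in the given column.
    static List<double[]> bottomN(List<double[]> rows, int column, int n) {
        List<double[]> sorted = new ArrayList<>(rows);
        sorted.sort(Comparator.comparingDouble((double[] r) -> r[column]));
        return firstN(sorted, n);
    }

    // Keeps at most n leading rows; a negative n yields an empty list.
    private static List<double[]> firstN(List<double[]> sorted, int n) {
        int count = Math.max(0, Math.min(n, sorted.size()));
        return new ArrayList<>(sorted.subList(0, count));
    }

    public static void main(String[] args) {
        // Columns: station id, NOx value (made-up numbers for illustration).
        List<double[]> rows = Arrays.asList(
                new double[]{1, 300}, new double[]{2, 900}, new double[]{3, 500});
        for (double[] row : topN(rows, 1, 2)) {
            System.out.println(Arrays.toString(row));
        }
    }
}
```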

6
Filter Rows Dialog
  • Use Filter Rows to limit the rows shown in the
    table
  • Any number of criteria can be added
  • Each criterion has a column, operation, and value
  • Available operations are <, <=, >, >=, not =,
    starts with, contains, ends with, does not start
    with, does not contain, ...
  • Select between showing rows matching ALL criteria
    or ANY
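
The column / operation / value structure of a criterion, and the ALL versus ANY choice, can be sketched in Java as below. This is an illustration under stated assumptions rather than the tool's code: rows are modeled as string maps, the comparison operators are written as `<`, `<=`, `>`, `>=`, only a subset of the listed operations is covered, and every class and method name is hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// Hypothetical sketch: names are illustrative, not the Analysis Engine's API.
class FilterRowsSketch {

    // Builds one criterion from a column name, an operation, and a value.
    // Numeric operations throw NumberFormatException on non-numeric cells.
    static Predicate<Map<String, String>> criterion(String column, String op, String value) {
        return row -> {
            String cell = row.get(column);
            if (cell == null) {
                return false; // a row without the column never matches
            }
            switch (op) {
                case "<":  return num(cell) <  num(value);
                case "<=": return num(cell) <= num(value);
                case ">":  return num(cell) >  num(value);
                case ">=": return num(cell) >= num(value);
                case "starts with":         return cell.startsWith(value);
                case "contains":            return cell.contains(value);
                case "ends with":           return cell.endsWith(value);
                case "does not start with": return !cell.startsWith(value);
                case "does not contain":    return !cell.contains(value);
                default:
                    throw new IllegalArgumentException("Unsupported operation: " + op);
            }
        };
    }

    // Keeps rows matching ALL criteria (matchAll = true) or ANY criterion.
    static List<Map<String, String>> filter(List<Map<String, String>> rows,
                                            List<Predicate<Map<String, String>>> criteria,
                                            boolean matchAll) {
        List<Map<String, String>> shown = new ArrayList<>();
        for (Map<String, String> row : rows) {
            boolean keep = matchAll
                    ? criteria.stream().allMatch(c -> c.test(row))
                    : criteria.stream().anyMatch(c -> c.test(row));
            if (keep) {
                shown.add(row);
            }
        }
        return shown;
    }

    private static double num(String s) {
        return Double.parseDouble(s.trim());
    }

    // Convenience helper: builds a row from alternating column names and values.
    static Map<String, String> row(String... keyValues) {
        Map<String, String> r = new LinkedHashMap<>();
        for (int i = 0; i + 1 < keyValues.length; i += 2) {
            r.put(keyValues[i], keyValues[i + 1]);
        }
        return r;
    }

    public static void main(String[] args) {
        List<Map<String, String>> rows = Arrays.asList(
                row("State", "NC", "NOx", "650"),
                row("State", "ND", "NOx", "120"),
                row("State", "VA", "NOx", "800"));
        List<Predicate<Map<String, String>>> criteria = Arrays.asList(
                criterion("NOx", ">", "500"),
                criterion("State", "starts with", "N"));
        System.out.println("ALL: " + filter(rows, criteria, true));
        System.out.println("ANY: " + filter(rows, criteria, false));
    }
}
```

Note that with an empty criteria list this sketch shows every row in ALL mode and no rows in ANY mode. How the actual dialog treats that case is not stated in the slides.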

7
Plotting Options Dialog
  • Choose Plot type from Bar, Box, CDF, Discrete
    Category, Histogram, Rank Order, XY, Line, Time
    Series, and Tornado
  • Select Data Columns to plot
  • Specify Units and one to three columns to use for
    labels
  • Selected data is passed to the plotting engine

8
Plot Properties are Specified using the Analysis
Engine GUI
9
Example Discrete Category Plot
Note: Plots are created using a custom Java
interface to R
10
Statistics Dialog
  • Provides interface to the statistics package
  • Specify statistics to compute and data columns to
    analyze
  • Additional details are specified on other tabs
  • Statistics outputs appear as new tabs in the
    table application
  • Statistics are computed using Colt and Weka

11
Example of Histogram Statistics
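
The figure for this slide is not reproduced in the transcript. As a rough illustration of what histogram statistics involve, the Java sketch below counts values into equal-width bins. It is a plain-Java stand-in, not the Colt or Weka routines the tool uses, and the class name, method name, and binning rule are assumptions.

```java
import java.util.Arrays;

// Hypothetical sketch: plain Java, not the Colt/Weka code used by the tool.
class HistogramSketch {

    // Counts values into `bins` equal-width bins spanning [min, max] of the data.
    // The maximum value is placed in the last bin.
    static int[] counts(double[] values, int bins) {
        if (values.length == 0 || bins < 1) {
            throw new IllegalArgumentException("need at least one value and one bin");
        }
        double min = values[0];
        double max = values[0];
        for (double v : values) {
            min = Math.min(min, v);
            max = Math.max(max, v);
        }
        double width = (max - min) / bins;
        int[] counts = new int[bins];
        for (double v : values) {
            // If all values are equal, width is 0 and everything goes in bin 0.
            int index = (width == 0) ? 0 : (int) ((v - min) / width);
            if (index >= bins) {
                index = bins - 1;
            }
            counts[index]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        double[] values = {1, 2, 2, 3, 5, 9}; // made-up numbers for illustration
        System.out.println(Arrays.toString(counts(values, 4)));
    }
}
```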
12
Analysis Configuration Dialog
  • The Analysis Configuration stores all the table
    settings and plots that you have created during
    your session
  • The selected plots can be viewed, edited or
    deleted
  • Plots can be given new names by double clicking
    the name
  • Some (or all) of the settings can be saved to a
    configuration file
  • Configuration files can be loaded in future
    sessions or for other data files in the current
    session

13
Automation
  • An optional command line interface may be used
    to specify
      • Data files to load
      • Analysis configuration file to use
      • Type of plots to create (e.g. JPG, PDF, PNG)
      • Output directory for plots and tables
  • This allows plots and tables to be created in an
    automated fashion
  • Standard analysis products may be created for
    newly available data sets

14
Examples of Potential Applications
  • Model Evaluation
      • Sort to find stations at which the error was the
        largest
      • Plot modeled and observed values on box plots,
        etc.
      • Create scatter plots of one species vs. another
  • Sensitivity and Uncertainty Analysis
      • Perform linear regression and show in plots and
        tables
      • Compute correlation coefficients
  • Emissions Modeling Quality Assurance
      • Find states with top 10 emission values
      • Create stacked bar charts to show total emissions
      • Compute histograms
  • General Data Analysis
      • Analyze data by sorting, filtering, and computing
        statistics
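
The linear regression and correlation coefficients mentioned under Sensitivity and Uncertainty Analysis can be illustrated with a small Java sketch. The presentation states that the tool relies on Weka for these. The code below instead uses the textbook least-squares and Pearson formulas in plain Java, and all names in it are hypothetical.

```java
// Hypothetical sketch: textbook formulas in plain Java, not the Weka-based
// implementation that the presentation says the tool uses.
class RegressionSketch {

    // Fits y = slope * x + intercept by ordinary least squares and returns
    // {slope, intercept, r}, where r is the Pearson correlation coefficient.
    static double[] fit(double[] x, double[] y) {
        if (x.length != y.length || x.length < 2) {
            throw new IllegalArgumentException("need two equal-length arrays with 2+ points");
        }
        int n = x.length;
        double meanX = 0;
        double meanY = 0;
        for (int i = 0; i < n; i++) {
            meanX += x[i];
            meanY += y[i];
        }
        meanX /= n;
        meanY /= n;

        double sxx = 0; // sum of squared deviations of x
        double syy = 0; // sum of squared deviations of y
        double sxy = 0; // sum of cross products
        for (int i = 0; i < n; i++) {
            double dx = x[i] - meanX;
            double dy = y[i] - meanY;
            sxx += dx * dx;
            syy += dy * dy;
            sxy += dx * dy;
        }
        if (sxx == 0 || syy == 0) {
            throw new IllegalArgumentException("x and y must each vary");
        }
        double slope = sxy / sxx;
        double intercept = meanY - slope * meanX;
        double r = sxy / Math.sqrt(sxx * syy);
        return new double[]{slope, intercept, r};
    }

    public static void main(String[] args) {
        // Made-up observed vs. modeled values for illustration.
        double[] observed = {1, 2, 3, 4};
        double[] modeled = {3, 5, 7, 9};
        double[] result = fit(observed, modeled);
        System.out.printf("slope=%.3f intercept=%.3f r=%.3f%n",
                result[0], result[1], result[2]);
    }
}
```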

15
Future Directions
  • Initial version will be released on SourceForge
    by 10/31/04 (which is the end date for the
    current funding for this work)
  • Many potential enhancements are listed on
    SourceForge, e.g.
      • Create new rows and columns using functions (e.g.
        difference, sum)
      • Create plots and tables with data from multiple
        tabs
  • Will likely be used as part of the new emissions
    quality assurance tool
    (http://sourceforge.net/projects/emisview)
  • Mr. Tommy Cathey will continue to develop the
    custom Java interface to R at the EPA Scientific
    Visualization Laboratory in FY05

16
References
  • MIMS SourceForge page (for downloads):
    http://sourceforge.net/projects/mimsfw
  • R (for plots): http://www.r-project.org
  • Colt (for basic statistics):
    http://www-itg.lbl.gov/hoschek/colt
  • Weka (for regression and correlation analysis):
    http://www.cs.waikato.ac.nz/ml/weka/
  • Carolina Environmental Program (for more
    information): http://www.cep.unc.edu
  • Primary Author: eyth@unc.edu