Title: Data Quality Assessment: What Is It, Why Use It, and What's in It for Me?
1. Data Quality Assessment: What Is It, Why Use It, and What's in It for Me?
- Presenters: Jill Lundell, Debbie Lacroix, Berta Oates
- Date: May 13, 2009
2. What Is a Data Quality Assessment (DQA)?
- The scientific and statistical evaluation of environmental data to determine if they meet the planning objectives of the project, and thus are of the right type, quality, and quantity to support their intended use.
3. What Does That Mean?
- Data Quality Assessment is performed after the data are collected
- Data Quality Assessment should answer two primary questions:
  - Are the numbers reliable?
  - What conclusions can be drawn from the data?
4. Are the Numbers Reliable?
- Data verification and data validation are performed to determine if the numbers are reliable
- Data Quality Assessment can also be used to determine the reliability of an analytical method (such as XRF) if that evaluation is built into the sampling design
5. What Conclusions Can Be Drawn from the Data?
- Review the Objectives and Sampling Design
- Conduct a Preliminary Data Review
- Select a Statistical Method
- Verify the Assumptions of the Statistical Method
- Draw Conclusions from the Data
6. Review the Project Objectives (Data Quality Objectives)
- It is essential to keep in mind the primary and secondary objectives of the project during the entire DQA process to ensure appropriate tests are used and applicable conclusions are drawn from the data
7. Conduct a Preliminary Data Review
- Review the validated data to determine completeness and reliability
- Construct graphs and summary statistics to get a feel for the structure of the data
- This step is essential to ensure the data user applies the appropriate statistical tests and methods to the data
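As a minimal sketch of the summary-statistics part of a preliminary review, the Python below computes a few descriptive statistics for one analyte. The concentration values and the function name `summarize` are hypothetical and made up for illustration; they are not data from the site discussed later in the presentation.

```python
import statistics

def summarize(values):
    """Return basic descriptive statistics for a list of concentrations."""
    # quantiles(n=4) returns the three quartile cut points: Q1, median, Q3
    q1, median, q3 = statistics.quantiles(values, n=4)
    return {
        "n": len(values),
        "min": min(values),
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": max(values),
        "mean": statistics.mean(values),
        "stdev": statistics.stdev(values),  # sample standard deviation
    }

# Hypothetical concentrations for one analyte (illustrative only)
concentrations = [1.2, 0.8, 1.5, 2.1, 0.9, 1.1, 7.4, 1.3]
summary = summarize(concentrations)
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

A mean that sits well above the median, as in this made-up data set, hints at right skew or a high value worth a closer look before a statistical test is chosen.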
8. Select a Statistical Test or Method
- This selection should be based on the objectives of the project
- Typically there are several statistical tests or methods that can be used to answer the questions of the study
- Results of the preliminary data review will aid the data user in determining which tests should be used
9. Verify the Assumptions of the Statistical Test or Method
- All statistical tests and methods have assumptions that must be met in order to obtain reliable and defensible results
- Determine the distribution of the data and the presence of outliers
- The preliminary data analysis should provide the information needed to determine if the assumptions are met
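One way to screen for outliers and asymmetry can be sketched as follows. This is an illustration under assumptions, not the presenters' procedure: it uses Tukey's fences (1.5 times the interquartile range) as a screening rule and a moment-based skewness as a rough check on distribution shape. The data and the function names `iqr_outliers` and `sample_skewness` are hypothetical.

```python
import statistics

def iqr_outliers(values, k=1.5):
    """Flag values outside [Q1 - k*IQR, Q3 + k*IQR] (Tukey's fences)."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]

def sample_skewness(values):
    """Moment-based skewness; values near 0 suggest a symmetric distribution."""
    n = len(values)
    mean = statistics.fmean(values)
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    return m3 / m2 ** 1.5

# Hypothetical concentrations for one analyte (illustrative only)
concentrations = [1.2, 0.8, 1.5, 2.1, 0.9, 1.1, 7.4, 1.3]
print("Possible outliers:", iqr_outliers(concentrations))
print("Skewness:", round(sample_skewness(concentrations), 2))
```

A flagged value is a candidate for review, not for automatic removal; a screening rule like this only points to where the assumptions of a test may be in doubt.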
10. Perform the Tests and Draw Conclusions from the Data
- Perform calculations, evaluate the results, and draw conclusions
- If project objectives are carefully defined and the sampling design is deftly planned, a great deal more can be gleaned from the data than is discussed in the DQA guidance documents (EPA QA/G-9R and QA/G-9S)
11. Why Use DQA?
- Ensures data are used to their full extent and appropriate decisions are made with the data
- Ensures conclusions are defensible
  - This is particularly important if a site is under close scrutiny
- Saves stakeholders money in the long and/or short term
12. Review of a Completed DQA
- Demonstrates components of a DQA
- Highlights the importance of careful examination and analysis of data
- Provides a real-world example of the DQA process
13. Site Background
- Several large soil piles were discovered at a facility
- The origin of the piles and the contamination levels at the site were unknown
- The area had been open to public recreational use for several years
- Litigation risk was very high
15. Primary Objectives
- Determine the nature and extent of contamination in the soil piles and surrounding soils, and determine the risk to human health
- Determine if action is required and, if so, determine the appropriate action
16. Secondary Objectives
- Determine if soils in the piles are different from surrounding soils
- Determine if chemicals are present that can help predict the presence of other chemicals of interest (indicator chemicals)
- Determine the accuracy and applicability of field methods in the area
17. Secondary Objectives
- Developed as a result of having the data analysts involved during DQO and sampling scheme development
- Data analysts proposed options to the client of which the client was previously unaware
- Several secondary objectives were developed to aid development of sampling plans for neighboring cleanup sites
18. Sampling Plan
- Sampling plan was developed to allow all primary and secondary objectives to be met
- Each composite sample within the cluster was split. Laboratory analysis was performed on one portion of the sample and field analysis was performed on the other portion of the sample. This allowed for a determination of how well field screening methods performed compared to fixed laboratory methods
20. Sampling Plan
- Additional field samples were collected between the clusters of fixed laboratory samples to provide additional insight along the length of the piles
- The comparison of fixed laboratory methods and field screening methods allowed for better interpretation of these data
21. Primary Objectives
- None of the soils were contaminated to an extent that posed a risk to human health
- One point of elevated contamination was discovered where the two piles met, but it was defensibly determined that this point also did not pose a threat to human health
- Results and methods were defensible
22. Secondary Objectives
- Soils in the piles were compared to soils on the banks of the outfall and creek as well as other soils surrounding the piles
- Soils in the piles did have higher levels of contamination than surrounding soils; however, the soil piles did not pose a threat to human health
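The slides do not say which test was used to compare pile soils with surrounding soils. As one hedged illustration, a permutation test on the difference in means could be sketched as below. The data, the function name `permutation_test`, and the choice of test are all assumptions, and the sketch treats samples as independent, so it ignores the clustered sampling design used at this site.

```python
import random
import statistics

def permutation_test(group_a, group_b, n_permutations=10000, seed=0):
    """One-sided permutation test: is mean(group_a) greater than mean(group_b)?

    Returns the observed difference in means and an approximate p-value.
    """
    rng = random.Random(seed)  # fixed seed so the result is repeatable
    observed = statistics.fmean(group_a) - statistics.fmean(group_b)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    count = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # randomly reassign group labels
        diff = statistics.fmean(pooled[:n_a]) - statistics.fmean(pooled[n_a:])
        if diff >= observed:
            count += 1
    # Add 1 to numerator and denominator so the p-value is never exactly 0
    return observed, (count + 1) / (n_permutations + 1)

# Hypothetical concentrations (illustrative only)
pile = [4.1, 5.3, 3.8, 6.0, 4.7, 5.5]
surrounding = [2.9, 3.1, 2.4, 3.6, 2.8, 3.3]
diff, p_value = permutation_test(pile, surrounding)
print(f"difference in means = {diff:.2f}, approximate p-value = {p_value:.4f}")
```

A small p-value would support the statement that the piles are more contaminated than their surroundings. Whether those levels matter for human health is a separate comparison against risk-based limits.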
23. Secondary Objectives
- Correlation analysis was performed to determine if indicator chemicals were present
- It was of particular interest to know if the presence of Uranium-235 or Uranium-238 could be used to determine the presence of PCBs
- A usable correlation was not found because PCBs were detected in very few samples
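The correlation step, and the reason sparse detections undermine it, can be sketched as follows. The paired values are hypothetical, and the function name `pearson_r` and the minimum of three detected pairs are assumptions chosen for illustration; the slides do not state which correlation measure was used.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient for two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when one variable is constant")
    return sxy / math.sqrt(sxx * syy)

# Hypothetical paired results; None marks a non-detect for the second analyte
uranium = [1.1, 0.9, 1.4, 2.0, 1.2, 1.7, 0.8, 1.5]
pcb = [None, None, 0.3, None, None, 0.6, None, None]

# Keep only the pairs in which the second analyte was actually detected
pairs = [(u, p) for u, p in zip(uranium, pcb) if p is not None]
print(f"{len(pairs)} of {len(pcb)} samples have a detected PCB result")
if len(pairs) >= 3:
    r = pearson_r([u for u, _ in pairs], [p for _, p in pairs])
    print(f"r = {r:.2f}")
else:
    print("Too few detections to estimate a meaningful correlation")
```

With only two detected pairs, any two points lie on a straight line, so a computed coefficient would carry no information about the relationship between the analytes.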
25. Secondary Objectives
- Field and fixed laboratory methods were compared to determine how field data could be used to aid in sampling other sites
- A multi-tiered approach was used to determine the strength of the relationship between the two methods in the area
26. Field and Fixed Laboratory Results Comparison
- Correlation analysis was used to determine if field and fixed lab methods directly correlated
- False-negative and false-positive rates were determined around field detection limits, background levels, and no-action limits
- Means, standard deviations, and upper confidence limits (UCLs) were computed from both sets of data and compared
- Bubble plots were generated to determine how field and lab measurements corresponded in the soils
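The false-positive and false-negative comparison listed above can be sketched as follows, treating the fixed laboratory result as the reference. This is an illustration under assumptions: the split-sample values, the threshold, and the function name `decision_error_rates` are hypothetical, and the slides do not give the exact definitions the analysts used.

```python
def decision_error_rates(field, lab, threshold):
    """Compare field and lab results against a threshold, with lab as the reference.

    False positive: field >= threshold but lab < threshold.
    False negative: field < threshold but lab >= threshold.
    Rates are fractions of the lab-negative and lab-positive samples, respectively.
    """
    fp = fn = lab_pos = lab_neg = 0
    for field_value, lab_value in zip(field, lab):
        if lab_value >= threshold:
            lab_pos += 1
            if field_value < threshold:
                fn += 1
        else:
            lab_neg += 1
            if field_value >= threshold:
                fp += 1
    # A rate is undefined (nan) when there are no samples in its reference group
    fp_rate = fp / lab_neg if lab_neg else float("nan")
    fn_rate = fn / lab_pos if lab_pos else float("nan")
    return fp_rate, fn_rate

# Hypothetical split-sample results in the same units, and a hypothetical threshold
field = [12, 30, 8, 55, 41, 19, 60, 25]
lab = [10, 22, 15, 58, 35, 27, 49, 18]
fp_rate, fn_rate = decision_error_rates(field, lab, threshold=26)
print(f"false-positive rate = {fp_rate:.2f}, false-negative rate = {fn_rate:.2f}")
```

Repeating the calculation with the threshold set to a field detection limit, a background level, and a no-action limit gives one pair of rates per decision point.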
29. Field and Fixed Lab Methods Comparison Results
- Field results and laboratory results are not usably correlated in a mathematical sense
- Several of the analytes could not be detected at, or below, background with field methods
- Means, standard deviations, and UCLs did not compare well for the analytes of primary interest
- Bubble plots indicated that for most analytes of concern, higher concentrations in the lab methods were associated with higher concentrations in the field methods
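A comparison of means, standard deviations, and UCLs between the two methods might look like the sketch below. The slides do not state how the UCLs were computed, so the normal-approximation formula here is an assumption, as are the data and the function name `ucl95_normal_approx`. For small samples a Student's t quantile (for example from `scipy.stats.t.ppf`) would normally replace the z value, and strongly skewed data may call for a different UCL method altogether.

```python
import math
import statistics

def ucl95_normal_approx(values):
    """Approximate one-sided 95% upper confidence limit on the mean.

    Uses a normal (z) quantile; for small samples a Student's t quantile
    would give a wider, more appropriate limit.
    """
    n = len(values)
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)  # sample standard deviation
    z = statistics.NormalDist().inv_cdf(0.95)  # about 1.645
    return mean + z * sd / math.sqrt(n)

# Hypothetical split-sample results in the same units (illustrative only)
field = [12, 30, 8, 55, 41, 19, 60, 25]
lab = [10, 22, 15, 58, 35, 27, 49, 18]

for name, data in (("field", field), ("lab", lab)):
    mean = statistics.fmean(data)
    sd = statistics.stdev(data)
    ucl = ucl95_normal_approx(data)
    print(f"{name}: mean = {mean:.1f}, sd = {sd:.1f}, UCL95 = {ucl:.1f}")
```

If the two methods agreed well, the three statistics would be close for each analyte; large gaps are the kind of poor comparison the slide reports.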
30. Defensibility
- Site had a high risk of litigation, so defensibility was very important!
- Data were analyzed to ensure the clustered design was handled appropriately
- Appropriate methods were used to handle undetected data (using the detection limit or ½ of the detection limit for undetected values is not a defensible method)
- Outliers were identified and their impact discussed through all phases of statistical analysis
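The point about substitution for undetected values can be illustrated with a short sketch showing how much a simple mean moves depending on the substituted value. The data and the function name `mean_with_substitution` are hypothetical. The slides do not say which censored-data method the analysts used instead; techniques designed for censored data, such as Kaplan-Meier or regression on order statistics, are common choices in this field.

```python
import statistics

def mean_with_substitution(results, fraction):
    """Mean after replacing each non-detect with fraction * its detection limit.

    `results` is a list of (value, detected) pairs; for non-detects,
    `value` holds the detection limit (DL).
    """
    substituted = [v if detected else fraction * v for v, detected in results]
    return statistics.fmean(substituted)

# Hypothetical data: (concentration or detection limit, detected?)
results = [(0.5, False), (0.5, False), (1.8, True), (0.5, False),
           (2.4, True), (1.0, False), (3.1, True), (1.0, False)]

for label, fraction in (("zero", 0.0), ("half DL", 0.5), ("full DL", 1.0)):
    print(f"{label}: mean = {mean_with_substitution(results, fraction):.3f}")
```

When most results are non-detects, as in this made-up set, the estimated mean depends heavily on an arbitrary choice, which is why substitution is hard to defend under scrutiny.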
31. What's in It for Me?
- Saves stakeholders money on current and/or future projects
- Results can withstand close scrutiny
- Reduces the risk of having to resample a site
- Minimizes the chance of remediating a clean site or failing to remediate a contaminated area