Biostatistics Analysis Center, Center for Clinical Epidemiology and Biostatistics, University of Pennsylvania School of Medicine

Provided by: CCEB

1
Biostatistics Analysis Center
Center for Clinical Epidemiology and Biostatistics
University of Pennsylvania School of Medicine
  • Minimum Documentation Requirements
  • Amy Praestgaard
  • March 6, 2008

2
Project Directory File Structure
  • Located on the network to ensure regular back-up.
  • Name of directory usually includes project number
    (e.g., BIO999 or EPI888)
  • Has (at least) the following subdirectories
    • Documentation
    • Data
      • Original
      • Analytic
    • Programs
      • Current
      • Archived
    • Results
      • Current
      • Archived
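
The layout above could be created with a short script. This is only a sketch: the function name, the network root argument, and the project number are illustrative assumptions, and only the subdirectory names come from the slide.

```python
from pathlib import Path

# Subdirectory names are taken from the slide; everything else here
# (function name, arguments) is a hypothetical illustration.
SUBDIRECTORIES = [
    "Documentation",
    "Data/Original",
    "Data/Analytic",
    "Programs/Current",
    "Programs/Archived",
    "Results/Current",
    "Results/Archived",
]


def create_project_tree(network_root, project_number):
    """Create the minimum subdirectories under <network_root>/<project_number>."""
    project_dir = Path(network_root) / project_number
    for sub in SUBDIRECTORIES:
        # exist_ok=True makes the call safe to repeat on an existing project.
        (project_dir / sub).mkdir(parents=True, exist_ok=True)
    return project_dir
```

For example, create_project_tree("N:/projects", "BIO999") would build the tree under a hypothetical network drive N:. Existing folders are left untouched.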

3
Documentation Subdirectory
  • Contact sheet (REQUIRED)
    • Must include names and contact information for
      the PI, Biostatistics faculty members, and BAC
      staff members.
    • Project registration is sufficient for many
      projects.
    • Larger projects may need to include project
      managers, research assistants, and CRCU
      contacts.
  • BAC Project Plan (REQUIRED; will overlap with the
    overall project plan)
    • Study objective(s), outcome definition, and a
      statement of analytic methodology (analysis
      plan). At minimum, a note in the issues log
      should state the BAC project goal, even if it
      is as simple as "this project converts a SAS
      file to an SPSS file."
    • Analytic efforts related to the timeline for
      abstracts, meetings, and manuscripts.
  • Study information (STRONGLY RECOMMENDED)
    • Study protocol or grant proposal, data
      collection forms, meeting minutes or summaries,
      to-do lists (if applicable), and all abstract
      and manuscript drafts.
  • Issues log (STRONGLY RECOMMENDED)
    • Includes a brief description, priority, and
      estimated resolution date for each issue.
    • Keep resolved issues on the list, noted with
      the date resolved.

4
Data Subdirectory: BIO999\Data
  • It is a good idea to separate the original data
    from the created data.
  • Original (Raw) Data: BIO999\Data\Original
    • Original versions of source data (e.g., DVD,
      external hard drive, Excel spreadsheet) must be
      stored in a secure location, such as a locked
      desk drawer.
    • An electronic gold copy of the source data must
      be retained in the data subdirectory. It does
      not hurt to include GOLD in the file name.
  • Analytic Data: BIO999\Data\Analytic
    • Original extract from source data
    • Cleaned data, in original format
      • Incorporates data changes, deletions
      • Does not include derived variables
    • Final analytic data set, including derived
      variables

5
Data Dictionary
  • At minimum, must contain
    • Results from PROC CONTENTS on the analytic
      dataset containing labeled variables.
    • Results from PROC MEANS that include N for each
      variable.
  • A separate document may be necessary for some
    projects.
  • If the database is very large, spend billable
    time creating labels only for the variables you
    will be using for the analysis, unless the client
    specifically asks you to do it for all variables.
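
As a rough illustration of what the minimum data dictionary records (a label and an N for each variable), the sketch below builds the same two pieces of information in plain Python. It is not a substitute for PROC CONTENTS or PROC MEANS. The function name, the toy records, the labels, and the rule that None and blank strings count as missing are all assumptions made for the example.

```python
def data_dictionary(rows, labels):
    """Return one entry per variable: its label and N (non-missing count).

    rows   : list of dicts, one per record (hypothetical in-memory data)
    labels : dict mapping variable name -> descriptive label
    """
    # Collect variable names in first-seen order.
    variables = []
    for row in rows:
        for name in row:
            if name not in variables:
                variables.append(name)

    dictionary = []
    for name in variables:
        # Assumption: None and empty/blank strings are treated as missing.
        n = sum(
            1 for row in rows
            if row.get(name) is not None and str(row.get(name)).strip() != ""
        )
        dictionary.append({
            "variable": name,
            "label": labels.get(name, ""),  # blank label = not yet documented
            "n": n,
        })
    return dictionary


# Made-up records for illustration only.
rows = [
    {"id": 1, "age": 54, "sbp": 132},
    {"id": 2, "age": None, "sbp": 118},
    {"id": 3, "age": 61, "sbp": ""},
]
labels = {"id": "Subject identifier", "age": "Age at enrollment (years)"}

for entry in data_dictionary(rows, labels):
    print(f'{entry["variable"]:<5} N={entry["n"]}  {entry["label"]}')
```

A variable that comes back with a blank label (sbp in this toy example) is one that still needs a label before the dictionary is complete.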

6
Program headers
  • All programs have a header with
    • program name,
    • purpose,
    • location,
    • project name,
    • faculty and BAC name,
    • input, output,
    • last modified,
    • and other relevant details
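
One way to keep headers consistent is to generate them from a template. The sketch below assumes a block-comment layout (the `/* ... */` style that SAS accepts). The slide lists the header items but does not prescribe a format, so the field labels, the function name, and the example values are all hypothetical.

```python
from datetime import date


def program_header(program, purpose, location, project, faculty, bac_analyst,
                   inputs, outputs, last_modified=None, notes=""):
    """Return a block comment listing the header items from the slide."""
    if last_modified is None:
        last_modified = date.today()
    fields = [
        ("Program", program),
        ("Purpose", purpose),
        ("Location", location),
        ("Project", project),
        ("Faculty", faculty),
        ("BAC analyst", bac_analyst),
        ("Input", ", ".join(inputs)),
        ("Output", ", ".join(outputs)),
        ("Last modified", last_modified.isoformat()),
        ("Notes", notes),
    ]
    # Pad labels to a common width so the colons line up.
    width = max(len(label) for label, _ in fields)
    body = [f" * {label:<{width}} : {value}" for label, value in fields]
    return "\n".join(["/*"] + body + [" */"])


# Placeholder values for illustration only.
print(program_header(
    program="table1.sas",
    purpose="Baseline characteristics table",
    location=r"BIO999\Programs\Current",
    project="BIO999",
    faculty="(faculty name)",
    bac_analyst="(analyst name)",
    inputs=[r"BIO999\Data\Analytic\final.sas7bdat"],
    outputs=[r"BIO999\Results\Current\table1.lst"],
    last_modified=date(2008, 3, 6),
))
```

The returned text can be pasted at the top of a program file and updated whenever the program changes.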

7
Code Documentation
  • Code should have a sufficient number of comments,
    especially on macros, complex merges, and code
    for analysis
  • See the Code Complete book for a fuller
    discussion of good documentation
  • Within a program, the following types of code
    must be annotated with a description of purpose
    and expected outcome
    • All macros or functions
    • Any code generating a data set that is to be
      retained for interim or final analysis
    • Any code generating output (e.g., frequency
      tables, test statistics, p-values) for an
      interim or final deliverable.
    • Any code used for data validation, data changes,
      and derived variable calculation.

8
Program Archiving
  • During active analytic periods, log and listing
    files must be archived as follows
    • With the date clearly indicated in the file
      name, as in gmm_8_8Dec07.log and
      gmm_8_8Dec07.lst.
    • In a separate archive or history section of
      the appropriate BAC project folder
    • At least weekly, and more frequently if
      substantial changes are made.
  • Log and listing files can be generated in Windows;
    there is no need to use Unix to generate them.
  • Programs generally should have a date in their
    names in order to keep track of versions
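
The dated file names shown above (gmm_8_8Dec07.log) can be produced mechanically. The sketch below assumes the pattern is the program name, an underscore, then day, three-letter month, and two-digit year. That pattern is inferred from the single example on the slide, and the function names and folder arguments are hypothetical.

```python
import shutil
from datetime import date
from pathlib import Path

# Fixed English abbreviations so the result does not depend on the locale.
MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]


def archived_name(filename, when):
    """Append a date stamp such as 8Dec07 before the file extension."""
    path = Path(filename)
    stamp = f"{when.day}{MONTHS[when.month - 1]}{when:%y}"
    return f"{path.stem}_{stamp}{path.suffix}"


def archive_outputs(program_stem, source_dir, archive_dir, when=None):
    """Copy <stem>.log and <stem>.lst into archive_dir under dated names."""
    when = when or date.today()
    archive_dir = Path(archive_dir)
    archive_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    for suffix in (".log", ".lst"):
        source = Path(source_dir) / f"{program_stem}{suffix}"
        if source.exists():
            target = archive_dir / archived_name(source.name, when)
            shutil.copy2(source, target)  # copy2 also keeps the timestamps
            copied.append(target)
    return copied


print(archived_name("gmm_8.log", date(2007, 12, 8)))  # gmm_8_8Dec07.log
```

Copying rather than moving leaves the current log and listing in place, so the working folder still reflects the latest run.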

9
Documenting Deliverables
  • All interim and final deliverables must include
    the following information
    • Analyst name
    • Deliveree
    • Date produced (good practice to include in file
      name)
    • Input data, output data, and program location
    • Statistical software (including version) used to
      produce results
  • Once distributed, the deliverable will be
    retained in the appropriate project folder
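
A simple completeness check can catch a deliverable that is missing one of the required items before it goes out. In the sketch below the item keys, the function name, and the example record are hypothetical; only the list of required information comes from the slide.

```python
# Keys are illustrative names for the items listed on the slide.
REQUIRED_ITEMS = [
    "analyst",
    "deliveree",
    "date_produced",
    "input_data",
    "output_data",
    "program_location",
    "software_version",
]


def missing_items(record):
    """Return the required items that are absent or blank in a record."""
    return [
        item for item in REQUIRED_ITEMS
        if not str(record.get(item) or "").strip()
    ]


# Placeholder record for illustration only; software_version is left blank.
record = {
    "analyst": "(analyst name)",
    "deliveree": "(recipient name)",
    "date_produced": "2008-03-06",
    "input_data": r"BIO999\Data\Analytic\final.sas7bdat",
    "output_data": r"BIO999\Results\Current\table1.rtf",
    "program_location": r"BIO999\Programs\Current\table1.sas",
    "software_version": "",
}

print(missing_items(record))  # ['software_version']
```

An empty list means every required item has been filled in.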