SMD Data Analysis Tutorial presentation

Title: SMD Data Analysis Tutorial

1
SMD Data Analysis Tutorial

April 7, 2009
Catherine Ball
(ball_at_genome.stanford.edu)
Janos Demeter (jdemeter_at_genome.stanford.edu)

2
SMD Getting Help

Click on the Help menu
Tool-specific links will be listed at the top.
Use the SMD help index to look for specific
subjects
Send e-mail to
array_at_genome.stanford.edu

3
You will learn

How to use SMD Data Analysis Pipeline
Gene Selection and Annotation
Data Filtering
Data Retrieval
Gene Filtering
Clustering and Image Generation
How to use SMD's data repository
How to use SMD's implementation of the
GenePattern data analysis suite

4
Data Retrieval and Analysis

Experiment names will be listed with feature
extraction software indicated.

5
Gene Selection and Annotation

Specify genes or clones
Collapse data by SUID or LUID
Determine UID column
Choose biological annotation
Label result set

6
Gene Selection: All Genes

Ten arrays
All genes
8690 Biosequence IDs used in cluster
Using all genes results in a very long cluster!

7
Gene Selection: Specify Genes or Clones

Use all genes or clones on an array
Select a Genelist from your loader.stanford.edu
account
Enter a list of genes to select. The names
should be separated by two colons
Optionally include controls and empty spots.

8
Gene Selection: Genelists

Ten arrays
500-gene genelist
380 Biosequence IDs used for cluster
Using a genelist limits the data analyzed to a
subset of genes

9
Gene Selection: Retrieving and Collapsing Data

Collapse or averaging occurs within each
individual array. Multiple instances of the same
entity will be combined as specified.
Duplicated entities can be defined in three ways:
Biosequence ID is the identifier for the molecule
in SMD.
Laboratory Unique ID is the identifier for the
source of the sample in the lab.
SPOT is an individual feature on a print.
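The collapse step can be sketched in a few lines. This is only an illustration, not SMD's implementation: the function name `collapse_by_id` and the (identifier, value) input layout are assumptions, and plain averaging is just one way of combining duplicates.

```python
from collections import defaultdict
from statistics import mean

def collapse_by_id(spots):
    """Average all measurements that share an identifier within one array.

    `spots` is a list of (identifier, value) pairs (hypothetical layout);
    value is None for a spot with no usable measurement.
    """
    grouped = defaultdict(list)
    for identifier, value in spots:
        if value is not None:
            grouped[identifier].append(value)
    # One averaged value per identifier; identifiers with no data are dropped.
    return {identifier: mean(values) for identifier, values in grouped.items()}

# Two spots report on SUID_1, so they are combined into one value.
array_1 = [("SUID_1", 1.0), ("SUID_2", -0.5), ("SUID_1", 2.0), ("SUID_3", None)]
print(collapse_by_id(array_1))
```

Collapsing by a different identifier (for example LUID instead of Biosequence ID) only changes which key is used for grouping.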

10
Gene Selection: Collapse by SUID

Ten arrays
500-gene genelist
Data retrieved by Biosequence ID
380 Biosequence IDs used for cluster
Duplicated spots will be averaged

11
Gene Annotation: Biological Annotation

The list includes all information stored within
SMD for any gene from the organism in question.
Not all genes will have all annotations.
Annotations from a genelist (if one was selected)
can be used to describe the genes.

12
Array Annotation: Name Choices

Arrays (hybridizations) are identified in SMD by
slide name (e.g., serial number) and experiment
name, both unique.
Agilent and Affymetrix data sets are further
identified by a result set name, possibly more
than one per hybridization, and not guaranteed to
be unique.

13
SMD Data Analysis Tutorial

How to use SMD Data Analysis Pipeline
Gene Selection and Annotation
Data Filtering
Data Retrieval
Gene Filtering
Clustering and Image Generation
How to use SMD's data repository
How to use SMD's implementation of the
GenePattern data analysis suite

14
Data Filtering

Choose data column to retrieve
Elect to invert reverse dye replicates
Elect to filter by spot flag
Select spot criteria for filtering
Define image presentation options
Retrieve data in background (not shown) - goes to
repository

15
Data Filtering: Choose Data to Retrieve

You can retrieve and cluster any numerical
measurement from your data.
Clustering doesn't necessarily make sense for all
fields.
Default (and most appropriate) fields for
clustering are log ratio (two-channel data) and
signal or intensity (single-channel data).

16
Data Filtering: Selecting Filtering Criteria

Each spot will be individually assessed as
specified, prior to any averaging or collapse.
Each filter can be made active and customized as
desired.
Filters can be combined using logical operators
(filter string), defaulting to a logical AND.
Filters available will be appropriate to the
feature extraction software used.

17
Data Filtering: Default Spot Filters

Regression correlation measures pixel-by-pixel
agreement between the two channels.
Foreground/Background intensities are a simple
measure of signal to noise.
Absolute intensity cutoffs impose a minimum net
signal.
"Failed" and "Is Contaminated" refer to the
quality of the spot material.
Equivalent defaults are presented for Agilent
data.
Affymetrix data can be filtered on detection,
detection p-value, etc.
Any data, including biological annotations, can
be used for customized filters.

18
Spots with low regression correlation
19
Data Filtering: Regression Correlation

Ten arrays
500-gene genelist
Spot flag 0
Regression correlation > 0.6
380 Biosequence IDs used for filtering
Filtering away spots with low regression
correlation removes many spots.

20
Data Filtering: Combinations of Filters

Ten arrays
500-gene genelist
Regression correlation > 0.6
Net intensity in each channel > 350
371 Biosequence IDs selected for clustering
This data set was formed by selecting spots that
are of good quality (via the regression correlation)
and have good intensity in both channels.
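A combined spot filter of this kind amounts to a logical AND of per-spot tests. The sketch below uses the two thresholds from this slide; the field names (`regression_correlation`, `ch1_net`, `ch2_net`) are hypothetical and do not correspond to actual SMD column names.

```python
def passes_filters(spot, min_correlation=0.6, min_net_intensity=350):
    """Return True if a spot passes both filters (logical AND)."""
    good_correlation = spot["regression_correlation"] > min_correlation
    # Net intensity must exceed the cutoff in each channel.
    good_intensity = (spot["ch1_net"] > min_net_intensity
                      and spot["ch2_net"] > min_net_intensity)
    return good_correlation and good_intensity

# Hypothetical spots: B fails the correlation test, C fails on intensity.
spots = [
    {"id": "A", "regression_correlation": 0.9, "ch1_net": 1200, "ch2_net": 800},
    {"id": "B", "regression_correlation": 0.4, "ch1_net": 1500, "ch2_net": 900},
    {"id": "C", "regression_correlation": 0.8, "ch1_net": 300, "ch2_net": 700},
]
kept = [s["id"] for s in spots if passes_filters(s)]
print(kept)
```

Each spot is assessed on its own, before any averaging or collapse, so a gene with several spots can keep some measurements and lose others.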

21
Data Filtering: Image Presentation Options

"Retrieve spot coordinates" will allow you to see
an assembled image of each array after
clustering.
"Show all spots" allows you to view the spots you
filtered out (in addition to the ones that passed
filtering) after clustering. This might slow
down data retrieval.

22
SMD Data Analysis Tutorial

How to use SMD Data Analysis Pipeline
Gene Selection and Annotation
Data Filtering
Data Retrieval
Gene Filtering
Clustering and Image Generation
How to use SMD's data repository
How to use SMD's implementation of the
GenePattern data analysis suite

23
Data Filtering: Retrieve Data in Background

Long-running data retrieval jobs can be submitted
and you'll be e-mailed with a progress report.
Data sets will be saved to your data repository.

24
Data Retrieval

General results and progress
PreClustering (.pcl) file
Data retrieval summary report
Option to deposit data in repository

25
Data Retrieval Summary
26
SMD Data Analysis Tutorial

How to use SMD Data Analysis Pipeline
Gene Selection and Annotation
Data Filtering
Data Retrieval
Gene Filtering
Clustering and Image Generation
How to use SMD's data repository
How to use SMD's implementation of the
GenePattern data analysis suite

27
Gene Filtering

Transform single-channel data
Filter genes based on data distribution
Data centering
Filter genes based on data values
Filter genes and arrays based on spot filter
criteria
Zero-transform data

28
Gene Filtering: Transformation

Single-channel (e.g., Affymetrix) data only.
Adjust arrays for simple cross-array comparison.
Log-transform data for clustering.
May add a constant for variance stabilization
May replace non-positive values with very small
values
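A minimal sketch of such a transformation follows. The choice of log base 2, the function name `log_transform`, and the default `floor` value are assumptions made for illustration; the slides do not state which base or replacement value SMD uses.

```python
import math

def log_transform(values, constant=0.0, floor=1e-6):
    """Log2-transform single-channel intensities.

    `constant` is added to every value before taking the log (a simple
    form of variance stabilization). Any value that is still non-positive
    is replaced by the small positive `floor` so the logarithm is defined.
    """
    transformed = []
    for v in values:
        shifted = v + constant
        if shifted <= 0:
            shifted = floor
        transformed.append(math.log2(shifted))
    return transformed

print(log_transform([8.0, 1024.0]))
```

Replacing non-positive values with a very small number produces strongly negative logs, so the floor should be chosen with the dynamic range of the data in mind.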

29
Gene Filtering: Data Distribution

Rank selects genes whose retrieved value is
in the top Nth percentile.
Deviations selects genes whose retrieved
value is more than a chosen number of standard
deviations above or below the mean.
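Both distribution filters can be sketched as below. The slides do not say over which values the percentile, mean, and standard deviation are computed; this sketch assumes they are taken over all genes within the same array, and that a gene is kept if it qualifies in at least `min_arrays` arrays. All function names are hypothetical.

```python
from statistics import mean, pstdev

def columns_of(matrix):
    """Return the per-array columns of a genes-by-arrays matrix."""
    return [list(col) for col in zip(*matrix)]

def percentile_rank(value, array_values):
    """Percentage of values in one array that are <= value."""
    return 100.0 * sum(1 for v in array_values if v <= value) / len(array_values)

def filter_by_rank(matrix, cutoff=95.0, min_arrays=1):
    """Indices of genes whose percentile rank exceeds cutoff often enough."""
    cols = columns_of(matrix)
    kept = []
    for i, row in enumerate(matrix):
        hits = sum(1 for j, v in enumerate(row)
                   if percentile_rank(v, cols[j]) > cutoff)
        if hits >= min_arrays:
            kept.append(i)
    return kept

def filter_by_deviation(matrix, n_sd=1.0, min_arrays=1):
    """Indices of genes more than n_sd standard deviations from the
    array mean in at least min_arrays arrays."""
    stats = [(mean(col), pstdev(col)) for col in columns_of(matrix)]
    kept = []
    for i, row in enumerate(matrix):
        hits = sum(1 for j, v in enumerate(row)
                   if abs(v - stats[j][0]) > n_sd * stats[j][1])
        if hits >= min_arrays:
            kept.append(i)
    return kept

# Four genes (rows) measured on two arrays (columns); values are made up.
matrix = [[0.0, 0.1], [0.2, -0.1], [3.0, 0.0], [-0.2, 0.0]]
print(filter_by_rank(matrix, cutoff=75.0))
print(filter_by_deviation(matrix, n_sd=1.0))
```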

30
Gene Filtering: Percentile Rank

Ten arrays
500-gene genelist
Regression correlation > 0.6
Net intensity in either channel > 350
Rank > 95 in at least one array
Many data are removed, since only those that were
very intense in the yellow (red) channel are
included.

31
Gene Filtering: Deviation from Mean Value

Ten arrays, 500-gene genelist
Regression correlation > 0.6
Net intensity in either channel > 350
Genes whose Log(Normalized Red/Green) is more
than one standard deviation from the mean in at
least one array
This filter removes data that do not deviate
significantly from the mean, a good way
to identify genes with potentially interesting
behavior.

32
Gene Filtering: Centering Data

Data can be centered at this stage. This
transforms the data so that the mean value is
equal to zero. Images and downloaded files will
reflect this transformation.
During clustering, data can be treated as if they
were centered, but the values of the data are not
affected.
Gene centering is useful for common references.
Array centering amounts to renormalizing each
array, using the spots that pass the spot filter
criteria.

33
Data Centering

Centering sets the average value of a vector to
zero.
This results in a loss of information, but may
reveal important patterns.

34
Data Centering

Gene centering is useful when the actual value of
the ratio is not important or is not meaningful
(e.g., common reference).
Centering is generally not appropriate when using
a biologically meaningful control sample, such as
a matched, untreated sample, or a zero timepoint.

35
Data Transformation: Centering

To illustrate how centering affects data, a small
sample of data was duplicated. A constant was
added to the second copy of each row.
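The same experiment can be reproduced on a single row. This is a small sketch with made-up numbers; `center` is a hypothetical helper that subtracts the mean of a vector, which is what centering a gene (row) or an array (column) amounts to.

```python
from statistics import mean

def center(vector):
    """Subtract the mean so that the vector averages to zero."""
    mu = mean(vector)
    return [v - mu for v in vector]

row = [1.0, 2.0, 3.0]
shifted_copy = [v + 4.0 for v in row]   # same pattern, constant added

# After centering, the original row and its shifted copy are identical:
# the constant offset is the information that centering discards.
print(center(row))
print(center(shifted_copy))
```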

36
Uncentered Data, No Centering Metric During
Clustering
Uncentered Data, Centering Metric During
Clustering
Centered Data, No Centering Metric During
Clustering
Centered Data, Centering Metric During Clustering
37
Gene Filtering: Center Genes
Centered
Uncentered

Ten arrays, 500-gene genelist
Regression correlation > 0.6
Net intensity in either channel > 350
Genes centered
No effect on number of biosequence IDs clustered,
but data values are changed (centered data are
displayed on the left)

38
Gene Filtering: Data Values

Cutoff requires data to exceed a user-defined
value in at least A arrays. Think hard before
using this filter. Especially when data are
centered, you could be losing important
information.

39
Gene Filtering: Spot Filter Criteria

Genes can be screened out if they do not meet the
spot criteria a given percentage of the time, as
specified by the user.
Arrays can be similarly filtered out if they do
not meet the spot filter criteria.

40
Spot Filtering vs. Gene Filtering
Gene filters remove the genes that do not meet
the filter criteria often enough. This reduces
the number of genes.
Spot filters remove individual data points. That
means there will be more missing (gray) data.
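The difference can be made concrete with a small sketch. It assumes that a value removed by a spot filter is stored as `None`, and that the gene filter keeps a gene only if it has data in at least a given fraction of arrays; the function names and the `min_fraction` parameter are hypothetical.

```python
def fraction_present(row):
    """Fraction of arrays in which this gene still has a value."""
    return sum(1 for v in row if v is not None) / len(row)

def filter_genes(data, min_fraction=0.8):
    """Keep genes that passed the spot filters in enough arrays."""
    return {gene: row for gene, row in data.items()
            if fraction_present(row) >= min_fraction}

# None marks a data point removed by a spot filter (a gray cell).
data = {
    "geneA": [0.1, None, 0.3, 0.2],     # 75% good data
    "geneB": [1.2, 0.9, 1.1, 1.0],      # 100% good data
    "geneC": [None, None, 0.5, None],   # 25% good data
}
print(sorted(filter_genes(data, min_fraction=0.75)))
```

The same test applied to columns instead of rows would filter out arrays rather than genes.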
41
Gene Filtering: Zero Time Point Transformation

Data can be transformed by subtracting one state
of a series from all other data.

42
Gene Filtering: Zero Time Point Transformation
Subtract the values from the first time point
from all the other time points
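As a sketch, the zero transformation of one gene's time series looks like this. The function name `zero_transform` and the handling of missing values (`None`) are assumptions; for log ratios, the subtraction corresponds to dividing by the ratio at the reference time point.

```python
def zero_transform(row, reference_index=0):
    """Subtract the reference time point from every value in the series."""
    reference = row[reference_index]
    if reference is None:
        # Without a reference value nothing in the row can be transformed.
        return [None] * len(row)
    return [None if v is None else v - reference for v in row]

time_course = [0.5, 1.5, 2.0, -0.5]   # made-up log ratios at t0, t1, t2, t3
print(zero_transform(time_course))
```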
43
Gene Filtering: Results

Download PreClustering files (.pcl)
Go to GenePattern
Summary report
Deposit to repository
Another round of filtering
Proceed to clustering

44
Gene Filtering: Data Retrieval Summary Report
45
SMD Data Analysis Tutorial

How to use SMD Data Analysis Pipeline
Gene Selection and Annotation
Data Filtering
Data Retrieval
Gene Filtering
Clustering and Image Generation
How to use SMD's data repository
How to use SMD's implementation of the
GenePattern data analysis suite

46
Clustering and Image Generation

Partitioning options
Clustering metric selections
Correlated genes
Image generation options

47
Clustering Algorithms
In microarray studies, we often use clustering
algorithms to help us identify patterns in
complex data. For example, we can randomize the
data used to represent this painting and see if
clustering will help us visualize the pattern.
48
Clustering algorithms
The painting is sliced into rows which are then
randomized.
49
Clustering algorithms
Rows ordered by hierarchical clustering with
nodes flipped to optimize ordering
50
How do we compare expression profiles?

Treat expression data for a gene as a
multidimensional vector.
Decide on a distance metric to compare the
vectors.
Plenty to choose from:
Pearson correlation, Euclidean distance,
Manhattan distance, etc.

51
Expression Vectors

Crucial concept for understanding clustering
Each gene is represented by a vector whose
coordinates are its values (log(ratio)) in each
experiment:
x = log(ratio) in expt1
y = log(ratio) in expt2
z = log(ratio) in expt3
etc.

52
Clustering Metric Selections

Genes and arrays can be clustered.
Pearson correlation treats vectors as if they
were the same (unit) length.
Euclidean distance will be affected by both the
direction and the amplitude of the vectors.

53
Distance Metrics

Distances are measured between expression
vectors
Distance metrics define the way we measure
distances
Many different ways to measure distance
Euclidean distance
Pearson correlation coefficient(s)
Manhattan distance
Mutual information
Kendall's tau
etc.
Each has different properties and can reveal
different features of the data

54
Euclidean Distance

The Euclidean distance metric detects similar
vectors by identifying those that are closest in
space. In this example, A and C are closest to
one another.

55
Pearson Correlation

The Pearson correlation disregards the magnitude
of the vectors but instead compares their
directions. In this example, Gene A and Gene B
have the same slope, so would be most similar to
each other.

56
Distance Metric: Pearson vs. Euclidean
C
A
B

By Euclidean distance, A and C are most similar.
By Pearson correlation, A and B are most similar.
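The contrast can be checked numerically. The three vectors below are made up to mimic the situation in the figure (they are not the values from the slide): B has the same shape as A at a larger amplitude, while C is close to A in space but runs in the opposite direction.

```python
import math
from statistics import mean

def euclidean(x, y):
    """Straight-line distance between two expression vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def pearson(x, y):
    """Centered Pearson correlation (undefined for constant vectors)."""
    mx, my = mean(x), mean(y)
    covariance = sum((a - mx) * (b - my) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mx) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - my) ** 2 for b in y))
    return covariance / (spread_x * spread_y)

A = [1.0, 2.0, 3.0]
B = [4.0, 8.0, 12.0]   # same direction as A, larger amplitude
C = [3.0, 2.0, 1.0]    # near A in space, opposite trend

print("Euclidean A-B:", euclidean(A, B), " A-C:", euclidean(A, C))
print("Pearson   A-B:", pearson(A, B), " A-C:", pearson(A, C))
```

Euclidean distance groups A with C, while Pearson correlation groups A with B, so the choice of metric changes which genes end up as neighbors in a cluster.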

57
Clustering: Tree Displays

Clustered gene arrays are displayed adjacent to
most similar arrays.
The nodes of the trees indicate the members of an
array and the degree of similarity to its
neighbor.

58
Hierarchical Clustering

1. Calculate the distance between all genes. Find
the smallest distance. If several pairs share the
same similarity, use a predetermined rule to
decide between alternatives.
2. Fuse the two selected clusters to produce a new
cluster that now contains at least two objects.
Calculate the distance between the new cluster
and all other clusters.
3. Repeat steps 1 and 2 until only a single cluster
remains.
4. Draw a tree representing the results.
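The procedure above can be sketched in plain Python. This is an illustration only: it assumes Euclidean distance and average linkage (the slides do not say which linkage SMD uses), breaks ties by taking the first pair found, and returns the tree as nested tuples of row indices instead of drawing it.

```python
import math
from itertools import combinations

def euclidean(x, y):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def average_linkage(cluster_a, cluster_b, vectors):
    """Mean pairwise distance between the members of two clusters."""
    distances = [euclidean(vectors[i], vectors[j])
                 for i in cluster_a["members"] for j in cluster_b["members"]]
    return sum(distances) / len(distances)

def hierarchical_cluster(vectors):
    """Return a nested-tuple tree over the row indices of `vectors`."""
    clusters = [{"members": [i], "tree": i} for i in range(len(vectors))]
    while len(clusters) > 1:
        # Find the closest pair of clusters; ties go to the first pair.
        a, b = min(combinations(range(len(clusters)), 2),
                   key=lambda p: average_linkage(clusters[p[0]],
                                                 clusters[p[1]], vectors))
        merged = {"members": clusters[a]["members"] + clusters[b]["members"],
                  "tree": (clusters[a]["tree"], clusters[b]["tree"])}
        # Replace the two fused clusters with the new one and repeat.
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append(merged)
    return clusters[0]["tree"]

# Four made-up expression vectors forming two obvious pairs.
vectors = [[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.0, 5.2]]
print(hierarchical_cluster(vectors))
```

Recomputing every linkage on each pass is slow for thousands of genes; it is written this way only to keep the correspondence with the steps visible.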

59
Clustering: Array Clustering
No Array Clustering
With Array Clustering
60
Clustering: Self-Organizing Maps

Map of n partitions, that is modeled on the
expression data, where each partition in the map
has an associated vector
Genes are assigned to partitions of most similar
genes
Neighboring partitions are more similar to each
other than they are to distant partitions

61
Clustering: Correlated Genes

SMD can produce a file listing the
best-correlated genes, for each gene retrieved.

62
Clustering: Visualization

Click on the image to get a dynamic display.
Click on the TreeView button for another dynamic
option.
Click on one of the other options to see static
displays with or without the spot images.
Download files (.cdt, .atr, .gtr, report) for use
with other tools (e.g., TreeView).
Add cluster or pre-clustering file to your
repository

63
Clustering Display: Adjacent Cluster and
Clustered Spot Images
64
Clustering Display: Hierarchical Cluster View

Interactive view of cluster
Link to GO term analysis (green nodes) to
evaluate sub-clusters.

65
SMD Data Analysis

Using SMD Data Analysis Pipeline
Repository Tools
SVD
Synthetic Gene Tool
kNNimpute
GenePattern tools

66
SMD Help File Formats
67
File Formats: Pre-clustering (PCL) File
Names and orders of arrays (if arrays are not
clustered)
68
File Formats: Clustered Data Table (CDT) File
69
File Formats: Gene Cluster Text (GCT) File
70
File Formats: Class (CLS) File
71
Using Your Repository: PCL Deposits
72
Using the Repository: CDT File Options
CDT files have a few other options:
GeneXplorer
Clustering with Proxy and Spot images
TreeView
Clustering with Spot images
Clustering with Proxy images
73
Viewing Repository Entries

Name
Organism
Number of genes
Number of arrays
Size of file
Date uploaded
Description
Data retrieval summary

74
Editing Entries -- How to Share!

Change repository entry name
Change description
Add access to a repository entry to a GROUP
Add access to a repository entry to an SMD USER

75
SMD Data Analysis

Data Analysis Background
Clustering algorithms
Data centering
Using SMD Data Analysis Pipeline
Gene Selection and Annotation
Data Filtering
Data Retrieval
Gene Filtering
Clustering and Image Generation
Repository Tools
SVD
Synthetic Gene Tool
kNNimpute
GenePattern tools

76
SVD: Singular Value Decomposition

The goal of SVD is to find a set of patterns that
describe the greatest amount of variance in a
dataset
SVD determines unique orthogonal (or
uncorrelated) gene and corresponding array
expression patterns (i.e. "eigengenes" and
"eigenarrays," respectively) in the data
Patterns might be correlated with biological
processes OR might be correlated with technical
artifacts
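To make the idea of a dominant pattern concrete, the sketch below approximates the first "eigengene" (the pattern across arrays that explains the most variance) by power iteration. This is not how SMD computes SVD; a numerical library routine would normally be used, and the function name and iteration count here are assumptions.

```python
import math

def first_eigengene(matrix, iterations=200):
    """Approximate the dominant right singular vector and singular value
    of a genes-by-arrays matrix by power iteration on A^T A.

    Assumes the starting vector is not orthogonal to the dominant
    pattern (otherwise the normalization below divides by zero).
    """
    n_rows, n_cols = len(matrix), len(matrix[0])
    v = [1.0 / math.sqrt(n_cols)] * n_cols
    for _ in range(iterations):
        # u = A v gives one value per gene; w = A^T u one value per array.
        u = [sum(a * b for a, b in zip(row, v)) for row in matrix]
        w = [sum(matrix[i][j] * u[i] for i in range(n_rows))
             for j in range(n_cols)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # The singular value is the length of A v.
    u = [sum(a * b for a, b in zip(row, v)) for row in matrix]
    sigma = math.sqrt(sum(x * x for x in u))
    return v, sigma

# Made-up data in which every gene follows the same pattern (1, 2, 2)
# across three arrays, scaled by a different amount.
matrix = [[1.0, 2.0, 2.0], [2.0, 4.0, 4.0], [-1.0, -2.0, -2.0]]
pattern, strength = first_eigengene(matrix)
print(pattern, strength)
```

Whether a recovered pattern reflects biology or a technical artifact still has to be judged from the experimental annotation.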

77
SVDmethod
78
SVD Display in SMD
79
SMD Data Analysis

Data Analysis Background
Clustering algorithms
Data centering
Using SMD Data Analysis Pipeline
Gene Selection and Annotation
Data Filtering
Data Retrieval
Gene Filtering
Clustering and Image Generation
Repository Tools
SVD
kNNimpute
Synthetic Gene Tool
GenePattern tools

80
KNNimpute: The Missing Values Problem

Microarrays can have systematic or random missing
values.
Some algorithms aren't robust to missing values.
A large literature on parameter estimation exists.
What's best to do for microarrays?

81
KNNimpute Algorithm

Idea: use genes with similar expression profiles
to estimate missing values
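A simplified sketch of the idea follows. It assumes missing values are stored as `None`, measures similarity with a root-mean-square Euclidean distance over the arrays both genes share, and fills each gap with the plain average of the k nearest genes; published variants may weight neighbors by similarity, so this should not be read as the exact KNNimpute algorithm.

```python
import math

def distance(row_a, row_b):
    """RMS Euclidean distance over arrays where both rows have data."""
    pairs = [(a, b) for a, b in zip(row_a, row_b)
             if a is not None and b is not None]
    if not pairs:
        return math.inf
    return math.sqrt(sum((a - b) ** 2 for a, b in pairs) / len(pairs))

def knn_impute(matrix, k=2):
    """Return a copy of `matrix` in which each None is replaced by the
    mean of that column in the k most similar rows that have a value."""
    imputed = [list(row) for row in matrix]
    for i, row in enumerate(matrix):
        for j, value in enumerate(row):
            if value is not None:
                continue
            # Candidate neighbors must have a measurement in column j.
            candidates = [(distance(row, other), other[j])
                          for m, other in enumerate(matrix)
                          if m != i and other[j] is not None]
            candidates.sort(key=lambda c: c[0])
            nearest = [v for _, v in candidates[:k]]
            if nearest:
                imputed[i][j] = sum(nearest) / len(nearest)
    return imputed

# Made-up data: the first gene is missing its value on the third array.
matrix = [
    [1.0, 2.0, None],
    [1.0, 2.0, 3.0],
    [1.0, 2.0, 5.0],
    [9.0, 9.0, 9.0],
]
print(knn_impute(matrix, k=2)[0])
```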

82
SMD Data Analysis

Data Analysis Background
Clustering algorithms
Data centering
Using SMD Data Analysis Pipeline
Gene Selection and Annotation
Data Filtering
Data Retrieval
Gene Filtering
Clustering and Image Generation
Repository Tools
SVD
kNNimpute
Synthetic Gene Tool
GenePattern tools

83
Synthetic Genes

Purpose
average data based on arbitrary groupings of
genes/probes
- for biological reasons
- for technical reasons
Can average data using
- common genelists
- your own genelists
- annotations in pcl file
After averaging
- a new row for the synthetic gene data
- Original data can be removed/included
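Averaging by arbitrary groupings can be sketched as below. The data layout (a dictionary of probe rows, a dictionary of named groups) and the function name `synthetic_genes` are assumptions for illustration; missing values are taken to be `None` and are ignored in the average.

```python
from statistics import mean

def synthetic_genes(data, groups):
    """Build one averaged row per named group of probes.

    data:   {probe_id: [value per array]}
    groups: {synthetic_gene_name: [probe_id, ...]}
    """
    result = {}
    for name, probes in groups.items():
        rows = [data[p] for p in probes if p in data]
        if not rows:
            continue  # none of the group's probes are in this data set
        averaged = []
        for column in zip(*rows):
            present = [v for v in column if v is not None]
            averaged.append(mean(present) if present else None)
        result[name] = averaged
    return result

# Made-up probes; two of them report on the same hypothetical gene.
data = {
    "probe1": [1.0, 2.0],
    "probe2": [3.0, None],
    "probe3": [0.5, 0.5],
}
groups = {"GENE_X": ["probe1", "probe2"], "GENE_Y": ["probe9"]}
print(synthetic_genes(data, groups))
```

The new rows can be appended to the original data or used in place of it, matching the include/remove choice described above.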

84
Synthetic Genes

Common lists available (only mouse and human
data):
UniGene (all clones/oligos that report on a given
UniGene ID will be averaged and shown as the
UniGene ID)
Entrez Gene ID (same as above, but for Entrez
Gene ID)
These lists are useful to collapse data by gene,
rather than by biosequence ID/LUID.
They allow comparison of experiments between
different platforms (oligo print to cDNA print,
or spotted arrays to Agilent arrays) where the
arrays don't share common reporters. They can also
be used to compare cDNA prints with HEEBO/MEEBO
arrays.
These synthetic gene lists are updated on a
regular basis.

85
Synthetic Genes

Other common synthetic gene lists
chromosome arms
cytobands
5 Mb tiles based on GoldenPath mappings
Tissue types
tumor types
processes
Additional lists: see
http://smd.stanford.edu/help/synthGenes.shtml

86
SMD Data Analysis

Using SMD Data Analysis Pipeline
Repository Tools
SVD
Synthetic Gene Tool
kNNimpute
GenePattern tools

87
What is GenePattern?

Software package developed at the Broad Institute
(Jill P. Mesirov's group)
http://www.broad.mit.edu/cancer/software/genepattern/
Reasons to choose this package:
Large number of microarray analysis tools (>90)
Ability to create pipelines (reproducible
research)
Ease of adding new modules to existing ones

88
How to find GP in SMD?
From Data retrieval
From repository
89
Terms in GenePattern

Module (Analysis/Visualization/Utility): program
that does analysis, displays or executes some
other transformation of a file
Pipelines: chained modules, where output from one
is input to the next
Suites: groupings of modules/pipelines
Jobs:
execution of a module/pipeline
persistent
results are deleted after one week
Go to web-site for navigation

90
GenePattern comments

SMD uses pcl
GenePattern uses gct (among others)
Converters: gct -> pcl, pcl -> gct
Most tools in GenePattern need a full dataset - use
ImputeMissingValuesKNN first
Most default values are designed for Affymetrix
data - evaluate each option carefully
(GeneCruiser)
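For orientation, a pcl-to-gct conversion can be sketched as follows. This is not the converter provided by SMD or GenePattern. It assumes the commonly documented tab-delimited layouts: a pcl header of ID, NAME, GWEIGHT followed by array names, an optional EWEIGHT row, and a gct file that starts with `#1.2`, a line with the row and column counts, and a `Name`/`Description` header.

```python
def pcl_to_gct(pcl_text):
    """Convert tab-delimited pcl text to gct text (assumed layouts)."""
    lines = [line.split("\t") for line in pcl_text.strip("\n").split("\n")]
    header, rows = lines[0], lines[1:]
    # Skip the optional EWEIGHT row that may follow the header.
    if rows and rows[0][0] == "EWEIGHT":
        rows = rows[1:]
    samples = header[3:]            # columns after ID, NAME, GWEIGHT
    out = ["#1.2",
           f"{len(rows)}\t{len(samples)}",
           "\t".join(["Name", "Description"] + samples)]
    for fields in rows:
        # Keep ID and NAME, drop GWEIGHT, pass data cells through as-is.
        out.append("\t".join([fields[0], fields[1]] + fields[3:]))
    return "\n".join(out) + "\n"

# A made-up two-gene, two-array pcl file with one missing value.
pcl_text = (
    "UID\tNAME\tGWEIGHT\tarray1\tarray2\n"
    "EWEIGHT\t\t\t1\t1\n"
    "geneA\tfirst gene\t1\t0.5\t-1.2\n"
    "geneB\tsecond gene\t1\t\t0.3\n"
)
print(pcl_to_gct(pcl_text))
```

Empty cells are passed through unchanged, which is why imputing missing values first matters for tools that need a full dataset.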

91
Input/output files in GenePattern

Called through specific pcl file
Files in your repository
Upload data from desktop
Any file that has a url
Module to get data directly from GEO

92
GenePattern modules by category: Clustering

Clustering
Hierarchical clustering/HierarchicalClusteringPCL
Self-organizing maps (SOM)
K-means clustering
Non-negative matrix factorization (NMF) (Brunet
et al., 2004) is an alternative method for class
discovery. Rather than clustering genes, NMF
detects context-dependent patterns of gene
expression. Requires all positive values.
Consensus clustering (Monti et al., 2003) runs a
selected clustering algorithm against
perturbations of the original data set. The
result is a consensus matrix that assesses the
stability of discovered clusters. Supported
clustering methods: hierarchical clustering,
K-means clustering, self-organizing maps (SOM),
and non-negative matrix factorization (NMF).
Clusters genes or samples, not both.
SubMap (Hoshida Y, et al. PLoS ONE 2(11): e1195,
2007) is an unsupervised method that estimates
the significance of an association between
subclasses observed in two independent data sets.
The subclass labels are predetermined as manually
assigned phenotypes or by clustering prior to the
application of the SubMap algorithm.
Corresponding visualizers
HierarchicalClusteringViewer
SOMClusterViewer
HeatMapViewer
Etc
Clustering result example

93
GenePattern modules by category: Marker Selection

ComparativeMarkerSelection (similar to SAM)
ComparativeMarkerSelectionViewer
Visualize and explore data produced by the method
ExtractComparativeMarkerResults: extract data
based on the analysis, create genelist
Gene Set Enrichment Analysis (GSEA)
GSEALeadingEdgeViewer

94
GenePattern modules: Marker Selection

Goal: Given phenotypically distinct classes, find
markers with distinct expression patterns (in
different classes)

95
GenePattern modules: Marker Selection

Visualize result using ComparativeMarkerSelectionViewer

96
GenePattern modules: GSEA

A method to determine whether an a priori defined
set of genes shows statistically significant,
concordant differences between two biological
states
Very similar to comparative marker selection, but
applied to gene sets rather than genes

97
GenePattern modules: GSEA

Molecular Signatures DB
http://www.broad.mit.edu/gsea/msigdb/index.jsp
Gene sets: groups of gene symbols
Gene sets are versioned

98
GenePattern modules: GSEA

Requirements
Expression dataset
Class file
Chip file

Number of classes
Number of slides
99
GenePattern modules: GSEA

How it works
Sorts rows based on how well a metric correlates
with the class assignment (similar to marker
selection tool)
Scores gene sets (using a scoring method) by
walking down the ranked list of genes, increasing
a running-sum statistic when a gene is in the
gene set and decreasing it when it is not.
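The running-sum walk can be sketched as follows. This is the simple unweighted form, in which every hit and every miss takes an equal-sized step; the GSEA module itself may weight steps by each gene's correlation with the class, and it also estimates significance by permutation, neither of which is shown here. The function name is hypothetical.

```python
def enrichment_score(ranked_genes, gene_set):
    """Maximum deviation from zero of an unweighted running sum.

    ranked_genes: genes ordered by correlation with the class assignment
    gene_set:     the a priori defined set being scored
    """
    members = set(gene_set) & set(ranked_genes)
    n_hits = len(members)
    n_miss = len(ranked_genes) - n_hits
    if n_hits == 0 or n_miss == 0:
        return 0.0
    running, best = 0.0, 0.0
    for gene in ranked_genes:
        if gene in members:
            running += 1.0 / n_hits      # step up for a gene in the set
        else:
            running -= 1.0 / n_miss      # step down for a gene outside it
        if abs(running) > abs(best):
            best = running
    return best

ranked = ["g1", "g2", "g3", "g4", "g5", "g6"]   # made-up ranking
print(enrichment_score(ranked, {"g1", "g2"}))   # set at the top of the list
print(enrichment_score(ranked, {"g5", "g6"}))   # set at the bottom
```

A set concentrated at the top of the ranked list gets a large positive score, one concentrated at the bottom a large negative score, and a set scattered throughout stays near zero.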

100
GenePattern modules: GSEA

Output: results can be viewed in web-browser
Further analysis: GSEALeadingEdgeViewer

101
Create pipeline/suite

Pipeline can be created from a path
by concatenating individual modules
From zip file
Pipelines can be exported
Pipelines can be private or public

102
Creating new modules

SMD curators/programmers can create/upload new
modules.
If you have any programs you would like to share
or use in SMD, please let us know.

103
GenePattern modules: Class Prediction

Goal: Given phenotypically distinct classes, find
a gene expression signature that accurately
predicts class membership.
Computational methodology: divide data into
training and test sets
Goal:
achieve high predictive power
avoid over-fitting

104
GenePattern modules: Class Prediction

Simple example:
KNN classifier,
k = 5, 2 genes, 2 classes

105
GenePattern modules: Class Prediction

Simple example:
KNN classifier,
k = 5, 2 genes, 2 classes

106
GenePattern modules: Class Prediction

Simple example:
KNN classifier,
k = 5, 2 genes, 2 classes

107
GenePattern modules: Class Prediction

Evaluation on independent test set
Build the classifier on the train set.
Assess prediction performance on test set.
Maximize generalization/Avoid overfitting.
Performance measure
error rate

108
GenePattern modules: Class Prediction

K-nearest-neighbors (KNN) classifies an unknown
sample by assigning it the phenotype label most
frequently represented among the k nearest known
samples (Golub and Slonim et al., 1999). In
GenePattern, the user selects a weighting factor
for the 'votes' of the nearest neighbors:
unweighted (all votes are equal); weighted by the
reciprocal of the rank of the neighbor's
distance (the closest neighbor is given weight
1/1, the next closest neighbor is given weight 1/2,
and so on); or weighted by the reciprocal of the
distance.
Weighted Voting (Slonim et al., 2000) classifies
an unknown sample using a simple weighted voting
scheme. Each gene in the classifier 'votes' for
the phenotype class of the unknown sample. A
gene's vote is weighted by how closely its
expression correlates with the differentiation
between phenotype classes in the training data
set.
Support Vector Machines (SVM) is designed for
multiple class classification (Rifkin et al.,
2003). The algorithm creates a binary SVM
classifier for each class by computing a maximal
margin hyperplane that separates the given class
from all other classes, that is, the hyperplane
with maximal distance to the nearest data point.
The binary classifiers are then combined into a
multiclass classifier. For an unknown sample, the
assigned class is the one with the largest
margin.
CART (Breiman et al., 1984) builds Classification
And Regression Trees for predicting continuous
dependent variables (regression) and categorical
dependent variables (classification). It works by
recursively splitting the feature space into a
set of non-overlapping regions and then
predicting the most likely value of the dependent
variable within each region. A classification
tree represents a set of nested if-then
conditions that allows for the prediction of the
value of the categorical dependent variable based
on the observed values of the feature variables.
A regression tree is similar but allows for the
prediction of the value of a continuous dependent
variable instead.
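Of the four methods, k-nearest-neighbors is the simplest to sketch, including the three vote-weighting choices described above. This is an illustration with made-up two-gene samples, not the GenePattern module: the function name, the `weighting` option values, and the use of Euclidean distance are all assumptions.

```python
import math
from collections import defaultdict

def knn_classify(sample, training, k=5, weighting="none"):
    """Classify `sample` by a vote among its k nearest training samples.

    training:  list of (expression_vector, class_label) pairs
    weighting: "none" (equal votes), "rank" (1/rank of the neighbor),
               or "distance" (1/distance to the neighbor)
    """
    neighbors = sorted(((math.dist(sample, vector), label)
                        for vector, label in training),
                       key=lambda n: n[0])[:k]
    votes = defaultdict(float)
    for rank, (dist, label) in enumerate(neighbors, start=1):
        if weighting == "rank":
            weight = 1.0 / rank
        elif weighting == "distance":
            weight = 1.0 / dist if dist > 0 else math.inf
        else:
            weight = 1.0
        votes[label] += weight
    return max(votes, key=votes.get)

# Two genes per sample, two classes, as in the simple example (k = 5).
training = [
    ([1.0, 1.0], "A"), ([1.2, 0.8], "A"), ([0.9, 1.1], "A"),
    ([3.0, 3.0], "B"), ([3.2, 2.9], "B"), ([2.8, 3.1], "B"),
]
print(knn_classify([1.1, 1.0], training, k=5))
print(knn_classify([3.0, 2.9], training, k=5, weighting="rank"))
```

To estimate the error rate, the classifier would be built on a training set and applied to samples held out in a separate test set.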

109
GenePattern modules: Pathway Analysis

ARACNE (Algorithm for the Reconstruction of
Accurate Cellular Networks) (Margolin, A., et
al., BMC Bioinformatics, 2006. 7(Suppl 1): S7)
is an algorithm that reverse engineers a
gene regulatory network from microarray gene
expression data. It attempts to predict targets of
select transcription factors from a microarray
dataset.
MINDY (Modulator Inference by Network Dynamics)
algorithm computationally infers genes that
modulate the activity of a transcription factor
at post-transcriptional levels (Wang et al.,
2006). The algorithm uses mutual information
(MI) to measure the mutual dependence of the
transcription factor (TF) and its target gene to
predict modulators of TF activity.

110
GenePattern modules: Survival Analysis

SurvivalCurve: draws a survival curve based on a cls
file
SurvivalDifference: tests if there is a
difference between two or more survival curves
based on sample classes defined by genomic data.
The log-rank test (Mantel-Haenszel test) and the
generalized Wilcoxon test can be used.

111
GenePattern modules

Many other modules
Projection methods: PCA, NMF (Non-negative Matrix
Factorization)
Tools for SNP analysis
Tools for proteomics data
Etc

112
SMD Office Hours

Grant S201
Mondays 3 - 5 pm
Wednesdays 2 - 4 pm

113
SMD Staff
Gavin Sherlock Co-Investigator
Catherine Ball Director
Patrick Brown Co-Investigator
Farrell Wymore Lead Programmer
Michael Nitzberg Database Administrator
Zac Zachariah Systems Administrator
Tatiparthy Reddy Scientific Curator
Janos Demeter Computational Biologist
Heng Jin Scientific Programmer
Maria Mao Software Engineer
Jeremy Hubble Scientific Programmer
