Pfizer Project - PowerPoint PPT Presentation

Slides: 36
Provided by: csU72
Title: Pfizer Project


1
Pfizer Project
  • Spring 2008

2
Outline
  • The problem
  • Our approach
  • Examples of RadViz usage in single-point optimization
  • Examples of RadViz usage in multipoint optimization
  • Next steps

3
The Problem
4
Our Approach
5
Intro to RadViz
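The transcript keeps no text for this slide. As a reminder of the standard RadViz mapping, the sketch below places one anchor per dimension evenly on the unit circle and draws each record at the average of the anchors, weighted by its normalized values. The function name `radviz_project` and the toy records are illustrative assumptions, not taken from the slides.

```python
import math

def radviz_project(rows):
    """Project rows (equal-length lists of numbers) to 2-D RadViz coordinates."""
    n_dims = len(rows[0])
    # One anchor per dimension, evenly spaced on the unit circle.
    anchors = [
        (math.cos(2 * math.pi * j / n_dims), math.sin(2 * math.pi * j / n_dims))
        for j in range(n_dims)
    ]
    # Min-max normalize each dimension to [0, 1] so the weights are comparable.
    lows = [min(r[j] for r in rows) for j in range(n_dims)]
    highs = [max(r[j] for r in rows) for j in range(n_dims)]
    points = []
    for r in rows:
        w = [
            (r[j] - lows[j]) / (highs[j] - lows[j]) if highs[j] > lows[j] else 0.0
            for j in range(n_dims)
        ]
        total = sum(w)
        if total == 0:
            points.append((0.0, 0.0))  # a record with all-zero weights sits at the center
            continue
        x = sum(wj * ax for wj, (ax, _) in zip(w, anchors)) / total
        y = sum(wj * ay for wj, (_, ay) in zip(w, anchors)) / total
        points.append((x, y))
    return points

# Toy records (hypothetical): two dominated by one dimension, one balanced.
toy = [[1.0, 0.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0], [1.0, 1.0, 1.0, 1.0]]
coords = radviz_project(toy)
```

A record dominated by one dimension is pulled to that dimension's anchor, while a balanced record lands near the center. This is why the order of dimensions around the circle (the "layout" in later slides) changes the picture.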
6
CDK-2 Dataset
  • Original
    • 192 dimensions
    • 17080 compounds
  • Single objective
    • 357 active compounds
    • 1458 gray compounds
    • 15265 inactive compounds
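
The class shares quoted on the following slides (2.1%, 8.5%, 89.4%) follow directly from these counts. A small check, with variable names chosen only for illustration:

```python
# Class counts as listed on the slide.
counts = {"active": 357, "gray": 1458, "inactive": 15265}
total = sum(counts.values())  # should equal the 17080 compounds on the slide

# Percentage of the dataset in each class, rounded to one decimal place.
shares = {label: round(100 * n / total, 1) for label, n in counts.items()}
print(total, shares)
```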

7
CDK-2 Dataset Random Layout
  • CDK-2 dataset: 192 dimensions, 17080 compounds
  • 357 active (2.1%), 1458 gray (8.5%), 15265 inactive (89.4%)
  • Random layout
8
CDK-2 Dataset Random Layout
  • CDK-2 dataset: 192 dimensions, 17080 compounds
  • 357 active (2.1%), 1458 gray (8.5%), 15265 inactive (89.4%)
  • Random layout
  • Pearson correlations: removed 33 dimensions (cutoff 0.8)
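
The slide gives only the cutoff (0.8) and the number of dimensions removed. One common way to apply such a filter is a greedy pass that drops any dimension whose absolute Pearson correlation with an already kept dimension exceeds the cutoff. The sketch below assumes that reading; the use of the absolute value, the scan order, and the names `pearson` and `drop_correlated` are all assumptions.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0  # constant column: treated here as uncorrelated
    return sxy / math.sqrt(sxx * syy)

def drop_correlated(columns, cutoff=0.8):
    """columns: dict of name -> values. Returns (kept, removed) name lists."""
    kept, removed = [], []
    for name, values in columns.items():
        # ASSUMPTION: compare against kept dimensions only, using |r| > cutoff.
        if any(abs(pearson(values, columns[k])) > cutoff for k in kept):
            removed.append(name)
        else:
            kept.append(name)
    return kept, removed

# Hypothetical toy descriptors: d2 is almost a multiple of d1.
toy_columns = {
    "d1": [1.0, 2.0, 3.0, 4.0, 5.0],
    "d2": [2.1, 3.9, 6.2, 8.0, 9.9],
    "d3": [5.0, 1.0, 4.0, 2.0, 3.0],
}
kept, removed = drop_correlated(toy_columns)
```

With this scan order the first member of a correlated pair survives. A different tie-breaking rule would keep a different but equally sized set.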
9
  • CDK-2 dataset: 192 dimensions, 17080 compounds
  • 357 active (2.1%), 1458 gray (8.5%), 15265 inactive (89.4%)
  • Class Discrimination Layout Algorithm: 24 dimensions per class
  • Selection: 3917 compounds
  • 172 active (4.4%), 326 gray (8.3%), 3419 inactive (87.3%)
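
The transcript names the Class Discrimination Layout Algorithm and its "dimensions per class" setting but does not define how dimensions are chosen. As a purely illustrative reading, the sketch below scores each dimension by how far one class's mean sits from the mean of all other records, then keeps the top few per class. The scoring rule and the name `top_dimensions_per_class` are assumptions, not the method used in the project.

```python
import statistics

def top_dimensions_per_class(rows, labels, per_class):
    """Pick the `per_class` best-separating dimension indices for each class.

    ASSUMPTION: the score is |mean(class) - mean(rest)| divided by the overall
    standard deviation of the dimension. The slides do not state the criterion.
    """
    n_dims = len(rows[0])
    spread = [statistics.pstdev([r[j] for r in rows]) or 1.0 for j in range(n_dims)]
    chosen = {}
    for cls in sorted(set(labels)):
        inside = [r for r, lab in zip(rows, labels) if lab == cls]
        outside = [r for r, lab in zip(rows, labels) if lab != cls]
        scores = []
        for j in range(n_dims):
            gap = abs(
                statistics.fmean([r[j] for r in inside])
                - statistics.fmean([r[j] for r in outside])
            )
            scores.append((gap / spread[j], j))
        scores.sort(reverse=True)  # highest score first
        chosen[cls] = [j for _, j in scores[:per_class]]
    return chosen

# Hypothetical toy data: each class is high in exactly one dimension.
toy_rows = [
    [9.0, 1.0, 5.0], [8.0, 2.0, 5.5],   # active
    [1.0, 9.0, 5.2], [2.0, 8.0, 5.1],   # gray
    [1.5, 1.5, 9.0], [2.5, 2.0, 8.5],   # inactive
]
toy_labels = ["active", "active", "gray", "gray", "inactive", "inactive"]
chosen = top_dimensions_per_class(toy_rows, toy_labels, per_class=1)
```

In a RadViz layout the anchors of the dimensions chosen for one class would then be placed next to each other on the circle, so records of that class are pulled toward a common arc.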
10
  • CDK-2 dataset: 192 dimensions, 17080 compounds
  • 357 active (2.1%), 1458 gray (8.5%), 15265 inactive (89.4%)
  • Class Discrimination Layout Algorithm: 24 dimensions per class
  • Selection: 1018 compounds
  • 95 active (9.3%), 111 gray (10.9%), 812 inactive (79.8%)
11
  • CDK-2 dataset: 192 dimensions, 17080 compounds
  • 357 active (2.1%), 1458 gray (8.5%), 15265 inactive (89.4%)
  • Class Discrimination Layout Algorithm: 24 dimensions per class
  • Selection: 162 compounds
  • 17 active (10.5%), 18 gray (11.1%), 127 inactive (78.4%)
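
The three selections on slides 9 to 11 shrink from 3917 to 162 compounds while the active share rises from 4.4% to 10.5%. One way to compare them, not used in the slides themselves, is the enrichment factor: the active rate in the selection divided by the 2.1% active rate of the full dataset. The variable names below are illustrative.

```python
# Full-dataset baseline from the slides.
baseline_active, baseline_total = 357, 17080
baseline_rate = baseline_active / baseline_total

# Selection size -> number of actives, as listed on slides 9, 10 and 11.
selections = {3917: 172, 1018: 95, 162: 17}

# Enrichment factor = active rate in the selection / active rate overall.
enrichment = {
    size: (actives / size) / baseline_rate
    for size, actives in selections.items()
}
for size, factor in enrichment.items():
    print(f"{size} compounds: {factor:.1f}x enrichment in actives")
```

On these counts the enrichment is roughly 2x, 4.5x and 5x, so most of the gain comes from the first two narrowing steps.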
12
(No Transcript)
13
K-Means 5 and Scaffold Entire Dataset
14
K-Means 10 and Scaffold Entire Dataset
15
K-Means 14 and Scaffold Entire Dataset
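Slides 13 to 15 cluster the entire dataset with K-Means at 5, 10 and 14 clusters. The transcript does not say which implementation, distance, or initialization was used. The sketch below is plain Lloyd's algorithm with squared Euclidean distance and random initial centers, offered only as a reference for what "K-Means k" means here. The name `kmeans` and the toy points are assumptions.

```python
import random

def kmeans(points, k, max_iter=100, seed=0):
    """Plain Lloyd's algorithm. Returns (labels, centers)."""
    rng = random.Random(seed)
    # ASSUMPTION: initial centers are k distinct records drawn at random.
    centers = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest center.
        new_labels = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous center
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers

# Hypothetical toy data: two well-separated groups of three points.
toy_points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
labels, centers = kmeans(toy_points, k=2)
```

The resulting cluster label can then be used as a color channel on a RadViz plot, as in the "Colored by K-Means" slides.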
16
CDK-2 Medium Size Selection Subset
  • CDK-2 selection: 1018 compounds
  • 95 active (9.3%), 111 gray (10.9%), 812 inactive (79.8%)
  • Class Discrimination Layout Algorithm: 12 dimensions per class
17
CDK-2 Medium Size Selection Subset
  • CDK-2 selection: 1018 compounds
  • 95 active (9.3%), 111 gray (10.9%), 812 inactive (79.8%)
  • Class Discrimination Layout Algorithm: 12 dimensions per class
  • Multi-Column Flattening
18
CDK-2 Medium Size Selection Colored by K-Means 10 Clusters
  • CDK-2 selection: 1018 compounds
  • 95 active (9.3%), 111 gray (10.9%), 812 inactive (79.8%)
  • Class Discrimination Layout Algorithm: 12 dimensions per class
  • All records shown
19
CDK-2 Medium Size Selection Colored by K-Means 10 Clusters
  • CDK-2 selection: 1018 compounds
  • 95 active (9.3%), 111 gray (10.9%), 812 inactive (79.8%)
  • Class Discrimination Layout Algorithm: 12 dimensions per class
  • Active records shown
20
CDK-2 Medium Size Selection Colored by K-Means 10 Clusters
  • CDK-2 selection: 1018 compounds
  • 95 active (9.3%), 111 gray (10.9%), 812 inactive (79.8%)
  • Class Discrimination Layout Algorithm: 12 dimensions per class
  • Inactive records shown
21
K-Means 10 Cluster Distribution Entire Dataset
22
K-Means 10 and Scaffold: CDK-2 Medium Size Selection Subset
23
CDK-2 Medium Size Selection Colored by Scaffold
  • CDK-2 selection: 1018 compounds
  • 95 active (9.3%), 111 gray (10.9%), 812 inactive (79.8%)
  • Class Discrimination Layout Algorithm: 12 dimensions per class
  • All records shown
24
CDK-2 Medium Size Selection Colored by Scaffold
  • CDK-2 selection: 1018 compounds
  • 95 active (9.3%), 111 gray (10.9%), 812 inactive (79.8%)
  • Class Discrimination Layout Algorithm: 12 dimensions per class
  • Active records shown
25
CDK-2 Medium Size Selection Colored by Scaffold
  • CDK-2 selection: 1018 compounds
  • 95 active (9.3%), 111 gray (10.9%), 812 inactive (79.8%)
  • Class Discrimination Layout Algorithm: 12 dimensions per class
  • Inactive records shown
26
Scaffold Distribution (0-14, 16-21): Entire Dataset
  • Scaffold 00: 12716 compounds
  • 74 active, 1111 gray, 11531 inactive
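
Subtracting the Scaffold 00 counts from the full-dataset counts on slide 6 shows how unevenly the actives are spread. Scaffold 00 holds about three quarters of the compounds but only 74 of the 357 actives. The arithmetic below uses only numbers from the slides; the variable names are illustrative.

```python
# Full CDK-2 dataset (slide 6) and Scaffold 00 (this slide).
dataset = {"active": 357, "gray": 1458, "inactive": 15265}
scaffold_00 = {"active": 74, "gray": 1111, "inactive": 11531}

# Everything that is not Scaffold 00, obtained by subtraction.
others = {label: dataset[label] - scaffold_00[label] for label in dataset}

share_of_dataset = sum(scaffold_00.values()) / sum(dataset.values())
active_rate_00 = scaffold_00["active"] / sum(scaffold_00.values())
active_rate_others = others["active"] / sum(others.values())

print(f"Scaffold 00 share of dataset: {share_of_dataset:.1%}")
print(f"Active rate in Scaffold 00:   {active_rate_00:.2%}")
print(f"Active rate in other scaffolds: {active_rate_others:.2%}")
```

This imbalance is presumably why the next slide repeats the distribution without Scaffold 0.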
27
Scaffold Distribution (1-14, 16-21): Entire Dataset
28
Multi-objective Dataset A
  • Original
    • 190 dimensions
    • 2824 total compounds
  • Preprocessed
    • Filter: molecular weight < 500
    • Filter: CLOGP < 5
    • 190 dimensions
    • 2505 total compounds
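
The two preprocessing filters reduce the set from 2824 to 2505 compounds, removing 319. A minimal sketch of such a filter follows, assuming each compound record carries a molecular weight and a CLOGP value. The `Compound` class, its field names, and the sample records are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Compound:
    name: str
    mol_weight: float  # molecular weight
    clogp: float       # calculated logP

def passes_filters(c, max_weight=500.0, max_clogp=5.0):
    """Keep compounds with molecular weight < 500 and CLOGP < 5 (both strict)."""
    return c.mol_weight < max_weight and c.clogp < max_clogp

# Hypothetical records, not taken from Dataset A.
library = [
    Compound("cmpd-1", 342.4, 2.1),
    Compound("cmpd-2", 612.7, 3.3),   # fails the weight filter
    Compound("cmpd-3", 455.0, 5.6),   # fails the CLOGP filter
    Compound("cmpd-4", 499.9, 4.9),
]
kept = [c for c in library if passes_filters(c)]
print([c.name for c in kept])
```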

  • Selectivity: 141 highly selective (5.6%), 183 selective (7.3%), 212 moderate (8.5%), 1190 weak (47.5%), 779 very weak (31.1%)
  • Activity: 145 highly active (5.8%), 178 active (7.1%), 187 moderate (7.5%), 998 weak (39.8%), 997 very weak (39.8%)
29
Multiobjective Data A Random Layout Colored by Activity
  • Multiobjective Data A: 190 dimensions, 2505 compounds
  • 145 highly active (5.8%), 178 active (7.1%), 187 moderate (7.5%), 998 weak (39.8%), 997 very weak (39.8%)
  • Random layout
  • Pearson correlations: removed 52 dimensions (cutoff 0.8)
30
Multiobjective Data A Colored by Activity
  • Multiobjective Data A: 190 dimensions, 2505 compounds
  • 145 highly active (5.8%), 178 active (7.1%), 187 moderate (7.5%), 998 weak (39.8%), 997 very weak (39.8%)
  • Class Discrimination Layout Algorithm (by Activity): 10 dimensions per class
  • Selection: 411 compounds
  • 57 highly active (13.9%), 47 active (11.4%), 46 moderate (11.2%), 151 weak (36.7%), 110 very weak (26.8%)
31
Multiobjective Data A Selection Subset Colored by Selectivity
  • Data A selection subset: 411 compounds
  • 50 highly selective (12.2%), 58 selective (14.1%), 60 moderate (14.6%), 195 weak (47.4%), 48 very weak (11.7%)
  • Class Discrimination Layout Algorithm (by Selectivity): 5 dimensions per class
32
Multiobjective Data A Selection Subset Colored by Activity
  • Data A selection subset: 411 compounds
  • 57 highly active (13.9%), 47 active (11.4%), 46 moderate (11.2%), 151 weak (36.7%), 110 very weak (26.8%)
  • Class Discrimination Layout Algorithm (by Selectivity): 5 dimensions per class
33
Multiobjective Data A Selection Subset Colored by Selectivity
  • Data A selection subset: 411 compounds
  • 50 highly selective (12.2%), 58 selective (14.1%), 60 moderate (14.6%), 195 weak (47.4%), 48 very weak (11.7%)
  • Class Discrimination Layout Algorithm (by Selectivity): 10 dimensions per class
  • Multi-Column Flattening
34
Multiobjective Data A Selection Subset Colored by Activity
  • Data A selection subset: 411 compounds
  • 57 highly active (13.9%), 47 active (11.4%), 46 moderate (11.2%), 151 weak (36.7%), 110 very weak (26.8%)
  • Class Discrimination Layout Algorithm (by Selectivity): 10 dimensions per class
  • Multi-Column Flattening
35
Next Steps