CORALSEA - PowerPoint PPT Presentation

About This Presentation
Title:

CORALSEA

Slides: 42
Provided by: VittorioCa
Tags: coralsea | carlo | method | monte


Transcript and Presenter's Notes

Title: CORALSEA


1
CORALSEA
Workflow
2
CORALSEA is a software tool for building quantitative structure-property / structure-activity relationships (QSPRs/QSARs).
  • Molecular structure is represented in CORALSEA by SMILES (simplified molecular input-line entry system).
  • For details, please see http://www.daylight.com/dayhtml/doc/theory/theory.smiles.html

3
For this demo of CORALSEA we used our model from the article "The definition of the molecular structure for potential anti-malaria agents by the Monte Carlo method", Struct. Chem. 2013, 24, 1369-1381. You can develop a better model, but for now please follow our suggestions.
4
The first action is to prepare the SMILES file, which is the input for CORALSEA.
Each compound should be represented by (1) the set-type symbol, (2) the ID, which can be a CAS (Chemical Abstracts Service) number or any other number, (3) the SMILES, and (4) the endpoint value.
The set-type symbol indicates whether the compound belongs to the sub-training set, the calibration set ("-"), or the test set.
The sub-training set is the developer of the model, the calibration set is the critic of the model, and the test set is the estimator of the model.
  • 1 COc1ccc2c(c1)NC(C)C(CCCCCCC)C2O 7.332
  • 2 COc1ccc2c(c1)NC(C)CC2O 4.903
  • 3 OC1c2ccccc2NC(C)C1CCCCCCC 6.979
  • 4 OC1c2ccccc2NC(C)C1CCCCCCCCC 7.400
  • 5 OC1c3ccccc3NC(C)C1C2CCCCC2 5.652
  • -6 OC1c3ccccc3NC(C)C1c2ccccc2 6.270
  • 7 OC2c3ccccc3NC(C)C2Cc1ccccc1 5.207
  • 8 OC1c2ccccc2NC(C)C1Br 7.110
  • -9 OC1c2ccccc2NC(C)C1\CC\CCCCCCC 7.824
  • 10 CC(CCCCCCC)C1C(O)c2ccccc2NC1C 7.472
  • 12 OC2c3ccccc3NC(C)C2/CC/c1ccccc1 5.827
  • 13 COc1ccc2NC(C)C(Br)C(O)c2c1 5.934
  • -14 Cc1ccc2NC(C)C(Br)C(O)c2c1 6.583
  • 15 Brc1ccc2NC(C)C(Br)C(O)c2c1 6.470
  • 17 Fc1ccc2NC(C)C(Br)C(O)c2c1 6.903
  • 18 Clc1ccc2NC(C)C(CCCCCC)C(O)c2c1 4.336
  • 19 COc2cccc3NC(C)C(Cc1ccccc1)C(O)c23 5.675
  • -21 COc1ccc3c(c1)NC(C)C(Cc2ccccc2)C3O 5.859
  • -22 COc1cccc2NC(C)C(C(O)c12)c3ccccc3 5.295

MyFile.txt
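A minimal sketch of how one row of such a file could be read. It assumes (the slides do not state this) that the fields are separated by whitespace and that the set-type symbol, when present, is written directly in front of the ID, as in "-6". The names Compound and parse_row are hypothetical and are not part of CORALSEA.

```python
from typing import NamedTuple


class Compound(NamedTuple):
    set_type: str      # leading symbol of the first field, "" if none is present
    compound_id: str
    smiles: str
    endpoint: float


def parse_row(row: str) -> Compound:
    """Split one 'typeID SMILES endpoint' row into its four parts."""
    first, smiles, value = row.split()
    # Assumption: a leading non-alphanumeric character is the set-type symbol.
    if not first[0].isalnum():
        set_type, compound_id = first[0], first[1:]
    else:
        set_type, compound_id = "", first
    return Compound(set_type, compound_id, smiles, float(value))


row = parse_row("-6 OC1c3ccccc3NC(C)C1c2ccccc2 6.270")
print(row.set_type, row.compound_id, row.smiles, row.endpoint)
```

Grouping the parsed rows by set_type then gives the sub-training, calibration, and test sets separately.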
5
It is a good idea to reserve some substances as an "invisible" validation set for the final estimation of the model.
  • The format of the file for this validation set is the following:
  • (1) the number of compounds;
  • (2) the list of compounds in the above-mentioned format: type, ID, SMILES, endpoint value.
  • 10
  • 11 OC1c2ccccc2NC(C)C1C\CC\CCCCCC 6.728
  • 16 Clc1ccc2NC(C)C(Br)C(O)c2c1 6.900
  • 20 COc2ccc3NC(C)C(Cc1ccccc1)C(O)c3c2 4.624
  • 27 Clc1ccc3c(c1)NC(C)C(Cc2ccccc2)C3O 4.805
  • 32 Clc1cc2c(cc1Cl)NC(C)C(C2O)c3ccccc3 6.456
  • 40 Clc1cc2c(cc1OC)NC(C)C(CC)C2O 7.559
  • 42 Clc1cc2c(cc1OC)NC(C)C(CCCCCCC)C2O 8.530
  • 43 Clc1cc2c(cc1OC)NC(C)C(CCCCCCCCC)C2O 8.779
  • 51 CC(CCCCC)C1C(O)c2cc(Cl)c(cc2NC1C)OC 7.830
  • 52 Clc1cc2c(cc1OC)NC(C)C(\CC\CCCCC)C2O 7.975

MyInput.txt
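A small sketch of a consistency check for a validation file of this shape: the first non-empty line gives the number of compounds, and the remaining lines are the compounds themselves. The function name read_validation is hypothetical, and whitespace-separated fields are an assumption.

```python
def read_validation(lines):
    """Return the compound rows after checking them against the declared count."""
    rows = [ln.strip() for ln in lines if ln.strip()]
    declared = int(rows[0])   # first line: the number of compounds
    data = rows[1:]
    if len(data) != declared:
        raise ValueError(f"declared {declared} compounds, found {len(data)}")
    # Each row becomes a tuple of its whitespace-separated fields.
    return [tuple(r.split()) for r in data]


demo = [
    "2",
    "16 Clc1ccc2NC(C)C(Br)C(O)c2c1 6.900",
    "40 Clc1cc2c(cc1OC)NC(C)C(CC)C2O 7.559",
]
print(read_validation(demo))
```

A check like this catches a header count that was not updated after compounds were added to or removed from the list.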
6
To start your work, download CORALSEA.zip from www.insilico.eu/coral. When the download is finished, place the folder "CORALSEA" on your computer
7
and put your data (i.e. MyTRNCLBTST.txt) in the folder MyCORALSEA.
8
The contents of MyCORALSEA are the following:
9
To carry out a QSPR/QSAR analysis of data prepared for a CLASSIFICATION MODEL, do the following:
  • Put TRNCLBTST-1.txt in the folder.
  • Put Input-1.txt in the folder.
  • Click CORALSEA.exe.

TRNCLBTST.txt is the file that contains the training (TRN), calibration (CLB), and test (TST) sets. Input.txt contains the data that are not visible while the model is being built.
10
The following appears on your screen.
Click the Load method button.
11
The following appears on your screen.
Insert the name TRNCLBTST-1.txt in the text box.
12
The following appears on your screen.
Click SAVE SYSTEM.
13
The following appears on your screen.
Restart the program and click Load system.
14
The following appears on your screen.
Click OK.
15
The following appears on your screen.
This plot relates to the external "invisible" validation set.
16
The following appears on your screen.
The file Output-1.txt contains the statistical characteristics for the validation set (Output-1.txt is placed in the folder Model).
17
To carry out a QSPR/QSAR analysis of data prepared for a REGRESSION MODEL, do the following:
  • Put TRNCLBTST.txt in the folder.
  • Put Input-1.txt in the folder.
  • Click CORALSEA.exe.

TRNCLBTST.txt is the file that contains the training (TRN), calibration (CLB), and test (TST) sets. Input.txt contains the data that are not visible while the model is being built.
18
The following appears on your screen.
Insert the name TRNCLBTST-1.txt in the text box (INSERT). After this, select Classic Scheme or Balance of Correlation (SELECT) for your QSPR/QSAR investigation.
19
The following appears on your screen.
Two actions: (1) define the method and (2) save the method.
20
The following appears on your screen.
You can include graph invariants in addition to SMILES attributes.
21
The following appears on your screen.
You can use the classic scheme, the balance of correlations, and ideal slopes C1,C1.
22
The following appears on your screen.
You can choose your mode, e.g. (1) define Dstart = 0.25 and (2) Nepoch = 20; after this you must (3) click Save method, otherwise the method remains the same.
23
The following appears on your screen.
Click Search for preferable model (T,N).
24
The following appears on your screen.
The program will carry out the Monte Carlo optimization with various values of the threshold and the number of epochs. When the calculation is completed, the preferable values of the threshold and the number of epochs can be found in the file Search/BestMDL.txt.
25
The contents of the file Search/BestMDL.txt will be approximately the following.
One can see that the preferable threshold (T) is 2 and the preferable number of epochs (N) is 15. One can use this information to build up a robust model.
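The idea of searching for a preferable pair (T, N) can be sketched as a plain grid search. The slides do not say which criterion the program uses to rank the pairs, so the scoring function below is an invented placeholder, and the names search_preferable and toy_score are hypothetical.

```python
from itertools import product


def search_preferable(thresholds, epochs, score):
    """Return the (T, N) pair with the highest score over the whole grid."""
    return max(product(thresholds, epochs), key=lambda tn: score(*tn))


def toy_score(t, n):
    # Made-up scoring function, for illustration only; it peaks at T=2, N=15.
    return -((t - 2) ** 2) - ((n - 15) ** 2) / 100


best_t, best_n = search_preferable(range(1, 6), range(5, 31, 5), toy_score)
print(best_t, best_n)  # prints "2 15" for this toy score
```

In a real run the score would come from the statistics of the model built with each (T, N), not from a formula.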
26
An attempt to build up a robust model:
  • Create the folder MyCORALSEA-T2-N15 (a copy of MyCORALSEA).
  • Run CORALSEA.exe in this folder, MyCORALSEA-T2-N15.
  • Click Load method.
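Creating the copy of the working folder can also be scripted. The sketch below uses only the Python standard library; the helper name clone_workdir is hypothetical, and the naming pattern simply follows the MyCORALSEA-T2-N15 example above.

```python
import shutil
from pathlib import Path


def clone_workdir(source: Path, threshold: int, n_epoch: int) -> Path:
    """Copy `source` to a sibling folder named <source>-T<threshold>-N<n_epoch>."""
    target = source.with_name(f"{source.name}-T{threshold}-N{n_epoch}")
    # copytree raises FileExistsError if the target folder already exists,
    # so an earlier copy is never silently overwritten.
    shutil.copytree(source, target)
    return target


# Example call (needs an existing MyCORALSEA folder in the current directory):
# clone_workdir(Path("MyCORALSEA"), threshold=2, n_epoch=15)
```

Keeping one folder per (T, N) pair leaves the original MyCORALSEA folder and its search results untouched.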

27
The following appears on your screen.
  • (1) Insert Nepoch = 15,
  • (2) click Building up preferable model (T,N),
  • (3) insert Threshold = 2, and
  • (4) click Continue.
28
The following appears on your screen.
Click Yes.
29
The program will gradually calculate the model.

30
When the model is ready, the screen will be the following.
Click Save system.
31
The folder Model contains the parameters of the QSPR/QSAR model.
The file Output-1.txt contains the statistics for the invisible validation set.
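The slides do not list which statistics Output-1.txt holds. As an illustration of the kind of figures usually reported for a regression model, the sketch below computes the squared correlation coefficient and the root-mean-square error between observed and predicted endpoint values. The function name regression_stats is hypothetical, and the predicted values in the example are invented.

```python
import math


def regression_stats(observed, predicted):
    """Return (r2, rmse): squared Pearson correlation and root-mean-square error."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    r2 = cov ** 2 / (var_o * var_p)
    rmse = math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)
    return r2, rmse


observed = [6.728, 6.900, 4.624]   # endpoint values from the validation list
predicted = [6.5, 7.1, 5.0]        # invented predictions, for illustration only
print(regression_stats(observed, predicted))
```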
32
When the model is ready, the screen will be the following.
Click Load system.
33
The following will appear on the screen.
  • (1) Insert the name MyInput.txt instead of Input-1.txt.
  • (2) Click Start of DCW and Endpoint calculation for SMILES input file.

34
The following will appear on the screen.
After these actions, the file model/Output.txt will contain the results of the calculation for the compounds from MyInput.txt. Click OK.
35
The following will appear on the screen.
You will see a graphical representation for the sub-training, calibration, test, and validation sets.
36
The contents of model/Output.txt will be the following.
Last, but not least
37
One can calculate the model for an individual SMILES.
  • (1) Insert the SMILES in the indicated box.
  • (2) Click Start of DCW and Endpoint Calculation for Inserted SMILES.

38
The following appears on your screen.
See the file Model/DemoDesc.txt.
39
The contents of Model/DemoDesc.txt are the following.
DCW is DCW(2,15) for NC(CCCNC(N)N)C(O)O, Endpoint = 2.9412. This example is only a demo; NC(CCCNC(N)N)C(O)O is apparently outside the domain of applicability.
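In the CORAL approach, as described in the published method papers, DCW is a sum of correlation weights of SMILES attributes, and the endpoint is a linear function of it, Endpoint = C0 + C1 * DCW(T, N). The sketch below assumes that form. The tokenizer is deliberately simplistic, and the weights, C0, and C1 are invented for illustration; they are not taken from the Model folder.

```python
import re

# Assumption: attributes are single SMILES symbols, with Cl and Br kept whole.
TOKEN = re.compile(r"Cl|Br|.")


def dcw(smiles, weights):
    """Sum the correlation weights of the attributes found in a SMILES string."""
    # Attributes without a weight (e.g. blocked as rare) contribute zero.
    return sum(weights.get(tok, 0.0) for tok in TOKEN.findall(smiles))


def endpoint(smiles, weights, c0, c1):
    """Linear one-variable model: endpoint = c0 + c1 * DCW."""
    return c0 + c1 * dcw(smiles, weights)


# Invented numbers, for illustration only.
toy_weights = {"C": 0.5, "N": 1.0, "O": -0.25, "(": 0.125, ")": 0.125}
print(dcw("NC(C)O", toy_weights))                      # 2.0
print(endpoint("NC(C)O", toy_weights, c0=1.0, c1=2.0))  # 5.0
```

With the real correlation weights and coefficients of a saved model, the same two steps would give the value reported for an inserted SMILES.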
40
These slides have shown the "technology"; to understand the "philosophy", please read the file "ReadMe.pdf".
41
Some definitions
Thank you for your attention! CORALSEA TEAM