Szilárd Dóránt - PowerPoint PPT Presentation

Transcript and Presenter's Notes

1
Scientific technical presentation: JChem Base
  • Szilárd Dóránt

Feb 2008
2
Contents
  • Introduction
  • Structural overview
  • Compatibility
  • Administration
  • JChem tables
  • Fingerprints
  • Structural search
  • Structure cache
  • Standardization
  • Search options
  • JSP example
  • API examples
  • Performance
  • Future plans

3
Introduction to JChem Base
High-performance Java-based tools for the storage,
search and retrieval of chemical structures and
associated data. The components can be integrated
into web-based or standalone applications in
association with other ChemAxon tools.
4
Structural overview
  • Web browser / Application / Web application
  • JChem Base API: chemical logic, structure cache
  • JDBC driver: standard interface to the RDBMS
  • RDBMS (e.g. Oracle, MySQL, etc.): storage and
    security
5
Compatibility and integration
6
Administration with JChemManager
7
The property table
  • The property table stores information about JChem
    structure tables, including
  • Fingerprint parameters
  • Custom standardization rules
  • Other table options and information
  • Database-related licence keys
  • More than one property table can be used; each
    property table represents a particular JChem
    environment.

8
The structure of JChem tables
9
Table types
  • Molecules: specific structures, e.g. single
    molecules, mixtures, salts, polymers
  • Reactions: single-step reactions
  • Any structures: all types of structures are
    allowed, but no structure type-specific searching

10
Chemical Hashed Fingerprints
  • Chemical Hashed Fingerprints encode structural
    patterns in bit strings
  • If structure A is a substructure of structure B,
    every bit that is set in A's fingerprint is also
    set in B's fingerprint
  • Tanimoto similarity of hashed fingerprints can be
    used for diversity analysis and similarity search
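
The two fingerprint properties above can be sketched with plain java.util.BitSet. This is an illustration only, not the JChem implementation: the class and method names are hypothetical, the bit positions are made up, and returning 1.0 for two empty fingerprints is an assumed convention.

```java
import java.util.BitSet;

// Hypothetical helper, not part of the JChem API.
final class FingerprintSketch {

    // Screening test: every bit set in the query must also be set in the target.
    // Passing the screen is necessary for a substructure match, not sufficient.
    static boolean passesScreen(BitSet query, BitSet target) {
        BitSet missing = (BitSet) query.clone();
        missing.andNot(target); // bits of the query that the target lacks
        return missing.isEmpty();
    }

    // Tanimoto coefficient: |A AND B| / |A OR B|.
    static double tanimoto(BitSet a, BitSet b) {
        BitSet common = (BitSet) a.clone();
        common.and(b);
        BitSet union = (BitSet) a.clone();
        union.or(b);
        int unionCount = union.cardinality();
        // Assumed convention: two empty fingerprints count as identical.
        return unionCount == 0 ? 1.0 : (double) common.cardinality() / unionCount;
    }

    // Builds a 512-bit fingerprint with the given (made-up) bits set.
    static BitSet bits(int... positions) {
        BitSet bs = new BitSet(512);
        for (int p : positions) {
            bs.set(p);
        }
        return bs;
    }

    public static void main(String[] args) {
        BitSet a = bits(3, 17, 42);
        BitSet b = bits(3, 17, 42, 99, 200, 311);
        System.out.println(passesScreen(a, b)); // true
        System.out.println(passesScreen(b, a)); // false
        System.out.println(tanimoto(a, b));     // 0.5
    }
}
```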

11
Structural search in database
  • Two-stage method provides optimal performance
  • Rapid pre-screening reduces the number
    of possible hit candidates
  • Chemical Hashed Fingerprints are used
    for substructure and superstructure searches
  • Hash code is used for duplicate
    filtering (usually during compound
    registration)
  • Graph search algorithm is used to determine the
    final hit list
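
A rough sketch of the two-stage idea, under stated assumptions: the Entry class, the search method and the toy table are hypothetical, and String.contains is only a stand-in for the graph search (it is not chemically meaningful). The point is that the expensive check runs only on candidates that survive the fingerprint screen.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.BitSet;
import java.util.List;
import java.util.function.BiPredicate;

// Hypothetical illustration, not the JChem search engine.
final class TwoStageSearchSketch {

    // One table row: id, precomputed fingerprint, structure as text.
    static final class Entry {
        final int id;
        final BitSet fingerprint;
        final String structure;

        Entry(int id, BitSet fingerprint, String structure) {
            this.id = id;
            this.fingerprint = fingerprint;
            this.structure = structure;
        }
    }

    static List<Integer> search(BitSet queryFp, String query, List<Entry> table,
                                BiPredicate<String, String> exactMatcher) {
        List<Integer> hits = new ArrayList<>();
        for (Entry e : table) {
            // Stage 1: cheap bitwise pre-screen.
            BitSet missing = (BitSet) queryFp.clone();
            missing.andNot(e.fingerprint);
            if (!missing.isEmpty()) {
                continue; // cannot be a hit, skip the expensive check
            }
            // Stage 2: expensive verification, only for surviving candidates.
            if (exactMatcher.test(query, e.structure)) {
                hits.add(e.id);
            }
        }
        return hits;
    }

    static BitSet bits(int... positions) {
        BitSet bs = new BitSet(512);
        for (int p : positions) {
            bs.set(p);
        }
        return bs;
    }

    public static void main(String[] args) {
        List<Entry> table = Arrays.asList(
                new Entry(1, bits(1, 2, 3), "CCO"), // passes screen, verified
                new Entry(2, bits(1), "CC"),        // rejected by the screen
                new Entry(3, bits(1, 2), "OCN"));   // passes screen, fails verification
        List<Integer> hits = search(bits(1, 2), "CO", table, (q, s) -> s.contains(q));
        System.out.println(hits); // [1]
    }
}
```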

12
Structure Cache
  • Contains Fingerprints for screening and ChemAxon
    Extended SMILES for ABAS
  • Instant access to the structures for the search
    process
  • Reduced load on the database server
  • Incremental update ensures minimum overhead after
    changes in the table
  • Small memory footprint due to
  • SMILES compression
  • Optimized storage technique
  • Approximately 100MB of memory is needed for 1
    million typical drug-like structures (using the
    default 512-bit fingerprints)
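
A back-of-the-envelope check of the memory figure, as a sketch only. It assumes that "100MB" means 10^8 bytes and that everything beyond the raw fingerprint bits is compressed SMILES plus bookkeeping; neither assumption comes from the slides, and the class name is hypothetical.

```java
// Hypothetical arithmetic helper, not part of the JChem API.
final class CacheMemoryEstimate {

    // Raw storage for the fingerprints alone, in bytes.
    static long fingerprintBytes(long structures, int fingerprintBits) {
        return structures * (fingerprintBits / 8);
    }

    public static void main(String[] args) {
        long structures = 1_000_000L;
        long totalBytes = 100_000_000L; // "approximately 100MB", read as 10^8 bytes (assumption)

        long fpBytes = fingerprintBytes(structures, 512);
        long restPerStructure = (totalBytes - fpBytes) / structures;

        System.out.println(fpBytes);          // 64000000
        System.out.println(restPerStructure); // 36
    }
}
```

Under these assumptions the fingerprints account for about 64MB, leaving roughly 36 bytes per structure for the compressed SMILES and any overhead.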

13
Standardization
  • Default standardization includes
  • Hydrogen removal
  • Aromatization
  • Custom standardization can be set for each
    table by supplying an XML configuration file at
    table creation or in the Regenerate dialog of
    JChem Manager (jcman)

14
Custom Standardization Example
JChem Cartridge: http://www.chemaxon.com/JChem_Cartridge.ppt
15
Database search options
16
JSP example application
  • Open source, customizable
  • Features
  • Substructure, Superstructure, Exact and
    Similarity search
  • Molecular Descriptor similarity search with
    descriptor coloring
  • Substructure hit alignment and coloring, inverse
    hit list
  • Chemical Terms filter
  • Import / Export
  • Export of hits
  • Insert / Modify / Delete structures

17
API example: connecting to a database
ConnectionHandler ch = new chemaxon.jchem.db.ConnectionHandler();
ch.setDriver("oracle.jdbc.driver.OracleDriver");
ch.setUrl("jdbc:oracle:thin:@localhost:1521:mydb");
ch.setPropertyTable("JChemProperties");
ch.setLoginName("scott");
ch.setPassword("tiger");
ch.connect();
// the java.sql.Connection object is available if needed
Connection con = ch.getConnection();
// closing the connection
ch.close();
18
API example: database import
Importer importer = new chemaxon.jchem.db.Importer();
importer.setConnectionHandler(conh);
importer.setInput("sample.sdf");
// importer.setInput(is); // alternatively a stream can also be specified
importer.setTableName("SCOTT.STRUCTURES");
importer.setHaltOnError(false);
importer.setDuplicateImportAllowed(false); // can filter duplicates
// specifying SDFile field - table field pairs
String fieldPairs = "DB_Field1=SDF_Field1;DB_Field2=SDF_Field2";
importer.setFieldConnections(fieldPairs);
int importedCount = importer.importMols();
System.out.println("Imported " + importedCount + " structures");
19
API example: database export
Exporter exporter = new chemaxon.jchem.db.Exporter();
exporter.setConnectionHandler(conh);
exporter.setTableName("structures");
// data fields to be exported with the structure
exporter.setFieldList("cd_id cd_formula name comments");
String fileName = "output.sdf";
OutputStream os = new FileOutputStream(fileName);
exporter.setOutputStream(os);
exporter.setFormat("sdf");
int exportedCount = exporter.writeAll();
System.out.println("Exported " + exportedCount + " structures");
20
API example: database search
JChemSearch searcher = new chemaxon.jchem.db.JChemSearch();
searcher.setConnectionHandler(ch);
searcher.setSearchType(JChemSearch.SUBSTRUCTURE);
searcher.setQueryStructure("c1ccccc1");
searcher.setStructureTable("SCOTT.STRUCTURES");
// a query that returns cd_id values can be used for prefiltering
searcher.setFilterQuery("SELECT cd_id FROM structures, biodata"
    + " WHERE structures.cd_id = biodata.cd_id"
    + " AND biodata.toxicity [...]");
searcher.setWaitingForResult(true); // otherwise runs in a separate thread
searcher.run();
// getting the results as cd_id values
int[] results = searcher.getResults();
21
API example: inserting a structure
// ConnectionHandler, mode, table name and data field names
UpdateHandler uh = new chemaxon.jchem.db.UpdateHandler(
    ch, UpdateHandler.INSERT, "structures", "comment, stock");
uh.setStructure("c1ccccc1"); // the structure
// specifying data field values
uh.setValueForAdditionalColumn(1, "some text");
uh.setValueForAdditionalColumn(2, new Double(8.5));
uh.setDuplicateFiltering(true); // filtering duplicate structures
int id = uh.execute(true); // getting back the cd_id of the inserted structure
if (id > 0)
    System.out.println("Inserted, cd_id value: " + id);
else
    System.out.println("Already exists with cd_id value: " + (-id));
// storing update information, the database connection remains open
uh.close();
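
The return value in the example above carries two pieces of information in one int: its sign says whether the structure was inserted, and its magnitude is the cd_id. A small sketch of decoding it, assuming exactly that convention as read from the slide's output messages; the InsertOutcome class is hypothetical and not part of the JChem API.

```java
// Hypothetical wrapper around the sign-encoded result of an insert.
final class InsertOutcome {
    final boolean inserted; // true if a new row was created
    final int cdId;         // cd_id of the new or the already existing row

    private InsertOutcome(boolean inserted, int cdId) {
        this.inserted = inserted;
        this.cdId = cdId;
    }

    // Positive value: inserted with that cd_id.
    // Otherwise: duplicate, the existing cd_id is the negated value.
    static InsertOutcome decode(int returned) {
        return returned > 0
                ? new InsertOutcome(true, returned)
                : new InsertOutcome(false, -returned);
    }

    public static void main(String[] args) {
        InsertOutcome fresh = decode(1234);
        InsertOutcome duplicate = decode(-57);
        System.out.println(fresh.inserted + " " + fresh.cdId);         // true 1234
        System.out.println(duplicate.inserted + " " + duplicate.cdId); // false 57
    }
}
```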
22
Performance (1)
  • Compound registration
  • Substructure search in a table of 3 million
    compounds
  • JChem Base 3.2, Dual Xeon 3GHz, 2GB RAM, Oracle
    9.2.0.7.0

23
Performance (2)
Similarity search, Tanimoto 0.9
JChem Base 3.2, Dual Xeon 3GHz, 2GB RAM, Oracle
9.2.0.7.0
24
Future plans
  • Additional layer: JChem Server (later also as
    grid)
  • Tables for storing query structures
  • Tables for storing general (Markush) structures
  • Partial clean option for hit alignment

25
Summary
  • ChemAxon's JChem Base API provides sophisticated
    high-performance tools for the developer to deal
    with chemical structures and associated data.
  • Building on the JChem API is convenient,
    because
  • Our various tools integrate seamlessly
  • Both high and low level API classes are available
  • Responsive developer-to-developer support

26
Links
  • JChem home page
  • www.jchem.com
  • Live demos
  • www.jchem.com/examples
  • API documentation
  • www.jchem.com/doc/api
  • Brochure
  • www.chemaxon.com/brochures/JChemBase.pdf

27
Visit other technical presentations
  • MarvinSketch/View: http://www.chemaxon.com/MarvinSketch_View.ppt
  • MarvinSpace: http://www.chemaxon.com/MarvinSpace.ppt
  • Calculator Plugins: http://www.chemaxon.com/Calculator_Plugins.ppt
  • JChem Base: http://www.chemaxon.com/JChem_Base.ppt
  • JChem Cartridge: http://www.chemaxon.com/JChem_Cartridge.ppt
  • Standardizer: http://www.chemaxon.com/Standardizer.ppt
  • Screen: http://www.chemaxon.com/Screen.ppt
  • JKlustor: http://www.chemaxon.com/JKlustor.ppt
  • Fragmenter: http://www.chemaxon.com/Fragmenter.ppt
  • Reactor: http://www.chemaxon.com/Reactor.ppt