Analyzing cDNA microarray data using Python and the C clustering library: Why scripts are better than GUIs

1
Analyzing cDNA microarray data using Python and
the C clustering library: Why scripts are better
than GUIs
  • Michiel de Hoon, Seiya Imoto, Satoru Miyano
  • Human Genome Center, University of Tokyo

The third Bioinformatics Open Source Conference,
BOSC 2002, August 1-2, 2002, Edmonton, Canada
2
Scripting languages are already heavily used in
bioinformatics
  • Bioperl: http://www.bioperl.org/
  • Biopython: http://www.biopython.org/
  • Bioruby: http://www.bioruby.org/
  • G-language: http://www.g-language.org/ (uses
    Perl)

However, numerical analysis of cDNA microarray
data is still dominated by GUI-based codes
Why?
Because excellent GUI-based codes are available
for gene expression data analysis (such as
Cluster/TreeView by Michael Eisen, and
GeneCluster by Pablo Tamayo)
3
What scripting languages can do for you (Perl,
Python, Ruby, Tcl, ...)
  • Easier to write, less prone to bugs, ideal for
    developing new algorithms
      • Avoid checking your algorithm and chasing pointer
        errors at the same time
  • More flexible than GUIs
  • Allow batch processing
  • Can run on any platform (Windows, Cygwin,
    Macintosh, Unix)
      • Including supercomputers!
  • Often compiler-independent (unlike GUI-based
    code)
  • Makes open source software development easier
  • A large number of people have contributed to
    scripting languages, so software packages are
    often already available
      • File handling
      • Text parsing
      • Graphics
      • Numerics, data structures, algorithms, random
        number generators
  • Beats writing C/Fortran code from scratch
  • One script can contain a complete data analysis
      • Downloading data from a database, file parsing,
        numerical data analysis, drawing figures
      • Invaluable for replication (see the example on
        our website)
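
As an illustration of a complete analysis in one script, here is a minimal sketch in Python. It is not the example from the authors' website: the gene names, the table layout (one gene per row, one expression ratio per array, empty cells for missing values), and the function names are all assumptions made for this sketch, and the data is embedded in the script instead of being downloaded.

```python
import csv
import io
import math
import statistics

# Hypothetical tab-delimited table: gene name, then one expression ratio per array.
# An empty cell marks a missing measurement. In a real script this text would be
# downloaded from a database or read from a file.
RAW_DATA = """\
GENE\tarray1\tarray2\tarray3
YAL001C\t1.0\t2.0\t4.0
YAL002W\t0.5\t0.25\t
YAL003W\t1.0\t1.0\t1.0
"""

def parse_expression_table(text):
    """File parsing: return {gene: [log2 ratio, or None if missing, ...]}."""
    reader = csv.reader(io.StringIO(text), delimiter="\t")
    next(reader)  # skip the header line
    table = {}
    for row in reader:
        if not row:
            continue  # ignore blank lines
        gene, cells = row[0], row[1:]
        table[gene] = [math.log2(float(c)) if c.strip() else None for c in cells]
    return table

def mean_log_ratio(values):
    """Numerical analysis: mean of the measurements that are present."""
    present = [v for v in values if v is not None]
    return statistics.fmean(present) if present else None

def analyze(text):
    """Run the whole analysis and return {gene: mean log2 ratio}."""
    table = parse_expression_table(text)
    return {gene: mean_log_ratio(values) for gene, values in table.items()}

if __name__ == "__main__":
    # Reporting: one line per gene; a plotting step could be added here.
    for gene, mean in analyze(RAW_DATA).items():
        print(gene, "NA" if mean is None else f"{mean:+.2f}", sep="\t")
```

Because every step lives in one file, rerunning the script reproduces the result exactly, and wrapping the call to analyze in a loop over several input files gives batch processing.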

4
Scripting languages make code development easier
  • Write your new algorithm in Python
  • Test it
  • Improve the algorithm (repeat the test and
    improve steps as needed)
  • Implement the numerically intensive routines in
    C
      • which can be called from Python
      • thus combining the speed of C with the
        flexibility of Python
  • If needed, the C routines can be used in other
    programs as well
      • so scripting languages can make development
        of GUI-based codes easier too
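
The workflow above can be sketched as follows. This is only an illustration under stated assumptions: the function names are hypothetical, and the standard library's math.dist (Python 3.8+, implemented in C in CPython) stands in for a hand-written C routine, since a real C extension would need a compiler and build step.

```python
import math

def euclidean_prototype(x, y):
    """Step 1: pure-Python prototype, easy to read, test, and change."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def euclidean_fast(x, y):
    """Later step: same interface, but the inner loop runs in C.

    math.dist is used here as a stand-in for a custom C routine.
    """
    return math.dist(tuple(x), tuple(y))

def nearest_center(point, centers, distance=euclidean_prototype):
    """Index of the closest center; the distance routine is pluggable."""
    return min(range(len(centers)), key=lambda i: distance(point, centers[i]))

# "Test it": the fast routine must agree with the prototype before it replaces it.
point = [1.0, 2.0, 3.0]
centers = [[0.0, 0.0, 0.0], [1.0, 2.0, 2.5], [5.0, 5.0, 5.0]]
for center in centers:
    assert math.isclose(euclidean_prototype(point, center),
                        euclidean_fast(point, center))

print(nearest_center(point, centers))                           # prototype
print(nearest_center(point, centers, distance=euclidean_fast))  # C-backed
```

Keeping the prototype around as a reference implementation makes it cheap to check the compiled routine every time the algorithm changes.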
5
An example: The C clustering library
  • The C clustering library contains routines for
    commonly used clustering methods
      • hierarchical clustering: pairwise single,
        maximum, centroid, and average linkage
      • k-means clustering
      • self-organizing maps on a 2D rectangular grid
      • principal component analysis
  • The C clustering library can be used in three
    ways
      • by calling routines in the library from other
        programs
      • as an extension module for Python
      • through the improved version of the GUI-code
        Cluster/TreeView
        (which calls routines in the C clustering
        library)
  • All three are available from our website.

The C clustering library as a Python extension
module has been compiled successfully on Windows,
Linux, and Unix (SGI-Cray Origin2000) systems
using GNU's gcc. No commercial compiler is needed,
even for Cluster/TreeView. The library was
released under the GNU Lesser General Public
License.
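
To make one of the clustering methods listed on this slide concrete, here is a minimal pure-Python sketch of pairwise single-linkage hierarchical clustering. It is not the library's code or its Python interface: the function names, the Euclidean distance, and the toy expression profiles are assumptions chosen for illustration, and the naive search over cluster pairs is far slower than a C implementation would be.

```python
import math

def euclidean(x, y):
    """Euclidean distance between two expression profiles."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def single_linkage(rows, nclusters):
    """Merge the two closest clusters until nclusters remain.

    Single linkage: the distance between two clusters is the distance
    between their closest pair of members. Returns a list of clusters,
    each a sorted list of row indices.
    """
    if not 1 <= nclusters <= len(rows):
        raise ValueError("nclusters must be between 1 and the number of rows")
    clusters = [[i] for i in range(len(rows))]
    while len(clusters) > nclusters:
        best = None  # (distance, index a, index b) of the closest pair so far
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = min(euclidean(rows[i], rows[j])
                        for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = sorted(clusters[a] + clusters[b])
        del clusters[b]  # b > a, so index a is unaffected
    return clusters

# Hypothetical profiles: rows 0, 1, 4 are similar, as are rows 2 and 3.
profiles = [
    [0.0, 0.1, 0.2],
    [0.1, 0.0, 0.3],
    [2.0, 2.1, 1.9],
    [2.2, 2.0, 2.1],
    [0.05, 0.2, 0.1],
]
print(single_linkage(profiles, 2))  # [[0, 1, 4], [2, 3]]
```

Maximum and average linkage differ only in how the distance between two clusters is computed, so in this sketch they would amount to replacing min with max or with a mean over the member pairs.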
6
At http://bonsai.ims.u-tokyo.ac.jp, follow the
link to
7
At our poster, you will find more examples of
using the C clustering library
  • Downloading, analyzing, and clustering of gene
    expression data with Python
  • An implementation in Python of a bootstrap
    calculation of hierarchical clustering
  • Cluster/TreeView 3.0

How to find us: Go to http://bonsai.ims.u-tokyo.ac.jp,
click on