Dr. Rosemary Renaut, renaut@asu.edu Director, Computational Biosciences - PowerPoint PPT Presentation

Dr. Rosemary Renaut, renaut_at_asu.edu Director, Computational Biosciences http://math.asu.edu/~cbs/ 02/23/2006 * ATLANTA – PowerPoint PPT presentation

Slides: 37
Transcript and Presenter's Notes

1
  • Dr. Rosemary Renaut, renaut@asu.edu, Director,
    Computational Biosciences
  • http://math.asu.edu/~cbs/

02/23/2006
1
ATLANTA
2
  • A Professional Science Masters Program
  • Mathematics and Statistics
  • The School of Life Sciences
  • Computer Science and Engineering
  • W. P. Carey School of Business

3
  • OUTLINE
  • THE CBS PROGRAM AT ASU: OVERVIEW
  • CBS CURRICULUM
  • REQUIREMENTS
  • SOME HISTORY
  • FUTURE
  • PROJECTS: WHAT DO THEY INVOLVE?
  • OUR CASE STUDIES COURSE(S)
  • INTRODUCING BASIC MATHEMATICS TO LS/CSE STUDENTS

4
5
CORE REQUIREMENTS (30 hours)
  • Scientific Computing for Biosciences (4)
  • Case Studies/Projects in Biosciences (4)
  • Structural and Molecular Biology (4)
  • Statistics and Experimental Design (6)
  • Business Practice and Ethics (6)
  • Internship and Applied Project (6)

6
ELECTIVE TRACKS (12 hours)
  • Genomics/Proteomics
  • Data Mining/Databases
  • Medical Imaging
  • Molecular/Functional Genomics
  • Microarray Analysis
  • Individualized

7
PRE-REQUISITES
  • Calculus and Differential Equations
  • Basic Statistics (junior)
  • Discrete Algorithms and Data Structures
  • Programming skills (C/Java)
  • Cell biology, genetics (junior level)
  • Organic and Bio Chemistry (junior)
  • Motivation, creativity, determination!

8
  • Interdisciplinary Training/Team Work
  • Internship/Applied Project Report
  • Business, Management and Ethics
  • (Health Services Administration MBA)
  • Small Groups/Close Faculty Involvement
  • Computer Laboratory
  • Extensive Project work/Consulting

9
  • Internship Requirements
  1. The internship is at least 400 hours, possibly full
    time over the summer
  2. The student must write a project report in the required
    format
  3. The student presents the report to a committee in an oral exam
  4. International students can work off campus using the
    EIP program
  5. Students are encouraged to seek projects outside AZ

10
DATA
  • Year 4: 74 students in total, currently 30
  • Graduates: 33 (11 left without graduating)
  • Internships: NIH, ASU, Tgen, AZ Game and Fish,
    US Water Conservation Lab, AZ Biodesign
  • Jobs: Tgen, ASU, Codon Solutions, medical
    record keeping, Matlab, St. Jude (Memphis), Walt
    Disney!, Cisco, Google (shortly arriving in
    Tempe!!), Ingenuity,
    AZ Game and Fish
  • PhD programs (10): Biology, Computer Science,
    Biochemistry (France, UK and ASU)

11
OTHER DEVELOPMENTS
  • Undergraduate NIH MARC
  • Calculus for Life Sciences (sophomore)
  • Quantitative Skills (sophomore)
  • Modeling Comp Bio (Junior)
  • PhD Program Computational Biosciences
  • Molecular Cellular Biology / Mathematics

12
WHAT DO WE DO: SPRING 2004
  • Database Construction/Mining of Pathology
    Specimens (Tgen)
  • Gegenbauer high resolution reconstruction for
    MRI (ASU)
  • TLS-SVM for Feature Extraction of Microarray
    Data (ASU)
  • Automated video analysis for cell behavior (Tgen)
  • EST DB for Marine Dinoflagellate Crypthecodinium
    cohnii (ASU)
  • Data mining for microsatellites in ESTs from
    Arabidopsis thaliana and Brassica species (US
    Water Conservation Laboratory)
  • The Genome Assembler (Tgen)
  • A user interface to support navigation for
    scientific discovery (ASU)
  • Cell Migration Software Tool (Tgen)

13
WHAT DO WE DO: SPRING 2005
  • Evaluation of Bioinformatics Resources
    (Tgen/NIH/ASU)
  • Pattern recognition: Automated Cytoskeleton
    Reconstruction (ASU)
  • Develop a workable database on the crop Lesquerella
    using Integrated Crop Information Systems (ICIS)
    (US Water Conservation Laboratory)
  • Investigation of Xylella fastidiosa Within an
    Almond Tree Population: A Model System for Golden
    Death (ASU Mathecology, AZ)
  • Search for Epigenetic Properties of DNA and RNA
    involved in X Chromosome Inactivation (Codon
    Solutions LLC)

14
  • WHAT DID WE NEED FOR THESE PROJECTS?
  • Image Analysis
  • Data Mining
  • Fourier Analysis
  • Modeling
  • Differential Equations
  • Sequence Comparisons
  • Mathematics for Genetic Analysis
  • Statistics
  • Database development for BIOLOGICAL APPLICATIONS
  • Geographic Information Systems
  • PERL/BIOPERL/MATLAB/MYSQL
15
  • Bioinformatics: Managing Scientific Data tackles
    this challenge head-on by discussing the current
    approaches and variety of systems available to
    help bioinformaticians with this increasingly
    complex issue. The heart of the book lies in the
    collaboration efforts of eight distinct
    bioinformatics teams that describe their own
    unique approaches to data integration and
    interoperability. Each system receives its own
    chapter where the lead contributors provide
    precious insight into the specific problems being
    addressed by the system, why the particular
    architecture was chosen, and details on the
    system's strengths and weaknesses. In closing,
    the editors provide important criteria for
    evaluating these systems that bioinformatics
    professionals will find valuable.

16
  • Computational Modeling Skills Motivated by Case
    Studies
  • Phylogenetics and Tree Building (for the data
    make the tree)

               Human(A)     Chimp(B)     Gorilla(C)   Orang-Utan(D)  Gibbon(E)
Human(A)       -            .0919        .1082        .1790          .2057
Chimp(B)       .0919/.0821  -            .1134        .1940          .2168
Gorilla(C)     .1057/.1083  .1161/.1330  -            .1882          .2170
Orang-Utan(D)  .1806/.1838  .1910/.1838  .1895/.1838  -              .2172
Gibbon(E)      .2067/.2142  .2171/.2142  .2156/.2142  .2172/.2142    -
17
All additive trees with 5 branches: which is the
correct one?
18
Repeat for all trees. Use Matlab. Understand:
  • Least Squares
  • Nonnegative constraints
  • Constrained LS
  • Exhaustive search
  • Genetic Algorithms
For this tree we can calculate the patristic
distances between sequences, e.g. $p_{BD} = e_2 + e_6 + e_7 + e_4$.
This should match the distance from the measured
data. We do a goodness of fit for all distances:
$\|p - d\|^2 = \|Ae - d\|^2$. What is A, what is e? Any
conditions on e?
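The least-squares fit asked for on this slide can be sketched in a few lines. This is only an illustration, not the course's Matlab code: the tree shape and the edge numbering are assumptions chosen so that the path from B to D uses e2, e6, e7 and e4, and the distances are the upper-triangle values of the primate table on the earlier slide. Each row of A marks the edges on the path between one pair of leaves, and the unconstrained fit solves the normal equations; the final check only reports whether the nonnegativity condition on e happens to hold.

```python
# Least-squares branch lengths for one 5-leaf additive tree (pure Python sketch).
# ASSUMED topology and labelling: A and B hang off one internal node, C off the
# middle node, D and E off the third. e1..e5 are the leaf edges, e6 and e7 the
# two internal edges, so that p_BD = e2 + e6 + e7 + e4.

LEAVES = "ABCDE"

# Measured distances (upper triangle of the primate table).
D = {("A", "B"): 0.0919, ("A", "C"): 0.1082, ("A", "D"): 0.1790, ("A", "E"): 0.2057,
     ("B", "C"): 0.1134, ("B", "D"): 0.1940, ("B", "E"): 0.2168,
     ("C", "D"): 0.1882, ("C", "E"): 0.2170, ("D", "E"): 0.2172}

# Position of each leaf's attachment node along the chain of internal nodes.
NODE = {"A": 0, "B": 0, "C": 1, "D": 2, "E": 2}

def path_row(x, y):
    """Row of A: 1 for every edge on the path between leaves x and y."""
    row = [0.0] * 7
    row[LEAVES.index(x)] = 1.0
    row[LEAVES.index(y)] = 1.0
    lo, hi = sorted((NODE[x], NODE[y]))
    for k in range(lo, hi):          # k = 0 is e6, k = 1 is e7
        row[5 + k] = 1.0
    return row

def solve(M, rhs):
    """Solve M x = rhs by Gaussian elimination with partial pivoting."""
    n = len(M)
    aug = [M[i][:] + [rhs[i]] for i in range(n)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[piv] = aug[piv], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= f * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = aug[i][n] - sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / aug[i][i]
    return x

pairs = sorted(D)
A = [path_row(x, y) for x, y in pairs]
d = [D[p] for p in pairs]

# Normal equations: A^T A e = A^T d
AtA = [[sum(A[k][i] * A[k][j] for k in range(10)) for j in range(7)] for i in range(7)]
Atd = [sum(A[k][i] * d[k] for k in range(10)) for i in range(7)]
e = solve(AtA, Atd)

residual = sum((sum(A[k][j] * e[j] for j in range(7)) - d[k]) ** 2 for k in range(10))
print("branch lengths e1..e7:", [round(v, 4) for v in e])
print("sum of squared residuals:", residual)
print("all nonnegative:", all(v >= 0 for v in e))
```

If some entry of e came out negative for another tree shape, the unconstrained solution would not be a realistic tree, which is where a constrained (nonnegative) least-squares solver would be needed instead.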
19
  • Computational Modeling Skills Motivated by Case
    Studies
  • Phylogenetics and Tree Building (for the data
    make the tree)

               Human(A)  Chimp(B)  Gorilla(C)  Orang-Utan(D)  Gibbon(E)
Human(A)       -         79        92          144            162
Chimp(B)       79        -         95          154            169
Gorilla(C)     92        96.7      -           150            169
Orang-Utan(D)  149.3     154       152.1       -              169
Gibbon(E)      166.2     170.9     169         169            -
20
An ultrametric tree: what are the distances
$e_i$? Solve the linear programming problem
$\min L(e) = \min \sum_i e_i$, where this is the total length
of the tree. Moreover each length is positive,
and the total lengths are preserved, e.g. $e_1 = e_2$ and
$e_4 + e_8 = e_1 + e_6 + e_7$.
LP problem with constraints: $\max c^T x$ with $Ax \le b$, $x \ge 0$.
Students identify x, c, b, A? Use Matlab
linprog.
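One possible way to write the ultrametric problem in LP form is sketched below. Several pieces are assumptions rather than content from the slide: the rooted tree is taken to have 8 edges, only the two equalities quoted on the slide are shown (the remaining equal-depth conditions would be added in the same way), and the data enter through the assumed requirement that every path length $p_{ij}(e) = (Pe)_{ij}$ is at least the measured distance $d_{ij}$, where $P$ is the path matrix of the rooted tree.

```latex
\begin{aligned}
\min_{e \in \mathbb{R}^{8}} \quad & L(e) = \sum_{i=1}^{8} e_i = c^{T} e,
      \qquad c = (1, \dots, 1)^{T} \\
\text{subject to} \quad & e_1 - e_2 = 0, \\
 & e_4 + e_8 - e_1 - e_6 - e_7 = 0, \\
 & -P e \le -d \qquad (\text{assumed: } p_{ij}(e) \ge d_{ij}), \\
 & e \ge 0 .
\end{aligned}
```

Matlab's linprog minimizes $c^{T}x$, so this form can be passed directly with $x = e$; a problem stated as a maximization is handled by minimizing $-c^{T}x$.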
21
BUT THERE ARE MANY DIFFERENT TREE SHAPES, AND
WHICH IS CORRECT? WE NEED EXHAUSTIVE
SEARCH? GENETIC ALGORITHMS?
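To give a feel for the size of an exhaustive search, the sketch below enumerates every unrooted binary tree shape on five labelled leaves by inserting one leaf at a time into each edge of every smaller tree. The function names and the edge-list representation are illustrative choices, not part of the course material. For five leaves there are 15 shapes, so fitting branch lengths to each one is feasible; the count grows very quickly with more leaves, which is one motivation for heuristic searches such as genetic algorithms.

```python
# Enumerate all unrooted binary tree shapes on labelled leaves (illustrative sketch).

def all_unrooted_trees(leaves):
    """Return a list of trees; each tree is a list of edges (node, node).
    Leaves are strings, internal nodes are integers."""
    a, b, c = leaves[:3]
    trees = [[(0, a), (0, b), (0, c)]]        # the single 3-leaf star tree
    next_internal = 1
    for leaf in leaves[3:]:
        grown = []
        for tree in trees:
            for i, (u, v) in enumerate(tree):
                # Split edge (u, v) with a new internal node and hang the leaf on it.
                new = tree[:i] + tree[i + 1:]
                new += [(u, next_internal), (next_internal, v), (next_internal, leaf)]
                grown.append(new)
        trees = grown
        next_internal += 1
    return trees

def splits(tree, leaves):
    """Canonical description of a shape: the leaf set cut off by each internal edge."""
    result = set()
    for u, v in tree:
        if isinstance(u, str) or isinstance(v, str):
            continue                           # leaf edges do not distinguish shapes
        adj = {}
        for x, y in tree:
            if (x, y) != (u, v):               # remove the edge being examined
                adj.setdefault(x, []).append(y)
                adj.setdefault(y, []).append(x)
        seen, stack = {v}, [v]
        while stack:
            n = stack.pop()
            for m in adj.get(n, []):
                if m not in seen:
                    seen.add(m)
                    stack.append(m)
        side = frozenset(x for x in seen if isinstance(x, str))
        if leaves[0] in side:                  # always report the side without the first leaf
            side = frozenset(leaves) - side
        result.add(side)
    return frozenset(result)

leaves = list("ABCDE")
trees = all_unrooted_trees(leaves)
shapes = {splits(t, leaves) for t in trees}
print(len(trees), "trees,", len(shapes), "distinct shapes")
```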
22
HOW WAS THIS USEFUL?
  • Introduction to:
  • data fitting
  • optimization, genetic algorithms, exhaustive
    search
  • Matlab routines
  • Realistic solutions (positive branch lengths)
  • Start on some multivariable calculus to derive
    normal equations
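The multivariable calculus step mentioned in the last bullet is short. Using the symbols A, e and d from the goodness-of-fit slide, and assuming $A^{T}A$ is invertible, setting the gradient of the squared misfit to zero gives the normal equations:

```latex
\begin{aligned}
F(e) &= \|Ae - d\|^{2} = e^{T}A^{T}A\,e - 2\,d^{T}A\,e + d^{T}d, \\
\nabla F(e) &= 2A^{T}A\,e - 2A^{T}d = 0
\quad\Longrightarrow\quad A^{T}A\,e = A^{T}d .
\end{aligned}
```

This gives the unconstrained minimizer only; it says nothing about whether the resulting branch lengths are positive.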

OTHER APPLICATIONS USING SIMILAR TECHNIQUES
  • Neural networks for classification: how do they learn?
  • Data mining: k-means clustering, minimize energy
  • Gradient Descent
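As a small illustration of "k-means clustering, minimize energy", here is a minimal sketch of Lloyd's algorithm on made-up 2-D points. The data, the function names and the deterministic starting centers are all assumptions for the example. The energy is the sum of squared distances from each point to its assigned center, and it never increases from one iteration to the next.

```python
# k-means (Lloyd's algorithm) on toy 2-D data; illustrative sketch only.

def dist2(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(points, k, iterations=20):
    """Return (centers, labels, energies) after a fixed number of iterations."""
    centers = [points[i] for i in range(k)]       # simple deterministic start
    energies = []
    for _ in range(iterations):
        # Assignment step: each point joins its nearest center.
        labels = [min(range(k), key=lambda j: dist2(p, centers[j])) for p in points]
        # Update step: each center moves to the mean of its members.
        for j in range(k):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:
                centers[j] = tuple(sum(c) / len(members) for c in zip(*members))
        # Energy: total squared distance of points to their centers.
        energies.append(sum(dist2(p, centers[l]) for p, l in zip(points, labels)))
    return centers, labels, energies

# Hypothetical data: two well-separated groups of three points.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centers, labels, energies = kmeans(points, 2)
print(labels)
print(energies[:3])
```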
23
Clustering has recently been demonstrated to be
an important preprocessing step prior to
parametric estimation from dynamic PET images.
Clustering, as a form of segmentation, is useful
in improving the accuracy of voxel level
quantification in PET images. Classical
clustering algorithms such as hierarchical
clustering and K-means clustering can be applied
to dynamic PET data using an appropriate
weighting technique. New variants of hierarchical
clustering with different preprocessing criteria
were developed by Dr. Guo recently. Our research
focus is to validate these different algorithms
with respect to their efficiency and accuracy.
Different inter- and intra-cluster measures and
statistical tests are considered to assess the
quality of the different cluster results.
24
(No Transcript)
25
(No Transcript)
26
(No Transcript)
27
Otolith Aging and Analysis
William T. Stewart
Advisors: Dr. Rosemary Renaut, Dr. Paul Marsh, Arizona State
University; Scott Byan, Kirk Young, Marianne Meding, Arizona
Game and Fish Department
  • Otoliths, also known as earstones, are paired
    calcified structures used for balance and hearing
    in teleost fish. An otolith is acellular and
    metabolically inert, providing biologists with a
    record of exposure to both the temperature and
    composition of the ambient water. Otoliths
    provide an abundance of information ranging from
    temperature history, detection of anadromy,
    determination of migration pathways, stock
    identification, use as a natural tag, and most
    importantly age validation. Growth rings
    (annuli) on the otolith record the age and growth
    of a fish from birth to death. With the use of
    Matlab the goal of this project is to design a
    program that uses digital otolith images to
    semi-automate the aging process.  There are three
    main components to this task.

28
(No Transcript)
29
New technology allows hundreds of pathology
specimens from human diseases to be sampled as
0.6 mm punches of tissue that are arrayed into new
TMA paraffin blocks; these blocks are then
sectioned with microtomes to produce hundreds of
slides containing hundreds of human tissue
specimens (tissue microarrays, TMAs). Databases
to support analysis of these high throughput TMAs
will include information on diagnosis, treatment,
disease response, and multiple images from
follow-on studies linked to the coordinates of
each of the hundreds of punches on the TMA. Data
mining from the results of TMA experiments will
allow text mining and image feature extraction.
In this project, we present the requirements,
design, and a prototype of a web based TMA
database application.
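The kind of linkage described above, with specimen annotations and follow-on images tied to the coordinates of each punch on a block, can be sketched as a small relational schema. Everything in this example is hypothetical: the table and column names, the sample values and the use of SQLite are illustrative choices, not the design of the project's actual web application.

```python
# Hypothetical TMA schema sketch using SQLite from the Python standard library.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE specimen (
    specimen_id INTEGER PRIMARY KEY,
    diagnosis   TEXT,
    treatment   TEXT,
    response    TEXT
);
CREATE TABLE punch (
    punch_id    INTEGER PRIMARY KEY,
    block_id    TEXT NOT NULL,          -- which TMA paraffin block
    row_idx     INTEGER NOT NULL,       -- punch coordinates on the block
    col_idx     INTEGER NOT NULL,
    specimen_id INTEGER NOT NULL REFERENCES specimen(specimen_id),
    UNIQUE (block_id, row_idx, col_idx)
);
CREATE TABLE image (
    image_id INTEGER PRIMARY KEY,
    punch_id INTEGER NOT NULL REFERENCES punch(punch_id),
    study    TEXT,                      -- follow-on study that produced the image
    path     TEXT
);
""")

# Made-up sample rows.
conn.execute("INSERT INTO specimen VALUES (1, 'example diagnosis', "
             "'example treatment', 'example response')")
conn.execute("INSERT INTO punch VALUES (1, 'TMA-001', 3, 7, 1)")
conn.executemany(
    "INSERT INTO image (punch_id, study, path) VALUES (?, ?, ?)",
    [(1, "stain-1", "images/tma001_r3_c7_a.png"),
     (1, "stain-2", "images/tma001_r3_c7_b.png")],
)

# Look up everything known about the punch at one coordinate.
rows = conn.execute("""
    SELECT s.diagnosis, i.study, i.path
    FROM punch p
    JOIN specimen s ON s.specimen_id = p.specimen_id
    JOIN image i ON i.punch_id = p.punch_id
    WHERE p.block_id = ? AND p.row_idx = ? AND p.col_idx = ?
    ORDER BY i.image_id
""", ("TMA-001", 3, 7)).fetchall()
for row in rows:
    print(row)
```

Keeping the punch coordinates in their own table lets several images from different studies point at the same physical punch.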
30
(No Transcript)
31
(No Transcript)
32
(No Transcript)
33
Sequencing a Microbial Genome
Maulik Shah
Advisors: Dr. Jeffrey Touchman, Dr. Rosemary Renaut,
Dr. Phillip Stafford
  • Although many genomes are available for download
    today, the underlying technologies should not be
    taken for granted. By using shotgun sequencing
    techniques and a gauntlet of informatics, we are
    able to produce high-quality DNA sequence. We
    will first look at some of the robotics and
    chemistries of preparing DNA as samples for the
    sequencing instruments. Then we will look at the
    series of applications used in taking raw data
    signals, converting them to sequence and then
    finally assembling the data into a single genome.
    Highlighted will be some of the techniques used
    to speed the informatics processes as well as
    some of the challenges that informatics faces in
    processing data and assembling the genome.
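The final assembly step can be illustrated with a toy greedy overlap merge. This is a deliberately simplified stand-in and not the pipeline used in the project: it assumes error-free reads, ignores reverse complements and quality scores, and the reads and function names are invented for the example.

```python
# Toy greedy assembly of error-free reads by repeated best-overlap merging.

def overlap(a, b, min_len=3):
    """Length of the longest suffix of a that equals a prefix of b (0 if < min_len)."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0

def greedy_assemble(reads, min_len=3):
    """Merge the pair of reads with the largest overlap until none remain."""
    reads = list(reads)
    while len(reads) > 1:
        best = (0, None, None)
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best[0]:
                        best = (n, i, j)
        n, i, j = best
        if n == 0:
            break                               # no overlaps left; contigs stay separate
        merged = reads[i] + reads[j][n:]
        reads = [r for k, r in enumerate(reads) if k not in (i, j)] + [merged]
    return reads

# Hypothetical reads sampled from the made-up sequence ATGCGTACGTTAGC.
reads = ["ATGCGTAC", "GTACGTTA", "CGTTAGC"]
contigs = greedy_assemble(reads)
print(contigs)
```

Greedy merging can go wrong when the sequence contains long repeats, which is one of the challenges real assemblers have to address.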

34
Supertree Analysis of the Plant Family Fabaceae
Tiffany J. Morris
Advisor: Martin F. Wojciechowski, School of Life Sciences,
Arizona State University
The Tree-of-Life is a national and
international project to collect information
about the origin, evolution, and diversity of
organisms, with the goal of producing a tree of
all life on Earth (Pennisi, 2003). The obstacles
to achieving this goal are many, ranging from questions
related to the kinds and number of data to be
used, to building that phylogeny, to the
methodological and computational resources
required to analyze the massive amounts of data
expected to be necessary to bring this to
fruition. The goal of this project is to obtain a
Supertree for the plant family Fabaceae utilizing
phylogenetic trees found in previously published
studies.
35
PROTEIN INTERACTION MAPPING: USE OF OSPREY TO MAP
SURVIVAL OF MOTOR NEURON PROTEIN INTERACTIONS
by Margaret Barnhart
Advisor: Dr. Ron Nieman
  • Spinal Muscular Atrophy is one of the leading
    genetic causes of death in infants. In humans,
    the disease state is characterized by homozygous
    deletion of the telomeric copy of the survival of
    motor neuron gene (SMN1). The centromeric copy,
    SMN2, rescues lethality by producing a small
    amount of full-length SMN protein as its minor
    product. The SMN gene was first characterized in
    1995, and research efforts to describe the
    molecular mechanisms of SMN protein in the cell
    have since revealed a highly complex set of
    functions and interactions for SMN. The large
    amount of protein-protein interaction data
    collected for SMN exceeds the limitations imposed
    by current methods of interaction
    visualization. Osprey allows a network
representation of protein-protein interactions
and has been used to describe the recorded sets
of interactions of SMN. This method of
interaction visualization allows relationships to
be drawn between the functions of SMN and
analogous proteins, clustering of interactions
based on level of interaction or function, and
ultimately, the derivation of clues to the
critical function of SMN.
36
  • For more information, please contact renaut@asu.edu
  • More information on projects: www.asu.edu/compbiosci