MB280 Introduction to Bioinformatics (3 credits): Genes, machines and you

Provided by: drmarcell
1
MB280 Introduction to Bioinformatics (3 credits)
Genes, machines and you. Learn the basics of
analyzing DNA and protein sequences using
state-of-the-art computer software.
Note: This is an experimental class for graduate
students with limited space for undergraduates.
Last year's class incorporating both student
groups was highly successful.
MB535-Lab Genomic Analysis (4 credits)
Learn the basics of bioinformatics: database
searching, sequence analysis, phylogenetic
reconstruction and in silico research design.
Take advantage of biological databases that will
inform and direct your future research agenda.
  • Tuesday lecture 2:00-3:15
  • Thursday laboratory 2:00-5:00
  • INBRE Bioinformatics Lab
  • Cooley Lab B2
  • Professor Marcie McClure
  • mars_at_parvati.msu.montana.edu

2
What was covered last lecture?
A little bit about the machines. General
concepts about how hardware and software
interact. Basic definitions.
3
  • Tentative syllabus.
  • 8/28/07 1) Class orientation, Introduction to
    computers and operating languages
  • 8/30/07 1st Lab Assignment: know the machines
    your monitor is accessing/using your own computer
  • i. Know how to access the Internet; send
    me an email.
  • ii. Do the Unix/Linux tutorials.
  • iii. Do a web search for the terms
    bioinformatics and computational biology.
  • iv. Create a list of sites and types of
    methods necessary to do bioinformatics and
    computational biology.
  • 9/4/07 2) What is Bioinformatics/Computational
    Biology?
  • 9/6/07 2nd Lab Assignment
  • i. Loading software on your own
    machine.
  • ii. The NCBI/EBI/PR sites, familiarize yourself
    with them.
  • iii. Do a conceptual translation of a nucleic
    acid sequence.
  • iv. Choose a gene family to work on for the rest
    of the semester.
  • 9/11/07 3) Database searching and pairwise
    alignments
  • 9/13/07 3rd Lab Assignment
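
The syllabus lists pairwise alignments as the third topic but does not name a method. As a rough illustration only, here is a minimal sketch of one classic approach, Needleman-Wunsch global alignment. The choice of algorithm, the function name and the scoring values (match, mismatch, gap) are assumptions made for this example and are not taken from the course material.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-2):
    """Global alignment of strings a and b with a linear gap penalty.

    Returns (score, aligned_a, aligned_b). Scoring values are illustrative.
    """
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]

    # First column and first row: aligning a prefix against nothing costs gaps.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap

    def sub(i, j):
        # Substitution score for aligning a[i-1] with b[j-1].
        return match if a[i - 1] == b[j - 1] else mismatch

    # Fill the matrix: each cell is the best of diagonal, up, or left moves.
    for i in range(1, rows):
        for j in range(1, cols):
            score[i][j] = max(
                score[i - 1][j - 1] + sub(i, j),
                score[i - 1][j] + gap,
                score[i][j - 1] + gap,
            )

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = len(a), len(b)
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return score[len(a)][len(b)], "".join(reversed(out_a)), "".join(reversed(out_b))


if __name__ == "__main__":
    s, top, bottom = needleman_wunsch("ACGT", "AGT")
    print(s)
    print(top)
    print(bottom)
```

Real database search tools use faster heuristics and biologically derived substitution matrices; the sketch is only meant to show what "aligning two sequences" means at the level of a scoring matrix and a traceback.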

4
Computational Biology is biology that cannot be
done without the intensive use of computers.
There are many domains in Computational Biology:
  • Ecology
  • Evolutionary Biology
  • Structural Biology
  • Bioinformatics
  • Physiology
McClure, 2000
5
What's in a name?
Encodeome, Expressome, Interactome, Mobilome, Retrome
Oming is very sexy!
Genomics--DNA/RNA
Transcriptomics--RNA
Proteomics--Proteins
Phenomics--Proteins
Operomics--NA/Proteins
Biological Informatics, In silico research,
Computational Biology, Bioinformatics
Phrase of the month!
McClure, 2000
6
What is Bioinformatics/Computational Biology?
These terms are used to describe technical and
methodological approaches to studying biology
that are computer based. The goal of this
research is the creation of new knowledge, or
meta-data, from existing primary data. This
type of research takes place in silico and
includes the development and testing of the
software tools necessary to analyze biological
data.
McClure, 2000
7
Multidisciplinary Nature of
Bioinformatics/Computational Biology
Biological Sciences
How much of which does who need to learn?
Computer/Systems Science
Mathematics, Statistics
McClure, 2002
8
Opportunities in Bioinformatics/Computational
Biology
Training
Ph.D.
M.A.
B.A.
Service Providers, Research Staff, Principal
Investigators
Industry
Government
Academia
McClure, 2002
9
Bioinformatics/Computational Biology
New Knowledge
Evolution
Structure
Function
McClure, 2000
10
The practice of Bioinformatics/Computational
Biology is an interplay between knowledge of
empirically derived data, bioinformatic tools
and human decision making. Exactly which
information and tools are to be accessed
depends on the nature of the question of
interest.
McClure, 2000
11
  • What is this class?
  • Bioinformatics/Computational Biology
  • 1) A brief intro to computing concepts
  • 2) Basic concepts
  • a) databases
  • b) search
  • c) align
  • d) structure/function methods
  • 3) How to use Bioinformatic methods
  • 4) Designing in silico experiments
  • 5) Interpretation of results

12
Central Dogma of Bioinformatics
  • i. databases
  • ii. searches
  • iii. sequence annotation/gene ontologies
  • iv. alignments
  • v. phylogenetics
  • vi. functional predictions
  • vii. structural predictions
  • gene expression pathways and microarray data
  • modeling
  • math and algorithms
  • programming

13
McClure, 2000
14
Why is critical reading of the literature
important? Why is reading the software manual
important?
McClure, 2000
15
Levels of Analysis of Primary Structure Data
COMPLETE GENOME (global relationships)
  1) universal versus unique genes
  2) consensus phylogenetic relationship
  3) genome architecture (deviation from tree-like behavior)
INDIVIDUAL GENES (local relationships)
  1) congruency of phylogenies for individual genes
  2) relative rates of change
  3) recombination
  4) gene architecture
  5) gene product structure and interactions
INTRAGENIC (sub-local relationships)
  1) rate of change
  2) recombination
  3) motif analysis
McClure, 2000
16
Primary Structure: the Sequence
Sequence Alignment
Phylogenetics: >70% id N.A., <70% id A.A.
2-D and 3-D Predictions
OSM: <30% id A.A.
function
evolution
structure
McClure, 2001
17
(No Transcript)
18
From McClure, 1991
19
Ordered Series of Motifs for the Reverse Transcriptase

Name       I          II         III       IV          V          VI
LINE       ILIPKPGRD  LMNIDAKIL  TGTRQGCP  SLFADDMIVY  RIKYLGIQL  PCSWVGRIN
LHERV      WPVQKTDGS  YAAIDLANA  TVLPQGYI  VHYIDDIMLI  SVKFLGSSG  HISYLGVLF
EHERV      LPVPKPGTK  FTCLDLKDA  TQLPQRFK  LQYVDDLLLG  QVCYLGFTI  VREFLGAVG
FHERV      ILPIKKPDG  FSVLDFKDF  TILHQGFR  LQHEDDLLLC  KVSYLGLII  LLSFLGLVG
WHERV      LGVQKPNRQ  FTVLDLQDA  TILPQGFR  SVGVDDLLLA  SQQYLGLKL  LRGFLGVIG
FRDHERV    ILTVKKTNG  FSVLDFKNF  TVLPQGFR  LQYMDDLLIC  AIQYLGIIM  FAFLGITR
SHERV      WPVRKPDGT  HFVVDLANA  TMLPQGYV  FHYIDDIMIL  SAKLLGVIW  FVGFLGYQ
RHERV      NLSGKKQYP  FTVLDLKDA  TVLPQGFK  LQYVDDLLIS  TIEYLGFLL  LKGFLGMAG
T47DHERV   ILPVKKSDG  FTVIDLKVD  TVLPQGFT  LQYMDDLLIS  EVKYLGHLI  LRKFLGLVT
KHERV      FVIQKKSGK  LIIIDLKDC  KVLPQGML  IHCIDDILCA  PFHYLGMQI  FQKLLGDIN
IHERV      ILPVKKSDG  FTVIDLKDA  TVLPQGFM  LQYVDDILIS  KVKYLGRLI  LRKFLGLVG
HHERV      LPVQKPDKS  YSVLDLKDG  TVLPQGFR  IQYIDELLLC  SVTYLGIIL  LLSFLGMVG
FMuLV      LPVKKPGTN  YTVLDLKDA  TRLPQGFK  LQYVDDLLLA  QVKYLGYLL  LREFLGTAG
HTLV1      FPVKKANGT  LQTIDLKDA  RVLPQGFK  LQYMDDILLA  TIKFLGQII  LQALLGEIQ
SRV2       FVIKKKSGK  KIVIDLKDC  KVLPQGMA  IHYMDDILIA  PYTYLGFQI  FQKLLGDIN
Snakehead  WPVGKPDGS  YSSLDISNG  TRLPQGFH  LQYVDDILLM  QVQYLGVNV  LRSALGLFN
Spuma      YPVPKPDGR  KTTLDLANG  TRLPQGFL  QVYVDDIYLS  TVEFLGFNI  LQSILGLLN
FIV        FAIKKKSGK  VTVLDIGDA  CSLPQGWI  YQYMDDIYIG  PYTWMGYEL  LQKLAGKIN
HIV1       FAIKKKDST  VTVLDVGDA  NVLPQGWK  YQYMDDLYVG  PFLWMGYEL  IQKLVGKLN
Dirs       FTVPKPGTN  MVKLDIKKA  KTMPFGLS  IAYLDDLLIV  SITFLGLQI  PRKLAGLKG
Gypsy      VLVPKKDGT  FTTLDLHSG  TVMPFGLV  NVYLDDILIF  ETEFLGYSI  AQRFLGMIN
Caulimo    KRRGKKRMV  FSSFDCKSG  NVVPFGLK  CVYVDDILVF  KINFLGLEI  LQRFLGILT
Badna      EVAQKPRIV  FSKFDLKAG  NVCPFGIA  LLYIDDILIA  EVEYLGVEI  LQAYLGLLN
HBV        FLVDKNPHN  WLSLDVSAA  RKIPMGVG  FSYMDDVVLG  SLNFMGYVI  IVGLLGFAA
Copia      WTITKRPEN  KYQIDYEET  MRLPQGIS  LLYVDDVVIA  IKHFIGIRI  CRSLIGCLM
Intron     VGGEKGPYS  TGRIDDQEN  GLTPKTEF  VRYADDLLLG  TVEFPGMVI  KFRNLGNSI
Retron     TVEKKGPEK  ILNIDLEDF  NLLPQGAP  TRYADDLTLS  QRKVTGLVI  HHIFCGKSS
PMAUP      VYIPKANGK  FPSVDLAYL  NGVPQGAS  IMYADDGILC  SVKFLGLEF  YIQVLGYLP
Archaea    IEIPKKSGG  LLEFDIKGL  KGTPQGGV  ERYADDSVIH  KFDFLGYTF  WVNYYGLFY
HTERT      RFIPKPDGL  FVKVDVTGA  QGIPQGSI  LRLVDDFLLV  EDEALGGTA  RRKLFGVLR
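
Percent identity ("id" on the earlier slide) can be computed directly from rows of this motif table. The sketch below is illustrative only: the function name and the convention of skipping columns where both sequences have a gap are assumptions for this example, and the HIV1 and FIV motifs are copied from the rows above. Because the motifs are the most conserved blocks of the protein, identity measured over them alone can be expected to be higher than identity over the full-length sequences.

```python
def percent_identity(seq_a, seq_b):
    """Percent of aligned columns in which two equal-length sequences match.

    Columns where both sequences have a gap ('-') are ignored; a gap opposite
    a residue counts as a mismatch. This convention is one of several in use.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must already be aligned to equal length")
    compared = 0
    matches = 0
    for x, y in zip(seq_a, seq_b):
        if x == "-" and y == "-":
            continue
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


# Motifs I to VI for two rows of the table, concatenated in order.
HIV1 = ["FAIKKKDST", "VTVLDVGDA", "NVLPQGWK", "YQYMDDLYVG", "PFLWMGYEL", "IQKLVGKLN"]
FIV = ["FAIKKKSGK", "VTVLDIGDA", "CSLPQGWI", "YQYMDDIYIG", "PYTWMGYEL", "LQKLAGKIN"]

pid = percent_identity("".join(HIV1), "".join(FIV))
print(f"HIV1 vs FIV identity over the six motifs: {pid:.1f}%")
```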
20
  • Structure predictions
  • Primary sequence
  • Secondary
  • Tertiary
  • Fold prediction
  • Homology modeling
  • Disorder prediction
  • Interaction predictions
  • Among and between proteins
  • Expression predictions

21
Predicting Interactions of the
Replication/Transcription Complex
Multiple Alignment
Experimental data regarding interactions of L, N
and P
N, P and L sequences
Evolutionary Dynamics Analysis
Predict regions of disorder
Inter-CM analysis
Phylogenetic reconstruction
ESF-analysis
Intra-CM analysis
Integration of Heterogeneous Data Sources in a
Bayesian Framework
Most Probable Amino Acid Contact Points
22
Search Databases: Sequence, Literature,
Structural, Other??
Data: Retrieve, Annotate, Manage
Determine Methodological Limitations
Analyze Data:
  • Multiple Alignment of Sequences
  • OSM/MIR Determination
  • 2D and 3D Modeling
  • Phylogenetic Reconstruction
  • Gene and Genome Architecture
  • Structural Determination
McClure, 2001
23
A) Experimental Types
  1) analytical
    a) datamining
    b) functional determination
    c) structural determination
    d) hypothesis generation and testing
  2) technical
    a) algorithm/software development
    b) comparative testing of methods
B) Data acquisition +/- a priori information
  1) search methods and limitation
  2) databases
C) Experimental Design
  1) literature/experimental knowledge
  2) data maskers/biological knowledge
  3) variables
  4) controls
D) Running programs and collecting the resulting data
  1) program limitations
  2) data manipulation
E) Analyzing data
  1) numerical data
  2) qualitative data
F) Data presentation
McClure, 2001
24
  • 9/6/07
  • 2nd Lab Assignment
  • i. Loading software on your own machine.
  • ii. The NCBI/EBI/PR sites, familiarize yourself with
    them.
  • iii. Do a conceptual translation of a nucleic acid
    sequence.
  • iv. Choose a gene family to work on for the rest
    of the semester.
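
A conceptual translation reads a nucleic acid sequence three bases at a time and looks up each codon in the genetic code. The following is a minimal sketch, assuming the standard genetic code and a plain DNA or RNA string as input; the function name, the use of "X" for unrecognized codons and the example sequence are made up for illustration and are not part of the assignment.

```python
# Standard genetic code, built from the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(sequence, frame=0):
    """Translate a nucleic acid string in reading frame 0, 1 or 2.

    RNA input is accepted (U is treated as T). '*' marks a stop codon and
    'X' marks any codon containing a character other than A, C, G or T.
    Trailing bases that do not fill a codon are ignored.
    """
    seq = sequence.upper().replace("U", "T")
    protein = []
    for pos in range(frame, len(seq) - 2, 3):
        protein.append(CODON_TABLE.get(seq[pos:pos + 3], "X"))
    return "".join(protein)


if __name__ == "__main__":
    example = "ATGGCCATTGTAATGGGCCGCTGAAAGGGTGCCCGATAG"  # made-up sequence
    for frame in range(3):
        print(frame, translate(example, frame))
```

Translating all three forward frames, and the three frames of the reverse complement, is the usual way to look for open reading frames when the correct frame is not known in advance.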