Introduction to Computational Biosciences and Bioinformatics

1
Introduction to Computational Biosciences and
Bioinformatics
  • Alex Ropelewski
  • ropelews_at_psc.edu
  • Pittsburgh Supercomputing Center
  • National Resource for Biomedical Supercomputing
  • http://staff.psc.edu/ropelews/jsu/Begin_CS_Jackson_State_Intro_Computational_BioScience.ppt
  • http://compbio.jsums.edu/awareness/week1.html

2
Computational Biosciences
  • The application of
  • computer science, engineering,
  • physical science and mathematics
  • to the way in which plants, animals and humans
    function

3
Computational Bioscience Fields
  • Bioinformatics
  • Structural biology
  • Genetic databases
  • Quantitative ecology
  • Physiological modeling
  • Medical informatics
  • Image processing and visualization
  • Medical imaging
  • Biomedical instrumentation
  • Biomathematics
  • Neuroscience
  • Telemedicine
  • Biomedical engineering
  • Other related areas

4
Bioinformatics
  • The interdisciplinary science of using
    computational approaches to analyze, classify,
    collect, represent and store biological data with
    the goal of accelerating and enhancing the
    understanding of DNA, RNA and Protein sequences.

5
Structural Biology
The branch of the sciences concerned with the
molecular structure of biological macromolecules
such as proteins and nucleic acids, how they
acquire the structures they have, and how
alterations in their structures affect their
function.
6
Physiological Modeling
The study of the mechanical, physical, and
biochemical functions of living organisms through
the use and creation of mathematical models of
physiological systems. Examples include models of
components of organisms, such as particular
organs or cell systems.
7
Image Processing and Visualization
The science of organizing, displaying, and
analyzing image data taken from any living
organism in a realistic life-like manner.
8
Computational Neuroscience and Signal Processing
Applying mathematical and computational methods
to understand the signaling, control and other
networks in living organisms
9
Who Employs Computational Bioscientists?
  • Pharmaceuticals and Biotechnology (Bayer,
    Schering-Plough, Amgen, Merck, Eli Lilly, etc.)
  • Hospitals (particularly research hospitals)
  • Agriculture (Monsanto, Pioneer, etc.)
  • Academia (particularly research
    universities/institutes)
  • Government
  • NIH (many institutes including NLM, NCBI, NCI,
    CDC)
  • DOE (National labs)
  • Department of Defense (including Army Corps of
    Engineers)
  • Agriculture, Veterans Affairs, NSF
  • Government Contractors (such as Computercraft,
    SRA)

10
Computational Biosciences Job Growth
Engineers, Life and Physical Scientists and
Related Occupations. Occupational Outlook
Handbook, 2008-09 Edition. Department of Labor,
Bureau of Labor Statistics
11
Computational Biosciences Salaries
National Occupational Employment and Wage
Estimates. Department of Labor, Bureau of Labor
Statistics, May 2007
12
Computational Biosciences
  • Interdisciplinary skills are required
  • Knowledge is required in the following areas
  • Biology
  • Chemistry
  • Computer Science
  • Mathematics
  • Statistics
  • Physics
  • Engineering

13
Computational Biosciences Required Skill Sets
  • Agricultural and food scientists need the
    ability to apply statistical techniques, and the
    ability to use computers to analyze data and to
    control biological and chemical processing.
  • Biological scientists usually study allied
    disciplines such as mathematics, physics,
    engineering and computer science. Computer
    courses are beneficial for modeling and
    simulating biological processes, operating some
    laboratory equipment and performing research in
    the emerging field of bioinformatics.
  • Computer skills are essential for prospective
    environmental scientists and hydrologists.
    Students who have some experience with computer
    modeling, data analysis and integration, digital
    mapping, remote sensing and Geographic
    Information Systems will be the most prepared to
    enter the job market.
  • Medical scientists: in addition to required
    courses in chemistry and biology, undergraduates
    should study allied disciplines such as
    mathematics, engineering, physics, and computer
    science.

Engineers, Life and Physical Scientists and
Related Occupations. Occupational Outlook
Handbook, 2008-09 Edition. Department of Labor,
Bureau of Labor Statistics
14
Computational Biosciences Required Skill Sets
  • Developments in the field of Chemistry that
    involve life sciences will expand, resulting in
    more interaction among biologists, engineers,
    computer specialists and chemists. Chemistry
    majors usually study biological sciences,
    mathematics, physics and, increasingly, computer
    science. Computer courses are essential because
    employers prefer job applicants who are able to
    apply computer skills to modeling and simulation
    tasks and operate computerized laboratory
    equipment. This is increasingly important as
    combinatorial chemistry and advanced screening
    techniques are more widely applied. Courses in
    statistics are useful because chemists need the
    ability to apply basic statistical techniques.
    Chemists should experience employment growth in
    pharmaceutical and biotechnology research as
    recent advances in genetics open new avenues of
    treatment for diseases. Job growth for chemists
    is expected to be strongest in pharmaceutical and
    biotechnology firms.

Engineers, Life and Physical Scientists and
Related Occupations. Occupational Outlook
Handbook, 2008-09 Edition. Department of Labor,
Bureau of Labor Statistics
15
Bioinformatics
  • The interdisciplinary science of using
    computational approaches to analyze, classify,
    collect, represent and store biological data with
    the goal of accelerating and enhancing the
    understanding of DNA, RNA and Protein sequences.

16
What is a Sequence?
  • A sequence is a way to represent a protein, DNA,
    or RNA molecule as a character string.

Phospholipase A2 - Bos taurus (Bovine).
MRLLVLAALLTVGAGQAGLNSRALWQFNGMIKCKIPSSEPLLDFNNYGCY
CGLGGSGTPVDDLDRCCQTHDNCYKQAKKLDSCKVLVDNPYTNNYSYSC
SNNEITCSSENNACEAFICNCDRNAAICFSKVPYNKEHKNLDKKNC
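
Because a sequence is just a character string, ordinary string operations already answer simple questions about it. The sketch below is illustrative only: it uses the first 50 residues of the phospholipase A2 sequence above, and the names `pla2_fragment` and `composition` are made up for this example.

```python
from collections import Counter

# First 50 residues of the bovine phospholipase A2 sequence shown above.
pla2_fragment = "MRLLVLAALLTVGAGQAGLNSRALWQFNGMIKCKIPSSEPLLDFNNYGCY"


def composition(sequence):
    """Count how often each one-letter code occurs in a sequence string."""
    return Counter(sequence)


length = len(pla2_fragment)                       # number of residues
first_ten = pla2_fragment[:10]                    # slicing gives a sub-sequence
leucine_count = composition(pla2_fragment)["L"]   # occurrences of leucine (L)

print(length, first_ten, leucine_count)
```

The same operations work unchanged on DNA or RNA strings, since only the alphabet differs.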
17
Molecular Alphabet
  • DNA/RNA Sequences: Letters represent side chains
    or bases
  • A - Adenine
  • C - Cytosine
  • G - Guanine
  • T - Thymine (DNA)
  • U - Uracil (RNA)
  • X or N (Unknown)

Image from Wikipedia Commons: http://en.wikipedia.org/wiki/File:DNA_chemical_structure.svg
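
The nucleotide alphabet above can be written down directly as a set of allowed characters. The following is a minimal sketch, not a standard library API: the names `DNA_ALPHABET`, `RNA_ALPHABET`, `is_valid` and `dna_to_rna` are assumptions made for illustration. It checks that a string uses only the listed letters and rewrites a DNA string in the RNA alphabet by replacing T with U.

```python
# Alphabets taken from the slide: four bases plus X or N for "unknown".
DNA_ALPHABET = set("ACGTNX")
RNA_ALPHABET = set("ACGUNX")


def is_valid(sequence, alphabet):
    """Return True if every character of the sequence is in the alphabet."""
    return set(sequence.upper()) <= alphabet


def dna_to_rna(dna):
    """Rewrite a DNA string in the RNA alphabet (thymine T becomes uracil U)."""
    return dna.upper().replace("T", "U")


example = "GATTACA"
print(is_valid(example, DNA_ALPHABET))   # uses only DNA letters
print(is_valid(example, RNA_ALPHABET))   # contains T, so not valid RNA
print(dna_to_rna(example))
```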
18
Molecular Alphabet
  • Protein Sequences: Letters represent amino
    acids
  • A - Alanine
  • R - Arginine
  • N - Asparagine
  • D - Aspartic acid
  • C - Cysteine
  • E - Glutamic acid
  • Q - Glutamine
  • G - Glycine
  • H - Histidine
  • I - Isoleucine
  • L - Leucine
  • K - Lysine
  • M - Methionine
  • F - Phenylalanine
  • P - Proline
  • S - Serine
  • T - Threonine
  • W - Tryptophan
  • Y - Tyrosine
  • V - Valine

[Figure: oxytocin structure with one-letter residue labels (C, Y, I, Q, N, C, P, L, G)]
Image from Wikipedia Commons: http://en.wikipedia.org/wiki/File:Oxytocin.jpg
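
The one-letter protein alphabet is naturally stored as a lookup table. The sketch below assumes the 20 standard amino acids (the codes listed above, plus V for valine) and spells out oxytocin, whose sequence is CYIQNCPLG. The names `AMINO_ACIDS` and `spell_out` are hypothetical.

```python
# One-letter code -> amino acid name for the 20 standard amino acids.
AMINO_ACIDS = {
    "A": "Alanine", "R": "Arginine", "N": "Asparagine", "D": "Aspartic acid",
    "C": "Cysteine", "E": "Glutamic acid", "Q": "Glutamine", "G": "Glycine",
    "H": "Histidine", "I": "Isoleucine", "L": "Leucine", "K": "Lysine",
    "M": "Methionine", "F": "Phenylalanine", "P": "Proline", "S": "Serine",
    "T": "Threonine", "W": "Tryptophan", "Y": "Tyrosine", "V": "Valine",
}


def spell_out(protein):
    """Translate a one-letter protein string into a list of residue names."""
    return [AMINO_ACIDS[letter] for letter in protein.upper()]


oxytocin = "CYIQNCPLG"
for position, name in enumerate(spell_out(oxytocin), start=1):
    print(position, name)
```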
19
What is an Information Library?
  • A compilation of prior experimental knowledge
    about biologically relevant molecules into a
    computer system.
  • The power of bioinformatics lies in the ability to
    leverage and apply this prior experimental
    knowledge to additional biological problems.
  • In order to effectively search prior experimental
    knowledge, the prior experimental knowledge must
    be organized in a way that makes sense from both
    a computer science perspective and a biological
    point of view.

20
How is Information Organized?
  • From a computer-science perspective, there are
    several ways that data can be organized and
    stored
  • In a relational database
  • In a flat file
  • In a networked (hyperlinked) model
  • From a biologist's perspective, there are also
    several different ways that data can be organized
  • Sequence
  • Structure
  • Family/Domain
  • Species
  • Taxonomy
  • Function/Pathway
  • Disease/Variation
  • Publication/Journal
  • And many other ways
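
To make the flat-file and relational options concrete, here is a small sketch using Python's built-in `sqlite3` module. Everything in it is an assumption for illustration: the two records are toy data (the first is the start of the phospholipase A2 sequence shown earlier, the second is invented), the FASTA-like header layout is simplified, and the table and column names are made up. It parses a flat file into records, loads them into a relational table, and then queries by species, one of the biological organizing principles listed above.

```python
import sqlite3

# Toy flat file: a ">" header line (identifier, then species) followed by sequence lines.
FLAT_FILE = """>toy1 Bos taurus
MRLLVLAALL
>toy2 Homo sapiens
MKTAYIAKQR
"""


def parse_flat_file(text):
    """Turn the flat-file text into a list of record dictionaries."""
    records = []
    for line in text.splitlines():
        if line.startswith(">"):
            identifier, species = line[1:].split(" ", 1)
            records.append({"id": identifier, "species": species, "sequence": ""})
        elif line.strip():
            records[-1]["sequence"] += line.strip()
    return records


def load_into_database(records):
    """Store the records in an in-memory relational table and return the connection."""
    connection = sqlite3.connect(":memory:")
    connection.execute(
        "CREATE TABLE sequences (id TEXT PRIMARY KEY, species TEXT, sequence TEXT)"
    )
    connection.executemany(
        "INSERT INTO sequences VALUES (:id, :species, :sequence)", records
    )
    return connection


records = parse_flat_file(FLAT_FILE)
connection = load_into_database(records)
bovine_ids = connection.execute(
    "SELECT id FROM sequences WHERE species = ?", ("Bos taurus",)
).fetchall()
print(bovine_ids)
```

The flat file is easy to read and exchange, while the relational table makes it straightforward to search by any column.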

21
Representing Biological Data
  • Sequence Libraries
  • Character based
  • Classification Libraries (Aligned sets of
    sequences)
  • Ambiguous consensus patterns
  • Weight Matrix
  • Position Specific Scoring Matrix (Profile)
  • Hidden Markov Models
  • Structural Libraries
  • X,Y,Z coordinates for each alpha carbon atom
  • Taxonomy
  • Tree structure represents the taxonomic lineage
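
As a rough illustration of how an aligned set of sequences becomes a consensus pattern and a weight matrix, the sketch below counts letters per alignment column. It is a simplification under stated assumptions: the three-sequence alignment is toy data, the matrix holds plain counts and frequencies rather than the log-odds scores used in real position specific scoring matrices, and all function names are hypothetical.

```python
from collections import Counter


def position_counts(aligned_sequences):
    """Return one Counter per alignment column (a simple count matrix)."""
    return [Counter(column) for column in zip(*aligned_sequences)]


def consensus(aligned_sequences):
    """Pick the most frequent letter in each column."""
    return "".join(
        counts.most_common(1)[0][0] for counts in position_counts(aligned_sequences)
    )


def score(sequence, aligned_sequences):
    """Sum, over positions, the fraction of aligned sequences sharing the letter."""
    total = len(aligned_sequences)
    matrix = position_counts(aligned_sequences)
    return sum(counts[letter] / total for counts, letter in zip(matrix, sequence))


# Toy alignment of equal-length sequences (made up for this example).
toy_alignment = ["ACGT", "ACGA", "ACTT"]

print(consensus(toy_alignment))
print(score("ACGT", toy_alignment))
print(score("TTTT", toy_alignment))
```

A sequence that matches the family at well-conserved columns scores higher than one that does not, which is the basic idea behind profile-based classification libraries.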

22
What does a biologist do with this data?
  • Search for similar sequences (sequences that
    share a biological relationship)

23
What does a biologist do with this data?
  • Search for similar sequences (sequences that
    share a biological relationship)
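
One simple way to picture a similarity search is to score a query against every sequence in a library and rank the results. The sketch below uses a basic global alignment score (Needleman-Wunsch dynamic programming with match +1, mismatch -1, gap -1). These scoring values, the toy library and the function names are assumptions for illustration; production tools such as BLAST use faster heuristics and biologically derived scoring matrices.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-1):
    """Best global alignment score of strings a and b (Needleman-Wunsch, score only)."""
    # previous[j] holds the best score for aligning the processed prefix of a with b[:j].
    previous = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        current = [i * gap]
        for j in range(1, len(b) + 1):
            diagonal = previous[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            current.append(max(diagonal, previous[j] + gap, current[j - 1] + gap))
        previous = current
    return previous[-1]


def rank_library(query, library):
    """Return (name, score) pairs sorted from most to least similar to the query."""
    scored = [(name, global_alignment_score(query, seq)) for name, seq in library.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy library (made-up sequences).
toy_library = {"identical": "GATTACA", "one_deletion": "GATACA", "unrelated": "CCCCCCC"}

for name, value in rank_library("GATTACA", toy_library):
    print(name, value)
```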

24
What does a biologist do with this data?
  • Align groups of sequences that share a biological
    relationship (family)

25
What does a biologist do with this data?
  • Understand phylogenetic relationships of the
    family.

26
What does a biologist do with this data?
  • Understand key positions (residues) of the family.

27
What does a biologist do with this data?
  • Understand how key positions affect the structure
    and function of the molecule being studied

28
What does a biologist do with this data?
  • Use structural data for a molecule from one
    species to model a related molecule from another
    species.

29
Job Opportunities in Bioinformatics
  • This course will teach you many essential skills
    that are asked for in these job postings.
  • Let's look at actual job postings asking for
    bioinformatics expertise
  • Not all jobs will be labeled bioinformatics or
    sequence analysis; many are in a related
    computational bioscience field.
  • Specific skills required

30
Summer Internship-Computational Biology
  • Qualifications: To be eligible for a
    Computational Biology Summer Scientific
    Internship, students will have completed their
    undergraduate sophomore year (by June 2009)
  • Be majoring in a biological, chemistry or
    computer science program.
  • Candidates would have completed at least one
    programming course before the start of the
    internship.
  • All interns must have current authorization to
    work for any employer within the United States.
  • Experience with MATLAB, SQL, C and/or Perl
    is desired.

http://jobview.monster.com/getjob.aspx?JobID78206043JobTitleSummerInternship-ComputationalBiologyqcomputationalbiologycyuslid316re0pg1dv1AVSDM2008-12-18143a203a00seq2fseo1isjs1re1000
31
Bioinformatics Assembly Analyst
  • Responsibilities
  • assembling genome sequence data using a variety
    of tools and parameters and performing the
    experiments needed to evaluate sequencing
    strategies
  • using existing software and databases to analyze
    genomic data and correlating assemblies and
    sequences with a variety of genetic and physical
    maps and other biological information
  • identifying problems and serving as point of
    contact for various groups to propose and
    implement solutions
  • proposing and implementing upgrades to existing
    tools and processes to enhance analysis
    techniques and quality of results
  • developing and implementing scripts to
    manipulate, format, parse, analyze, and display
    genome sequence data and developing new
    strategies for analysis and presentation of
    results.
  • Requirements
  • a bachelor's degree in biology or related field
  • at least three years of experience in DNA
    sequencing and sequence analysis.
  • Must possess solid knowledge of sequencing
    software and public sequencing databases.
  • Knowledge of bioinformatics tools helpful.
  • http://sh.webhire.com/servlet/av/jd?ai631ji2285147snI

32
Bioinformatics Analyst
  • Responsibilities
  • The Bioinformatics Analyst will process sequence
    data and apply quality control measures for
    generating high quality raw sequence and
    assembled data from next generation sequencing
    technologies.
  • Will perform whole genome alignments using
    existing alignment tools, including BLAST, MUMmer
    and PatternHunter. Perform mapping and
    post-mapping analysis with short reads using
    third-party and internally developed tools.
  • Responsible for receiving, processing and
    managing sequence data.
  • Evaluate new methodologies and tools and improve
    data processing and quality control protocols.
  • Develop suitable metrics for reporting the
    completeness and quality of the sequence
    delivered to the customers.
  • Requirements
  • B.S. in biology, computer science,
    bioinformatics or related field, or equivalent
    combination of education and experience
  • A minimum of 2 years experience in genomics and
    bioinformatics-related work.
  • Proficiency in Unix and experience in one or more
    of these programming languages (Perl, SQL, Jython
    and Java) is required.
  • Familiar with the use of commonly-used sequence
    analysis tools and genomic databases
  • Willing to multi-task and respond to new
    challenges as required.
  • Excellent communication skills.
  • Hands-on experience in a research or production
    environment

http://jobview.monster.com/getjob.aspx?JobID78527133JobTitleBioinformaticsAnalystbrd1qbioinformaticscyuslid316re130AVSDM2009-01-09123a563a00pg1seq11fseo1isjs1re1000
33
Business Systems Analyst
  • Responsibilities
  • The ideal candidate should be a highly motivated
    team player with a strong understanding of
    informatics solutions to biology and chemistry,
    especially in the area of data
    visualization/statistical analysis and with proven
    record of building/integrating effective tools for
    scientists to help them in their daily work.
  • Actively work with scientists/computational
    biologists in a disease area to understand their
    needs
  • Define proper data analysis solution(s) to meet
    their scientific needs
  • Perform rapid prototyping to refine the
    requirements with proper documentation
  • Work with internal and external software teams,
    where appropriate to design/implement proper
    solutions to meet scientists' needs
  • Work either as a team member or lead a team to
    deliver data analysis platforms to
    scientists/computational biologists
  • Work effectively with different NITAS groups to
    ensure a globally consistent implementation
    scheme.
  • Requirements
  • Bachelor's degree in computer science, Biology,
    Bioinformatics or comparable qualification
  • At least 3-5 years hands-on experience on data
    analysis in a drug discovery, scientific or
    biotech environment
  • Strong communications and interpersonal skills
  • Proven capabilities interacting with scientists
    and being customer service oriented
  • Ability to work independently and/or as part of
    a team
  • Familiarity with scientific LIMS such as
    ActivityBase, and data visualization/analysis
    tools such as Spotfire
  • Solid understanding of relational databases and
    familiarity with Oracle and/or SQL server
  • Good understanding in fundamentals of software
    engineering.

34
Summary
  • Wide variety of jobs
  • Biology, especially molecular biology and
    genetics
  • Some statistics
  • Computer skills
  • UNIX
  • Bioinformatics Tools
  • Database (SQL)
  • Some Programming
  • Web
  • Bioinformatics can be a rewarding career path

35
National Resource for Biomedical Supercomputing