Title: Perl Programming: Developing Key Tools for Bioinformatics
1 Perl Programming: Developing Key Tools for Bioinformatics
An Informative Look Behind the Importance of Programming Skills and a Brief Tutorial on Getting Started With Perl and BioPerl
Andrew C. Rieser 4-1-04
2 Why Computer Skills Are Important
Computers are powerful devices for understanding any system that can be described in a mathematical way (Gibas, 2001).
- At one time, computer skills weren't important for biologists
- Mass quantities of resources on the Internet
- Numerous tools to manipulate and discover genomic info
- Programming skills can be very important
3 Problem?
- The rapid growth of GenBank and other online databases
- The different formats that these sequences are stored in
4 Who Recognizes This Problem?
Cynthia Gibas, an assistant professor in biology at Virginia Tech
James Tisdall, a consultant for Biocomputing Associates of Kimberton, PA, was one of the first people to use Perl in bioinformatics. He is also the developer of DNA WorkBench, a parallel-processing bioinformatics Perl program used worldwide.
5 WHY PROGRAM? WHY NOT USE THE READILY AVAILABLE TOOLS?
http://biowb.sdsc.edu/register.cgi
6 The majority of biological researchers are not programmers
- Biologists often feel that basic computing skills are all that they need to fulfill their everyday research tasks.
- In many cases, Tisdall observes, you can accomplish quite a bit using existing tools.
- What happens when you want to do something a preexisting tool doesn't do? What happens when you can't find a tool to accomplish a particular task, and you can't find someone to write it for you? (Tisdall, 2001).
7 BENEFITS OF PROGRAMMING
- Skill is in Strong Demand
- Makes You More Marketable as a Biologist
- Saves Time
- Easy Skill to Pick Up
The only chance biologists have of keeping up
with the job of analyzing data is by developing
libraries of reusable software tools.
Cynthia Gibas - 2001
8 PERL AND BIOPERL
Common Languages Relevant To Bioinformatics: C/C++, Python, and Perl
WHY PERL?
Perl has become a very popular bioinformatics programming language because of its suitability for rapid prototyping (the ability to quickly write a working program). Also, Perl is excellent for string manipulation, and bioinformatics deals mainly with large strings of genetic sequences and base pairs.
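To give a feel for the string manipulation mentioned above, here is a minimal sketch in core Perl. The subroutine names count_bases and gc_content and the sample sequence are made up for illustration; they are not part of Perl or BioPerl.

```perl
use strict;
use warnings;

# Hypothetical helper: count how often each base occurs in a DNA string.
sub count_bases {
    my ($dna) = @_;
    my %count = (A => 0, C => 0, G => 0, T => 0);
    foreach my $base (split //, uc $dna) {
        $count{$base}++ if exists $count{$base};   # ignore other characters
    }
    return %count;
}

# Hypothetical helper: fraction of the sequence made up of G and C.
sub gc_content {
    my ($dna) = @_;
    return 0 if length($dna) == 0;
    my $gc = ($dna =~ tr/GCgc//);   # tr with an empty replacement only counts
    return $gc / length($dna);
}

my $dna = 'ATGCGCTA';               # made-up example sequence
my %count = count_bases($dna);
print "$_: $count{$_}\n" for sort keys %count;
printf "GC content: %.2f\n", gc_content($dna);
```

Both helpers are only a few lines long, which is the kind of rapid prototyping the slide refers to.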
9 How to Use Perl and BioPerl
Perl is mainly used on Unix/Linux-based operating systems; however, you can install both Perl and BioPerl on any Windows-based operating system. I will be showing you how to install Perl/BioPerl on Win2k and XP systems, because this is what I installed it on, and I realized that there weren't any really good installation directions for Windows-based systems.
10 First Download Perl
- Visit http://www.activestate.com/Products/ActivePerl/?_x1
- Download ActiveState Perl (click to download)
- Run the Setup and Follow the Directions
- Verify that Perl was Installed
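One way to verify the installation is to type perl -v at the command line, which prints the installed version. A small script can report the same details; the sketch below uses only Perl's built-in variables, and the subroutine name perl_report is made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: build a short report about the running Perl interpreter.
sub perl_report {
    my $version = sprintf("%vd", $^V);   # dotted version number of this Perl
    return "Perl version: $version\n"
         . "Interpreter:  $^X\n"         # path of the perl executable
         . "Platform:     $^O\n";        # operating system name
}

print perl_report();
```

If this script runs and prints a version number, Perl is installed and reachable from the command line.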
11 This will set up Perl on your computer.
- It should set up a folder on your hard drive (typically C:\Perl) unless otherwise changed. This folder contains all the needed modules and libraries to run Perl on your system.
- Running Perl on Windows operating systems requires the use of MS-DOS, so get used to the command line, because this is what you will be using to run all your Perl scripts.
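To see where Perl actually looks for its modules and libraries on your system, you can print the built-in @INC list. This is a minimal sketch; the subroutine name library_dirs is made up for illustration, and the directories printed will depend on where Perl was installed.

```perl
use strict;
use warnings;

# Hypothetical helper: return the directories Perl searches for modules.
# @INC can also hold code references, so keep only plain strings.
sub library_dirs {
    return grep { !ref($_) } @INC;
}

print "$_\n" for library_dirs();
```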
12 TEST SCRIPT
All of your Perl scripts can be easily written in a basic text editor such as Notepad, then saved with the .pl ending. Make sure that the Save As Type is set to All Files.
13 Let's develop a simple Hello World Test Program
- In Notepad simply type: print "Hello World\n";
- Then save as test.pl
- Now let's test to see if the script worked.
1.) Open up the MS-DOS prompt and type cd\ HIT ENTER
2.) Type in: perl C:\windows\desktop\test.pl (or wherever you saved your test.pl) HIT ENTER
3.) It should print out Hello World on the screen! It's as easy as that!
- Now it will get a little bit more complicated. Next we will install BioPerl; this is what caused me the most trouble! However, once I figured it out, it was fairly simple.
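For reference, the Hello World test script from the steps above can be written out in slightly expanded form. This is a sketch, not the only way to write it: the use strict and use warnings lines and the greeting subroutine are additions that are not required for the one-line version.

```perl
use strict;     # catch misspelled variable names
use warnings;   # report common mistakes

# Hypothetical helper: the text to print, kept in one place.
sub greeting {
    return "Hello World";
}

print greeting(), "\n";
```

Saved as test.pl, this runs the same way as the one-line version: perl followed by the path to the file.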
14 BioPerl
Click here for the newest release: http://www.bioperl.org/Core/Latest/
- Download BioPerl and unpack it using WinZip (or another extracting tool). Extract to the Perl lib directory (typically C:\Perl\lib)
- Download Nmake and save it in the Perl lib directory
- Install BioPerl and its modules
Nmake download: http://download.microsoft.com/download/vc15/Patch/1.52/W95/EN-US/Nmake15.exe
For more info: http://www.bioperl.org/Core/windows-bioperl.html
15 Detailed Instructions
- 1. Open up the MS-DOS prompt and type cd C:\Perl\Lib HIT ENTER
- 2. Now type perl Makefile.PL (you will have to specify the directory) HIT ENTER
- 3. Type nmake ENTER
- 4. Next, type nmake test ENTER
- 5. Finally, type nmake install ENTER
- It's that easy. Now, to install other modules not included in the BioPerl package, follow the directions below using PPM!
16 To use PPM
- Just go to the DOS command line and type "PPM" (without the quotes). You will be at the PPM command prompt. (I should mention that you need to be connected to the Internet at this point).
- At the PPM prompt, enter "install YOURMODNAME"
- You will be prompted if you want to continue. At that point, PPM will connect to ActiveState and see if the module you requested is available in a pre-compiled form. If it is, it will install the module and you are all done! PPM is especially nice because it will even install other required modules for you.
- If you get an error message that the module was not found, then it's not available from ActiveState and you will have to find it elsewhere.
- NOW BIOPERL IS SUCCESSFULLY INSTALLED!
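A quick way to check whether a module really is installed is to try loading it. The sketch below is one hedged way to do that; the subroutine name module_available is made up for illustration. Bio::SeqIO is a BioPerl module and File::Temp ships with Perl itself, so the second line of output serves as a sanity check.

```perl
use strict;
use warnings;

# Hypothetical helper: return 1 if the named module can be loaded, 0 otherwise.
sub module_available {
    my ($module) = @_;
    (my $file = "$module.pm") =~ s{::}{/}g;   # Bio::SeqIO becomes Bio/SeqIO.pm
    return eval { require $file; 1 } ? 1 : 0;
}

foreach my $module ('Bio::SeqIO', 'File::Temp') {
    printf "%-12s %s\n", $module,
        module_available($module) ? 'installed' : 'not found';
}
```

If Bio::SeqIO is reported as not found, the BioPerl files are probably not in a directory that Perl searches for modules.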
17 Perl (BioPerl) Examples
- Easy to learn
- Rapid Prototyping
FASTA Format
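18 Fasta.pl: Only 9 Lines of Code
use strict;
use warnings;
# get_data, extract_data, and print_sequence are helper subroutines
# that are not shown on this slide.
my @file_data = ();
my $dna = '';
# Hardcoded location of the input file
@file_data = get_data("C:\\Perl\\bin\\BioInfo\\sample.dna.txt");
$dna = extract_data(@file_data);
print_sequence($dna, 25);
exit;
Hardcode Location: the path to sample.dna.txt is written directly into the script.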
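The slide calls get_data, extract_data, and print_sequence without showing how they are defined. The following is a minimal sketch of what such helpers could look like in core Perl; these bodies are assumptions for illustration, not the original code. The format_sequence helper and the sample FASTA text are also made up, and a temporary file stands in for the hardcoded path.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: read every line of a file into a list.
sub get_data {
    my ($filename) = @_;
    open(my $fh, '<', $filename) or die "Cannot open $filename: $!";
    my @lines = <$fh>;
    close $fh;
    return @lines;
}

# Hypothetical helper: drop FASTA header lines (starting with '>'),
# blank lines, and '#' comment lines, then join the rest into one string.
sub extract_data {
    my (@lines) = @_;
    my $sequence = '';
    foreach my $line (@lines) {
        next if $line =~ /^\s*$/;   # blank line
        next if $line =~ /^\s*#/;   # comment line
        next if $line =~ /^>/;      # FASTA header line
        $sequence .= $line;
    }
    $sequence =~ s/\s//g;           # remove newlines and other whitespace
    return $sequence;
}

# Hypothetical helper: break a sequence into lines of $length characters.
sub format_sequence {
    my ($sequence, $length) = @_;
    my $out = '';
    for (my $pos = 0; $pos < length($sequence); $pos += $length) {
        $out .= substr($sequence, $pos, $length) . "\n";
    }
    return $out;
}

# Hypothetical helper: print the formatted sequence.
sub print_sequence {
    my ($sequence, $length) = @_;
    print format_sequence($sequence, $length);
}

# Write a tiny made-up FASTA file so the sketch runs anywhere.
my ($tmp_fh, $tmp_name) = tempfile(UNLINK => 1);
print $tmp_fh ">sample made-up sequence\nACGTACGTAC\nGGTTAACC\n";
close $tmp_fh;

my @file_data = get_data($tmp_name);
my $dna = extract_data(@file_data);
print_sequence($dna, 5);
```

Passing the file name in as a parameter, rather than writing it into the script, makes the same helpers reusable for any FASTA file.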
19 Run Perl Script Revcomp.pl
http://jje.uchicago.edu/revcomp.pl
Reverse complementary strands of the FASTA-format sequence we just made!
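The idea behind a reverse complement can be sketched in a few lines of core Perl. This is not the revcomp.pl script linked above, only an illustration of the technique; the subroutine name reverse_complement and the sample sequence are made up.

```perl
use strict;
use warnings;

# Hypothetical helper: reverse a DNA string, then swap each base
# for its complement (A with T, C with G), preserving case.
sub reverse_complement {
    my ($dna) = @_;
    my $revcomp = reverse $dna;            # scalar context reverses the string
    $revcomp =~ tr/ACGTacgt/TGCAtgca/;     # complement every base
    return $revcomp;
}

my $dna = 'ATGCCG';                        # made-up example sequence
print "Original:           $dna\n";
print "Reverse complement: ", reverse_complement($dna), "\n";
```

Characters other than A, C, G, and T are left unchanged by tr, so ambiguity codes such as N pass through as they are.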
20 Read and Practice
To develop computer programming skills and the ability to develop your own scripts, you must first learn how to program, and to do this I recommend reading
21 References
- Ezzell, C. (2000). Hooking up Biologists. Scientific American 283 (5), 22.
- Gibas, C., and Jambeck, P. (2001). Developing Bioinformatics Computer Skills. Sebastopol: O'Reilly.
- Roos, D. (2001). Bioinformatics: Trying to Swim in a Sea of Data. Science 291 (5507), 1260-1261.
- Stewart, B. (2001, December 7). An Interview with Lincoln Stein. Retrieved April 14, 2002 from the World Wide Web: http://www.oreillynet.com/pub/a/network/2001/12/07/stein.html
- Tisdall, J. (2001, October 15). Why Biologists Want to Program Computers. Retrieved April 12, 2002 from the World Wide Web: http://www.oreilly.com/news/perlbio_1001.html
- Tisdall, J. (2001). Beginning Perl for Bioinformatics. Sebastopol: O'Reilly.