1 of 12 - PowerPoint PPT Presentation

Slides: 13
Provided by: bertov
Tags: mysql

Transcript and Presenter's Notes


1
Advanced Access
Mini module
2
Access to Genome Annotation
  • Release web site
    http://www.ensembl.org/
  • Pre-Release
    http://pre.ensembl.org/
  • Archive
    http://archive.ensembl.org/
  • BioMart
    http://www.ensembl.org/Multi/martview/
    http://www.biomart.org/biomart/martview/
  • Downloads
    ftp://ftp.ensembl.org/
  • MySQL interface
    ensembldb.ensembl.org
  • Perl API
    http://www.ensembl.org/info/software/

3
Downloads
  • ftp://ftp.ensembl.org/pub/
  • http://www.ensembl.org/info/downloads/ftp_site.html
  • FASTA files: plain sequence
    • DNA (assembly masked and unmasked)
    • cDNA (Ensembl and ab initio predictions)
    • Peptides (Ensembl and ab initio predictions)
    • RNA (non-coding RNA predictions)
  • Flatfiles: annotated 1 Mb slices
    • EMBL format
    • GenBank format
  • MySQL: database table dumps
  • GTF: gene sets in GTF format
  • EMF: alignments of resequencing data in Ensembl
    Multi Format
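
As a rough illustration of working with the GTF gene-set downloads listed above, the sketch below splits one GTF record into its nine tab-separated columns and unpacks the attribute column. The function name parse_gtf_line and the sample record are hypothetical: the gene and transcript IDs come from later slides, but the chromosome, source, and coordinates are made up for the example.

```python
def parse_gtf_line(line):
    """Split one GTF record into its nine tab-separated columns."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 9:
        raise ValueError(f"expected 9 columns, got {len(fields)}")
    seqname, source, feature, start, end, score, strand, frame, attributes = fields

    # The ninth column holds 'key "value";' pairs separated by semicolons.
    attrs = {}
    for item in attributes.split(";"):
        item = item.strip()
        if not item:
            continue
        key, _, value = item.partition(" ")
        attrs[key] = value.strip().strip('"')

    return {
        "seqname": seqname,
        "source": source,
        "feature": feature,
        "start": int(start),
        "end": int(end),
        "score": score,
        "strand": strand,
        "frame": frame,
        "attributes": attrs,
    }


# Hypothetical record for illustration only; the coordinates are invented.
SAMPLE = (
    "6\tprotein_coding\texon\t1000\t1200\t.\t+\t.\t"
    'gene_id "ENSG00000010704"; transcript_id "ENST00000309234";'
)

record = parse_gtf_line(SAMPLE)
print(record["feature"], record["start"], record["end"])
print(record["attributes"]["gene_id"])
```

A real file would be read line by line, skipping any comment lines that start with "#" before calling the parser.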

4
Ensembl Databases
  • Species-specific databases
    • Core: genomic sequences and most of the annotation
    • Variation: genetic variation
    • Funcgen: regulatory elements
    • Otherfeatures: EST genes
    • Vega: Vega genes
  • Cross-species database
    • Compara: all comparative data

5
MySQL
  • SQL: Structured Query Language
  • Needed:
    • MySQL client program (http://www.mysql.com)
    • Ability to write MySQL queries
    • Knowledge of database schema

6
MySQL
7
MySQL
  • Retrieve Ensembl Transcript and Peptide IDs for
    ENSG00000010704:

    mysql -u anonymous -h ensembldb.ensembl.org

    Welcome to the MySQL monitor.  Commands end with ; or \g.
    Your MySQL connection id is 1699364 to server version: 4.1.20-standard-log

    Type 'help;' or '\h' for help. Type '\c' to clear the buffer.

    mysql> use homo_sapiens_core_41_36c
    Reading table information for completion of table and column names
    You can turn off this feature to get a quicker startup with -A

    Database changed
    mysql> SELECT gene_stable_id.stable_id AS gene,
                  transcript_stable_id.stable_id AS transcript,
                  translation_stable_id.stable_id AS peptide
           FROM gene, transcript, translation,
                gene_stable_id, transcript_stable_id, translation_stable_id
           WHERE gene.gene_id = transcript.gene_id
             AND transcript.transcript_id = translation.transcript_id
             AND gene_stable_id.gene_id = gene.gene_id
             AND transcript_stable_id.transcript_id = transcript.transcript_id
             AND translation_stable_id.translation_id = translation.translation_id
             AND gene_stable_id.stable_id = 'ENSG00000010704';

8
MySQL
Result:

    +-----------------+-----------------+-----------------+
    | gene            | transcript      | peptide         |
    +-----------------+-----------------+-----------------+
    | ENSG00000010704 | ENST00000309234 | ENSP00000311698 |
    | ENSG00000010704 | ENST00000349999 | ENSP00000259699 |
    | ENSG00000010704 | ENST00000317896 | ENSP00000313776 |
    | ENSG00000010704 | ENST00000353147 | ENSP00000312342 |
    | ENSG00000010704 | ENST00000352392 | ENSP00000315936 |
    | ENSG00000010704 | ENST00000336625 | ENSP00000337819 |
    | ENSG00000010704 | ENST00000345823 | ENSP00000344033 |
    | ENSG00000010704 | ENST00000357618 | ENSP00000350238 |
    | ENSG00000010704 | ENST00000317880 | ENSP00000313489 |
    +-----------------+-----------------+-----------------+
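
The six-table join above can be tried offline with a toy copy of the schema. The following is only a sketch under stated assumptions: it uses Python's built-in sqlite3 module and an in-memory database as a stand-in for the Ensembl MySQL server, it keeps only the columns the query touches, the internal numeric IDs are invented, and it loads just the first two transcript/peptide pairs from the result. The join is written with explicit JOIN ... ON clauses, which should be equivalent to the comma-separated FROM list with WHERE conditions used on the slide.

```python
import sqlite3

# In-memory stand-in for the core database; NOT the real Ensembl schema,
# only the columns that the query on the slide refers to.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.executescript("""
CREATE TABLE gene (gene_id INTEGER PRIMARY KEY);
CREATE TABLE transcript (transcript_id INTEGER PRIMARY KEY, gene_id INTEGER);
CREATE TABLE translation (translation_id INTEGER PRIMARY KEY, transcript_id INTEGER);
CREATE TABLE gene_stable_id (gene_id INTEGER, stable_id TEXT);
CREATE TABLE transcript_stable_id (transcript_id INTEGER, stable_id TEXT);
CREATE TABLE translation_stable_id (translation_id INTEGER, stable_id TEXT);
""")

# Internal numeric IDs below are made up; the stable IDs come from the slides.
cur.execute("INSERT INTO gene VALUES (1)")
cur.execute("INSERT INTO gene_stable_id VALUES (1, 'ENSG00000010704')")
pairs = [
    ("ENST00000309234", "ENSP00000311698"),
    ("ENST00000349999", "ENSP00000259699"),
]
for n, (enst, ensp) in enumerate(pairs, start=1):
    transcript_id, translation_id = n, 100 + n
    cur.execute("INSERT INTO transcript VALUES (?, 1)", (transcript_id,))
    cur.execute("INSERT INTO translation VALUES (?, ?)", (translation_id, transcript_id))
    cur.execute("INSERT INTO transcript_stable_id VALUES (?, ?)", (transcript_id, enst))
    cur.execute("INSERT INTO translation_stable_id VALUES (?, ?)", (translation_id, ensp))

QUERY = """
SELECT gsi.stable_id AS gene,
       tsi.stable_id AS transcript,
       psi.stable_id AS peptide
FROM gene g
JOIN transcript t             ON g.gene_id = t.gene_id
JOIN translation p            ON t.transcript_id = p.transcript_id
JOIN gene_stable_id gsi       ON gsi.gene_id = g.gene_id
JOIN transcript_stable_id tsi ON tsi.transcript_id = t.transcript_id
JOIN translation_stable_id psi ON psi.translation_id = p.translation_id
WHERE gsi.stable_id = ?
ORDER BY t.transcript_id
"""

rows = cur.execute(QUERY, ("ENSG00000010704",)).fetchall()
for row in rows:
    print("\t".join(row))
conn.close()
```

Passing the stable ID as a bound parameter rather than pasting it into the SQL string keeps the query reusable for other genes.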
9
Perl API
  • API: Application Programming Interface
  • Needed:
    • BioPerl modules
    • Ensembl modules
    • Ability to code in Perl
  • For more information (installation instructions,
    tutorials, documentation etc.):
    http://www.ensembl.org/info/using/api/index.html

10
Perl API

Retrieve Ensembl Transcript and Peptide IDs for
ENSG00000010704:

    #!/usr/local/ensembl/bin/perl

    use strict;
    use warnings;

    use Bio::EnsEMBL::Registry;

    my $reg = "Bio::EnsEMBL::Registry";

    $reg->load_registry_from_db(
        -host => 'ensembldb.ensembl.org',
        -user => 'anonymous');

    my $gene_adaptor = $reg->get_adaptor("human", "core", "Gene");

    my $gene = $gene_adaptor->fetch_by_stable_id('ENSG00000010704');

    my @transcripts = @{$gene->get_all_Transcripts()};

    print "Gene\t\tTranscript\tPeptide\n";

    foreach my $transcript (@transcripts) {
        print $gene->stable_id, "\t",
              $transcript->stable_id, "\t",
              $transcript->translation->stable_id, "\n";
    }
11
Perl API
Result:

    Gene             Transcript       Peptide
    ENSG00000010704  ENST00000309234  ENSP00000311698
    ENSG00000010704  ENST00000349999  ENSP00000259699
    ENSG00000010704  ENST00000317896  ENSP00000313776
    ENSG00000010704  ENST00000353147  ENSP00000312342
    ENSG00000010704  ENST00000352392  ENSP00000315936
    ENSG00000010704  ENST00000336625  ENSP00000337819
    ENSG00000010704  ENST00000345823  ENSP00000344033
    ENSG00000010704  ENST00000357618  ENSP00000350238
    ENSG00000010704  ENST00000317880  ENSP00000313489
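
Output in this shape is easy to post-process. As a minimal sketch, assuming the script's output has been captured as text, the hypothetical helper parse_id_table below turns it into a gene-to-transcript-to-peptide mapping. Only the first three rows of the result are included to keep the example short.

```python
from collections import defaultdict

# First three rows of the result, with the header line the script prints.
OUTPUT = (
    "Gene\t\tTranscript\tPeptide\n"
    "ENSG00000010704\tENST00000309234\tENSP00000311698\n"
    "ENSG00000010704\tENST00000349999\tENSP00000259699\n"
    "ENSG00000010704\tENST00000317896\tENSP00000313776\n"
)


def parse_id_table(text):
    """Map each gene ID to a {transcript ID: peptide ID} dictionary."""
    mapping = defaultdict(dict)
    lines = text.strip().splitlines()
    for line in lines[1:]:  # skip the header line
        # split() with no argument tolerates repeated tabs or spaces
        gene, transcript, peptide = line.split()
        mapping[gene][transcript] = peptide
    return dict(mapping)


table = parse_id_table(OUTPUT)
for transcript, peptide in table["ENSG00000010704"].items():
    print(transcript, "->", peptide)
```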
12
Questions & Answers