A systems biology approach to the identification and analysis of transcriptional regulatory networks

Provided by: COPMc
Learn more at: http://www.cs.utsa.edu

Transcript and Presenter's Notes

1
A systems biology approach to the identification
and analysis of transcriptional regulatory
networks in osteocytes
  • Angela K. Dean, Stephen E. Harris, Jianhua Ruan

2
Overview
  • Osteocytes: background and motivation
  • Review of Biological Central Dogma
  • Osteocyte gene set derivation
  • Osteocyte purification
  • Microarray experiments
  • Functional annotation analysis
  • Sequence Analysis of promoter regions
  • Construction of regulatory network
  • Partitioning to define cis-regulatory modules
  • Results

3
Background: Cellular functions
  • Certain types of cells perform specific
    biological functions
  • Key genes must be activated to perform correctly
  • Osteocytes play an essential role in regulating
    bone formation and remodeling
  • We want to identify these key genes and the
    activators of these genes

4
Why study osteocyte cells?
  • Identifying these key genes (and their
    activators) involved in the bone-formation
    process may lead to new targeted therapies
  • Potential applications: osteoporosis, bone loss
    during space travel or extended bed rest, etc.

5
Molecular Biology Central Dogma

6
  • We want to identify these associations between
    Transcription Factors and the genes that they
    regulate in order to build a transcriptional
    regulatory network

7
Osteocyte cells are hard to isolate
  • Embedded within the bone matrix, and lacking
    molecular and cell surface markers, they are
    seemingly inaccessible
  • How to characterize and isolate these cells?
  • Solution: create a transgenic mouse carrying an
    inserted gene that drives fluorescence in
    osteocytes

8
Isolating osteocytes
  • Osteocytes are known to highly express Dentin
    matrix protein 1 (DMP1)
  • A transgene in which the DMP1 promoter
    (activation) region drives GFP was created and
    inserted into the mouse genome
  • Cells that highly express DMP1 (osteocytes) will
    also drive GFP
  • We can now purify osteocytes from other cells
    using fluorescence-activated cell sorting

9
Identifying key osteocyte genes using microarray
  • Microarray experiments allow us to measure the
    activity of genes (expression profile)
  • We compared the expression profiles of the
    purified osteocyte cells (+GFP) to non-osteocyte
    cells (-GFP)
  • Identified the top 269 genes expressed > 3-fold
    higher in +GFP as compared to -GFP
    (FDR-corrected p-value < 0.05)
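
The selection rule on this slide can be sketched as follows. This is only an illustration: the slide does not say which FDR procedure was used, so Benjamini-Hochberg is an assumption here, as are the linear-scale expression values, the hypothetical names `GeneA` and `GeneB`, and all numbers in the toy table.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def select_enriched(genes, fold_cutoff=3.0, fdr_cutoff=0.05):
    """genes: list of (name, mean in GFP-positive, mean in GFP-negative, raw p)."""
    adjusted = benjamini_hochberg([g[3] for g in genes])
    return [
        name
        for (name, pos, neg, _), q in zip(genes, adjusted)
        if pos / neg > fold_cutoff and q < fdr_cutoff
    ]


# Toy data (made up): expression means and raw p-values.
toy = [
    ("Dmp1", 900.0, 100.0, 0.0001),
    ("Sost", 800.0, 200.0, 0.0020),
    ("GeneA", 150.0, 100.0, 0.0300),  # significant but below 3-fold
    ("GeneB", 500.0, 100.0, 0.4000),  # above 3-fold but not significant
]
selected = select_enriched(toy)
print(selected)
```

A gene has to pass both filters, so a large fold change with a weak p-value is dropped just like a significant but small change.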

10
Identifying functionally-related osteocyte genes
  • Each of the 269 genes has one or more GO terms or
    PIR-keywords associated with it
  • Gene Ontology (GO) terms describe biological
    processes, cellular components and molecular
    functions
  • Protein Information Resource (PIR) keyword is an
    annotation from the PIR database

11
Functional Annotation Clustering
  • For each GO term associated with a gene or group
    of genes within the 269 set, a p-value is
    computed using the hypergeometric distribution
    and adjusted for multiple testing using the
    Benjamini method
  • Enrichment score per cluster is the geometric
    mean of the individual GO p-values
  • DAVID Bioinformatics Tool was used for the
    clustering
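
As a rough sketch of the two calculations named above, the snippet below computes a hypergeometric upper-tail p-value for one annotation term and the geometric mean of a cluster's p-values. It is not DAVID's implementation; the function names and all counts are illustrative assumptions, and the multiple-testing adjustment is omitted.

```python
from math import comb, exp, log


def hypergeom_upper_tail(k, n, K, N):
    """P(X >= k): chance of seeing at least k annotated genes in a list of n,
    when K of the N background genes carry the annotation."""
    total = comb(N, n)
    upper = min(n, K)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / total


def geometric_mean(pvalues):
    """Geometric mean, computed in log space for numerical stability."""
    return exp(sum(log(p) for p in pvalues) / len(pvalues))


# Toy numbers (made up): 10 background genes, 4 annotated with a term,
# and a gene list of 3 that contains 2 of the annotated genes.
p_term = hypergeom_upper_tail(k=2, n=3, K=4, N=10)

# Toy cluster of two term p-values.
cluster_score = geometric_mean([0.01, 0.0001])
print(p_term, cluster_score)
```

A smaller geometric mean indicates a cluster whose member terms are, on average, more strongly over-represented in the gene list.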

12
Functional annotation clustering results
  • As expected, most enriched clusters relate to
    extracellular region, system development,
    etc.
  • Cluster 2 relates to bone, and interestingly,
    Cluster 5 relates to muscle
  • We narrowed our 269 gene set to these 98 genes
    corresponding to bone and muscle

13
Identifying TF Binding Sites in the 98 gene set
  • We searched the 5 kb promoter sequence upstream
    of the transcription start site (TSS) of each
    gene for known TF binding motifs from the
    TRANSFAC database, using the rVista tool
  • Filtered the TF motifs to keep only those
    conserved between mouse and human genomes
  • Conservation across species increases confidence
    that a motif is a functional binding site
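
The conservation filter can be illustrated with a simplified sketch. The tool named on the slide works with TRANSFAC matrices and genome alignments; the version below swaps in a plain IUPAC consensus search and assumes two pre-aligned, gap-free sequences of equal length. The sequences are made up, and `YTAWWWWTAR` is a commonly cited Mef2 consensus used here only for illustration.

```python
import re

# IUPAC nucleotide codes needed for simple consensus motifs.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "W": "[AT]", "S": "[CG]",
    "K": "[GT]", "M": "[AC]", "N": "[ACGT]",
}


def motif_hits(seq, consensus):
    """Start positions of all (possibly overlapping) consensus matches."""
    body = "".join(IUPAC[c] for c in consensus)
    pattern = re.compile("(?=(" + body + "))")  # lookahead keeps overlaps
    return [m.start() for m in pattern.finditer(seq.upper())]


def conserved_hits(mouse_aln, human_aln, consensus):
    """Keep mouse hits that also match at the same aligned position in human."""
    if len(mouse_aln) != len(human_aln):
        raise ValueError("aligned sequences must have equal length")
    human = set(motif_hits(human_aln, consensus))
    return [p for p in motif_hits(mouse_aln, consensus) if p in human]


MEF2 = "YTAWWWWTAR"
mouse = "GGCTAAAAATAGCCCTATTTTTAGGG"  # made-up promoter fragment
human = "GACTAAAAATAGCTTTGTTTTTAGGA"  # made-up aligned fragment

all_mouse = motif_hits(mouse, MEF2)
conserved = conserved_hits(mouse, human, MEF2)
print(all_mouse, conserved)
```

In this toy case the mouse sequence has two candidate sites, but only the first survives because the second is disrupted in the human sequence.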

14
Identifying TF Binding Sites in the 98 gene set
  • Many of the motifs identified are related to
    bone and muscle
  • 67 of the 98 genes contained over 10 conserved
    Mef2 binding sites in their promoters
  • Bone and muscle genes and their number of
    conserved Mef2 binding sites

15
Building the transcriptional regulatory network
  • Created a network consisting of the 98 gene set
    and their conserved and enriched TFs as nodes
  • An edge between a gene and a TF represents the
    statistically significant presence of that TF's
    binding site in the promoter of that gene
  • TFs filtered using conservation AND enrichment
    to produce more reliable edges and reduce noise
  • Enrichment of a TF motif is determined by a
    p-value based on the number of occurrences in
    the 5 kb upstream of this gene, as compared to
    the number of occurrences in the 5 kb upstream
    of the rest of the genes in the genome
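
One way to picture the enrichment test and the resulting edges is sketched below. The slide does not name the statistical test, so the Poisson upper tail, the 0.01 threshold, the hypothetical gene `GeneA`, and all counts are assumptions made only for illustration.

```python
from math import exp


def poisson_upper_tail(k, lam):
    """P(X >= k) for X ~ Poisson(lam)."""
    if k <= 0:
        return 1.0
    term = exp(-lam)  # P(X = 0)
    cdf = term
    for i in range(1, k):
        term *= lam / i  # P(X = i) from P(X = i - 1)
        cdf += term
    return max(0.0, 1.0 - cdf)


def build_edges(counts, background_mean, alpha=0.01):
    """counts: {(gene, tf): motif hits in the gene's 5 kb upstream region}.
    background_mean: {tf: mean hits per 5 kb upstream region of other genes}.
    Returns (gene, tf, p) for every pair that passes the threshold."""
    edges = []
    for (gene, tf), k in counts.items():
        p = poisson_upper_tail(k, background_mean[tf])
        if p < alpha:
            edges.append((gene, tf, p))
    return edges


# Toy counts (made up).
counts = {("Dmp1", "Mef2"): 12, ("Sost", "Mef2"): 11, ("GeneA", "Mef2"): 3}
background_mean = {"Mef2": 3.0}

edges = build_edges(counts, background_mean)
print([(gene, tf) for gene, tf, _ in edges])
```

Only gene-TF pairs whose motif count is unusually high relative to the background become edges, which is what keeps the network from being dominated by common, low-information motifs.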

16
Modular structure of the regulatory network
  • Final network consisted of 98 genes and 153
    conserved and over-represented TFs
  • To identify possible combinatorial effects of
    TFBS, we partitioned the genes in the network
    using the Q-Cut algorithm
  • Q-Cut is a graph partitioning algorithm for
    finding dense subnetworks (i.e., communities).
    It optimizes a statistical score called
    modularity and automatically determines the most
    appropriate number of communities
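
To make the modularity score concrete, the sketch below evaluates it for a given partition of a small graph. It does not implement Q-Cut itself, only the score being optimized, and it assumes the standard Newman-Girvan definition for an undirected, unweighted graph. The example graph is made up.

```python
def modularity(edges, community):
    """Modularity Q = sum over communities of (e_c / m) - (d_c / 2m)^2,
    where e_c is the number of edges inside community c, d_c is the total
    degree of its nodes, and m is the number of edges in the graph."""
    m = len(edges)
    internal = {}
    degree_sum = {}
    for u, v in edges:
        cu, cv = community[u], community[v]
        degree_sum[cu] = degree_sum.get(cu, 0) + 1
        degree_sum[cv] = degree_sum.get(cv, 0) + 1
        if cu == cv:
            internal[cu] = internal.get(cu, 0) + 1
    return sum(
        internal.get(c, 0) / m - (d / (2 * m)) ** 2
        for c, d in degree_sum.items()
    )


# Two triangles joined by a single bridge edge.
edges = [("a", "b"), ("b", "c"), ("a", "c"),
         ("d", "e"), ("e", "f"), ("d", "f"),
         ("c", "d")]

split = {"a": 1, "b": 1, "c": 1, "d": 2, "e": 2, "f": 2}
merged = {node: 1 for node in split}

q_split = modularity(edges, split)
q_merged = modularity(edges, merged)
print(q_split, q_merged)
```

Splitting the graph at the bridge scores higher than putting every node in one community, which is the kind of comparison a modularity-optimizing method uses to choose both the partition and the number of communities.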

17
  • We reduced noise and created a sparser
    gene-gene network for better partitioning
  • We created this temporary network by assigning a
    cosine similarity score to each pair of genes
    according to their shared TFs.
  • Cosine similarity is a measure of similarity
    between two vectors (each vector contains 153
    slots for the 153 enriched TFs in the 98 gene
    set)
  • Edges between genes represent their similarity
    score, and this net was converted to a sparse net
    by connecting each gene to its k nearest
    neighbors (k = 7) and employing a similarity
    score cutoff of 0.5
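
The construction of the sparse gene-gene network can be sketched as below. The slide does not spell out how the k-nearest-neighbor rule and the cutoff combine, so this version assumes an edge is kept when a gene is among the other's top k neighbors and the score is at least the cutoff. Binary TF vectors, the 4-slot toy profiles (instead of 153 slots), and the gene names are also assumptions.

```python
from math import sqrt


def cosine(u, v):
    """Cosine similarity between two equal-length numeric vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def knn_graph(profiles, k=7, cutoff=0.5):
    """profiles: {gene: TF vector}. Returns a set of undirected edges,
    each stored as a sorted (gene, gene) tuple."""
    genes = sorted(profiles)
    edges = set()
    for g in genes:
        scored = sorted(
            ((cosine(profiles[g], profiles[h]), h) for h in genes if h != g),
            reverse=True,
        )
        for score, h in scored[:k]:  # top-k most similar genes
            if score >= cutoff:      # drop weak similarities
                edges.add(tuple(sorted((g, h))))
    return edges


# Toy profiles (made up): 1 means the TF has a site in the gene's promoter.
profiles = {
    "g1": [1, 1, 0, 0],
    "g2": [1, 1, 1, 0],
    "g3": [0, 0, 1, 1],
    "g4": [0, 0, 0, 1],
}
edges = knn_graph(profiles, k=2, cutoff=0.5)
print(sorted(edges))
```

The cutoff removes the weak g2-g3 link even though g3 is among the two nearest neighbors of g2, leaving two clean pairs for the partitioning step.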

18
Identifying modules in the initial regulatory
network
  • Q-Cut was then applied to this gene-gene network,
    resulting in communities with many common TF
    binding sites

19
Interesting clusters
  • Cluster below shows a strong community structure
    between 16 genes and their common TFBS
  • Representative of many TFs coordinately
    regulating a small set of genes

20
A putative model of a transcriptional network
  • A proposed model was built using the network
    results
  • DMP1 and Sost (highly expressed in osteocytes)
    are shown to be regulated by Mef2 and Myogenin

21
Putative model used to generate hypotheses
  • We now have an ex vivo system for pure osteocytes
    in a proper microenvironment to conduct
    experimental validation based on this model
  • Here the osteocytes will make appropriate levels
    of osteocyte-specific genes
  • Experiments are currently underway

22
Conclusions
  • We used a systems biology method to construct a
    putative transcriptional regulatory network model
    for osteocytes by integrating:
  • Microarray data
  • Functional annotation
  • Comparative genomics
  • Graph-theoretic knowledge
  • Many parts of the network can be confirmed by the
    literature
  • Experiments are currently underway to further
    validate the model