2.3. Method development for enhanced biological interpretation of gene expression data.
Responsible: Jörg Rahnenführer, MPI, Saarbrücken.
Background:
Increasing evidence suggests that only local structures of
biological networks can be
recovered from gene expression data. Taking known structural,
regulatory or enzymatic roles of the corresponding proteins into
consideration can improve the functional interpretation of the results
significantly. Our methods can directly be extended and applied to
other genomic high-throughput data, e.g. microarray-based CGH, a
technique that will become increasingly important during NGFN-2, with
applications in cancer and medical genetics. Bayesian networks are
graphical representations of the conditional independence structure
among a set of variables, which are fitted to measured input-output
behavior of a biological system, e.g. a part of a biochemical network
model. Such measurements will be produced within NGFN-2 by knock-down
experiments applying RNAi technology, combined application of genomic
(cytogenetic, CGH) and transcriptomic or proteomic investigations,
epigenetic profiling and other methods. We aim to reconstruct small
local networks by using Bayesian network inference based on such
molecular data. Assessment of model fit and re-modeling will give
further insights into cross-talk between pathways as well as subsystem
properties.
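
As an illustration of what "conditional independence structure" means here, the following minimal sketch encodes a hypothetical three-gene chain A -> B -> C with binary expression states. The network, the variable names and all probabilities are invented for illustration and are not taken from any NGFN-2 data. The sketch only shows that, in such a chain, A carries no further information about C once B is known.

```python
from itertools import product

# Hypothetical chain A -> B -> C with binary states (0 = low, 1 = high).
# All probabilities below are invented for illustration only.
P_A = {0: 0.7, 1: 0.3}
P_B_GIVEN_A = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.2, 1: 0.8}}  # P_B_GIVEN_A[a][b]
P_C_GIVEN_B = {0: {0: 0.8, 1: 0.2}, 1: {0: 0.3, 1: 0.7}}  # P_C_GIVEN_B[b][c]


def joint(a, b, c):
    """Joint probability, factorized along the network structure."""
    return P_A[a] * P_B_GIVEN_A[a][b] * P_C_GIVEN_B[b][c]


def prob(**fixed):
    """Probability of a partial assignment, obtained by summing the joint."""
    total = 0.0
    for a, b, c in product((0, 1), repeat=3):
        state = {"a": a, "b": b, "c": c}
        if all(state[name] == value for name, value in fixed.items()):
            total += joint(a, b, c)
    return total


def cond_c_given(c, **given):
    """P(C = c | given), where `given` fixes a and/or b."""
    return prob(c=c, **given) / prob(**given)


if __name__ == "__main__":
    # Once B is fixed, changing A does not change the distribution of C.
    for a, b in product((0, 1), repeat=2):
        print(f"P(C=1 | A={a}, B={b}) = {cond_c_given(1, a=a, b=b):.3f}")
    # Without conditioning on B, A and C are dependent.
    for a in (0, 1):
        print(f"P(C=1 | A={a}) = {cond_c_given(1, a=a):.3f}")
```

Fitting a real network would additionally require estimating the structure and the conditional probabilities from measured data, which this sketch does not attempt.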
Planned work:
Our primary objectives are the development and validation of
methods that generate functional
profiles with high biological or medical relevance. These profiles
refer to gene sets with known biological meaning rather than to single
genes. This provides fully interpretable snapshots of gene expression
data obtained under specific conditions. A major application will be
the characterization of disease types on a functional level. More
specific goals are:
- Identification of useful measures for co-regulation of genes in
  order to score whole sets of genes in microarray experiments and
  calculation of significance scores for such gene sets.
- Development of validation procedures that are based on biological
  rather than purely statistical criteria.
- Extension of the scoring approach to discriminate between different
  phenotypes; development of scoring functions with discriminatory
  power.
- Application of the methods to other types of biological information,
  such as GO and MIPS annotations. The choice of meaningful gene sets
  will be made in cooperation with SP 2.4 Gene Set Analysis.
- Extension of the existing methods towards other multivariate
  high-throughput data given in matrix form, like CGH data, adapted
  to up-to-date demands within NGFN-2.
- Modelling of local pathways, such as parts of the EGF/MAP kinase
  pathway or tyrosine kinase dependent cell proliferation signals, in
  order to infer small parts of biochemical networks with Bayesian
  network inference.
- Case studies on a variety of real gene expression data from NGFN-2
  members and evaluation of the biological findings.
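
The first goal in the list above, scoring a whole gene set and attaching a significance score to it, can be sketched as follows. This is only one possible choice and not the method proposed here: the co-regulation measure is assumed to be the mean pairwise Pearson correlation of expression profiles, and significance is assumed to come from comparison with randomly drawn gene sets of the same size. The function names and the `expr` layout (a dictionary mapping gene identifiers to lists of expression values across samples) are hypothetical.

```python
import random
from itertools import combinations
from statistics import mean


def pearson(x, y):
    """Pearson correlation of two equally long profiles (0.0 if one is constant)."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / (sxx * syy) ** 0.5


def coregulation_score(expr, genes):
    """Assumed measure: mean pairwise correlation within the gene set (>= 2 genes)."""
    return mean(pearson(expr[g], expr[h]) for g, h in combinations(genes, 2))


def permutation_p_value(expr, genes, n_perm=1000, seed=0):
    """Compare the observed score with scores of random gene sets of equal size."""
    rng = random.Random(seed)
    observed = coregulation_score(expr, genes)
    all_genes = sorted(expr)
    hits = sum(
        coregulation_score(expr, rng.sample(all_genes, len(genes))) >= observed
        for _ in range(n_perm)
    )
    # Add-one correction so the estimated p-value is never exactly zero.
    return observed, (hits + 1) / (n_perm + 1)
```

A call such as `permutation_p_value(expr, ["geneA", "geneB", "geneC"])` returns the observed score and an empirical p-value. Drawing random gene sets ignores correlations between genes in the background, so a biologically motivated validation, as listed in the second goal, would still be needed.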