3.1. Development and distribution of data analysis software.
Responsible: Dr. Tim Beissbarth, DKFZ, Heidelberg.
Background:
We began developing and distributing data analysis software within
NGFN-1, with the goals of providing practical exercises for our
Microarray Data Analysis courses and making the results of our
theoretical work available to other researchers. Within the collaborative
Bioconductor (www.bioconductor.org) framework we have produced and are
maintaining numerous software packages. More specifically, we have
developed a workflow-oriented data-mining platform that integrates
methods from statistics and machine learning into one common framework.
The goal of this sub-project is therefore to use this framework to
enable the experimental labs in the NGFN to analyse their profiling
data themselves.
Planned work:
The primary goal of our work is to provide technologies that permit
high-quality, standardised analysis of NGFN-2 data. Analysis questions
are often complex and specific to particular biological or clinical
applications, and the related methods are equally complex and
numerous. We therefore aim to provide standardised, reusable
workflows integrated into Mine-IT for distribution to NGFN nodes. To
address the need for statistical analysis of graph-like data sets (e.g.
protein interaction data from protein arrays or mass spectrometry) and
network-like metadata (e.g. regulatory or signalling pathways), we
will further develop software modules for the statistical and
graph-theoretical analysis and visualisation of graphs. In particular, we
will enhance the Bioconductor packages graph and Rgraphviz.