Grid-based expression QTL Analysis

 Ann-Kristin Grimm, Steffen Möller

          University of Lübeck
Institute for Neuro- and Bioinformatics




                                          BOSC09
                                          Stockholm
Statistical genetics for complex diseases
●   Isolation of homozygous strains
    ●   susceptible or
    ●   resistant
    to disease
●   Identification of chromosomal markers that
    differ between strains
●   Generating offspring with mixed genotypes and
    scoring the disease phenotype of every
    individual
●   Determine statistical association of markers
    with score
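
As a rough illustration of the last step, the sketch below tests one marker against a disease score by comparing the two genotype groups of a back-cross with Welch's t statistic. The genotype labels, the scores, and the function name are invented for illustration; a real analysis would use a dedicated QTL package and proper significance thresholds.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(genotypes, scores, group_a="A/A", group_b="A/B"):
    """Welch's t statistic for the score difference between two genotype groups."""
    a = [s for g, s in zip(genotypes, scores) if g == group_a]
    b = [s for g, s in zip(genotypes, scores) if g == group_b]
    if len(a) < 2 or len(b) < 2:
        raise ValueError("need at least two individuals per genotype group")
    # Standard error of the difference in means, without assuming equal variances.
    standard_error = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / standard_error


# Hypothetical data: genotype at one marker and disease score per individual.
genotypes = ["A/A", "A/A", "A/A", "A/B", "A/B", "A/B"]
scores = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
t_value = welch_t(genotypes, scores)
```

A large absolute value of the statistic at a marker suggests that the score differs between the genotype groups there.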
Back-Cross: F1 x P




Analysis
●   Markers sufficiently dense to spot all X-overs
●   Interval-mapping: neighbouring markers
    ●   Both homozygous: no X-over
    ●   Both heterozygous: no X-over
    ●   One hetero-, one homozygous: X-over in between
●   Continuous refinements
    ●   Steady (directed) increase of markers
    ●   Additional phenotypes being investigated
    ●   Additional mice being bred
        –   Stronger statistics
        –   More cross-overs
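
The neighbouring-marker rule above can be sketched in a few lines. The genotype encoding ("A/A" for homozygous, "A/B" for heterozygous) and the function name are assumptions for illustration, and the sketch relies on the markers being dense enough that no double cross-over hides between two neighbours.

```python
def crossover_intervals(genotypes):
    """Return index pairs (i, i + 1) of neighbouring markers that flank a cross-over.

    In a back-cross every marker is either homozygous ("A/A") or
    heterozygous ("A/B"); a change between neighbours implies a cross-over
    somewhere in between.
    """
    return [
        (i, i + 1)
        for i in range(len(genotypes) - 1)
        if genotypes[i] != genotypes[i + 1]
    ]


# Hypothetical genotypes of one individual along one chromosome.
individual = ["A/A", "A/A", "A/B", "A/B", "A/A"]
intervals = crossover_intervals(individual)  # [(1, 2), (3, 4)]
```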
Peak: inferring ML of disease location
●   A stretch between two markers
    may be found to be influencing the
    score
●   The position of the controlling
    locus may be estimated from
    ●   the fraction of individuals with X-over
    ●   that show the same effect
●   The exact locus remains undefined
●   … but what if we have more molecular scores …?

[Figure: Effect (weak to strong) by Marker-Combination (A/A A/B A/A over A/A A/B A/B)]
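
One way to read the estimate described above is linear interpolation, under strong simplifying assumptions that are not stated on the slide: exactly one cross-over per recombinant individual, placed uniformly within the interval, and an effect that follows the locus genotype without noise. If a fraction f of the recombinant individuals show the effect expected from their genotype at the left marker, the locus lies about (1 - f) of the interval length away from that marker. The function name and the example numbers are hypothetical.

```python
def estimate_locus_position(left_pos, right_pos, follows_left, n_recombinants):
    """Interpolate the locus position between two flanking markers.

    follows_left: number of recombinant individuals whose effect matches
    the genotype at the left marker.
    """
    if n_recombinants <= 0:
        raise ValueError("need at least one recombinant individual")
    if not 0 <= follows_left <= n_recombinants:
        raise ValueError("follows_left must lie between 0 and n_recombinants")
    fraction_left = follows_left / n_recombinants
    # The more recombinants follow the left marker, the closer the locus is to it.
    return left_pos + (1.0 - fraction_left) * (right_pos - left_pos)


# Hypothetical interval from 10 cM to 20 cM with 8 recombinants,
# 6 of which behave like their left-marker genotype.
position = estimate_locus_position(10.0, 20.0, 6, 8)  # 12.5
```

With few recombinants the estimate is coarse, which is why additional cross-overs from additional animals sharpen it.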
Expression QTL
●   Use gene expression levels as scores
●   Disease phenotypes + sex + mitochondrial
    inheritance are covariates
●   Every locus is described by
    ●   genes that it controls
    ●   pathways/GO terms/TFBS/miRNA these share
    ●   Super-linear (epistatic) effects with other loci
●   Conversely genes → genetic loci
●   Direct effects from locus → gene
●   Singular effects are of interest, even should the
    QTG never be determined
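
The two views named above, locus → genes and genes → loci, are two indexes over the same list of associations. A minimal sketch, assuming the associations are available as (locus, gene) pairs; the identifiers are made up:

```python
from collections import defaultdict


def index_associations(associations):
    """Build locus -> genes and gene -> loci lookups from (locus, gene) pairs."""
    genes_by_locus = defaultdict(set)
    loci_by_gene = defaultdict(set)
    for locus, gene in associations:
        genes_by_locus[locus].add(gene)
        loci_by_gene[gene].add(locus)
    return dict(genes_by_locus), dict(loci_by_gene)


# Hypothetical significant locus/gene associations.
pairs = [("marker_12", "gene_a"), ("marker_12", "gene_b"), ("marker_40", "gene_a")]
genes_by_locus, loci_by_gene = index_associations(pairs)
```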
Huge amount of data
●   Compute time so long you don't want to do this
    twice.

    30000 genes
    x 200 markers (^2 for interactions)
    x 150 individuals
    x 20 phenotypes (^2)

●   But for better insights biologists do – with
    updated data.
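
To get a feel for these numbers, the sketch below multiplies them out. How the "^2" terms should be counted is an interpretation: here every ordered pair of markers and every ordered pair of phenotypes counts as one model fit per gene, which gives an upper bound.

```python
GENES = 30_000
MARKERS = 200
INDIVIDUALS = 150
PHENOTYPES = 20

# One model fit per gene, marker and phenotype (no interactions).
single_fits = GENES * MARKERS * PHENOTYPES

# Assumed upper bound with interactions: markers and phenotypes both squared.
pairwise_fits = GENES * MARKERS**2 * PHENOTYPES**2

# Every fit looks at one value per individual.
pairwise_values = pairwise_fits * INDIVIDUALS

print(f"{single_fits:,} single-marker fits")
print(f"{pairwise_fits:,} fits with pairwise interactions")
print(f"{pairwise_values:,} per-individual values touched")
```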

Increased communication through
         distributed computation
●   Dynamic website to
    ●   Ship raw data to
        compute nodes
    ●   Present (interim) results of
        computations in
        human-readable form
●   Grid jobs retrieve
    series of work units
●   Continuous input to
    biological researchers
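
The idea of grid jobs fetching a series of work units can be sketched as follows. Here an in-process queue stands in for the website that hands out work, and each work unit is a block of gene indices. The names, the unit size, and the placeholder computation are assumptions for illustration, not the project's actual protocol.

```python
from queue import Empty, Queue


def make_work_units(n_genes, unit_size):
    """Split gene indices 0..n_genes-1 into consecutive (start, stop) blocks."""
    return [
        (start, min(start + unit_size, n_genes))
        for start in range(0, n_genes, unit_size)
    ]


def run_job(work_queue, compute):
    """Fetch work units until none are left and collect the interim results."""
    results = []
    while True:
        try:
            unit = work_queue.get_nowait()
        except Empty:
            return results
        results.append((unit, compute(unit)))


work_queue = Queue()
for unit in make_work_units(n_genes=10, unit_size=4):
    work_queue.put(unit)

# Placeholder computation: just count the genes in the unit.
interim = run_job(work_queue, compute=lambda unit: unit[1] - unit[0])
```

Because each unit is independent, interim results can be reported as soon as a unit finishes, and only the affected units need to be rerun when the data are updated.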
Please join in: http://eqtl.berlios.de




Acknowledgments
Programming: Ann-Kristin Grimm, Jan Kolbaum, Hajo Krabbenhöfft, Patrick Wernhoff
Data: Maja Jagodic, Mèlanie Thessèn-Hedreul, Dirk Koczan, Saleh Ibrahim
Computations: Olli Tourhunen and the NDGF
