A Statistical Framework for eQTL Mapping Using RNA-seq Data
2012
Sun, Wei
RNA-seq may replace gene expression microarrays in the near future. Using RNA-seq, the expression of a gene can be estimated using the total number of sequence reads mapped to that gene, known as the total read count (TReC). Traditional expression quantitative trait locus (eQTL) mapping methods, such as linear regression, can be applied to TReC measurements after they are properly normalized. In this article, we show that eQTL mapping, by directly modeling TReC using discrete distributions, has higher statistical power than the two-step approach: data normalization followed by linear regression. In addition, RNA-seq provides information on allele-specific expression (ASE) that is not available from microarrays. By combining the information from TReC and ASE, we can computationally distinguish cis- and trans-eQTL and further improve the power of cis-eQTL mapping. Both simulation and real data studies confirm the improved power of our new methods. We also discuss the design issues of RNA-seq experiments. Specifically, we show that by combining TReC and ASE measurements, it is possible to minimize cost and retain the statistical power of cis-eQTL mapping by reducing sample size while increasing the number of sequence reads per sample. In addition to RNA-seq data, our method can also be employed to study the genetic basis of other types of sequencing data, such as chromatin immunoprecipitation followed by DNA sequencing data. In this article, we focus on eQTL mapping of a single gene using the association-based method. However, our method establishes a statistical framework for future developments of eQTL mapping methods using RNA-seq data (e.g., linkage-based eQTL mapping), and the joint study of multiple genetic markers and/or multiple genes.
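The idea of modeling TReC directly with a discrete distribution, rather than normalizing and then fitting a linear regression, can be illustrated with a small sketch. This is not the article's model: it assumes a plain Poisson log-linear model with a read-depth offset and a likelihood ratio test for the genotype effect, chosen only because it fits in a few lines of standard-library Python. The function names (`fit_poisson_trec`, `trec_lrt`) and the toy data are hypothetical.

```python
import math


def fit_poisson_trec(counts, genotypes, depths, n_iter=50, tol=1e-10):
    """Fit log(mu_i) = log(depth_i) + b0 + b1 * genotype_i by Newton-Raphson.

    Assumes a Poisson model (a simplification), a positive total count,
    and at least two distinct genotype values.
    """
    # Start from the null fit: one common rate, no genotype effect.
    b0 = math.log(sum(counts) / sum(depths))
    b1 = 0.0
    for _ in range(n_iter):
        mu = [d * math.exp(b0 + b1 * g) for g, d in zip(genotypes, depths)]
        # Score vector (gradient of the log-likelihood).
        u0 = sum(y - m for y, m in zip(counts, mu))
        u1 = sum(g * (y - m) for y, m, g in zip(counts, mu, genotypes))
        # Fisher information matrix (2 x 2, symmetric).
        i00 = sum(mu)
        i01 = sum(g * m for g, m in zip(genotypes, mu))
        i11 = sum(g * g * m for g, m in zip(genotypes, mu))
        det = i00 * i11 - i01 * i01
        # Newton step: solve the 2 x 2 system information * step = score.
        step0 = (i11 * u0 - i01 * u1) / det
        step1 = (i00 * u1 - i01 * u0) / det
        b0 += step0
        b1 += step1
        if abs(step0) + abs(step1) < tol:
            break
    return b0, b1


def poisson_loglik(counts, mu):
    """Poisson log-likelihood, dropping the log(y!) term shared by both models."""
    return sum(y * math.log(m) - m for y, m in zip(counts, mu))


def trec_lrt(counts, genotypes, depths):
    """Return (b1, LRT statistic, p-value) for the genotype effect on TReC."""
    b0, b1 = fit_poisson_trec(counts, genotypes, depths)
    mu_alt = [d * math.exp(b0 + b1 * g) for g, d in zip(genotypes, depths)]
    # The null model (offset plus intercept) has a closed-form fit.
    rate = sum(counts) / sum(depths)
    mu_null = [d * rate for d in depths]
    lrt = 2.0 * (poisson_loglik(counts, mu_alt) - poisson_loglik(counts, mu_null))
    lrt = max(lrt, 0.0)  # guard against tiny negative rounding error
    # Upper tail of a chi-square with 1 df: P(Z^2 > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(lrt / 2.0))
    return b1, lrt, p_value


# Hypothetical toy data: TReC for one gene in six samples, genotype coded as
# the number of copies of one allele, depth in millions of mapped reads.
toy_counts = [52, 47, 81, 95, 140, 133]
toy_genotypes = [0, 0, 1, 1, 2, 2]
toy_depths = [10.0, 9.5, 10.2, 11.0, 9.8, 10.1]
b1_hat, lrt_stat, p_val = trec_lrt(toy_counts, toy_genotypes, toy_depths)
print(f"b1 = {b1_hat:.3f}, LRT = {lrt_stat:.2f}, p = {p_val:.3g}")
```

Read counts from real RNA-seq experiments are typically more variable than a Poisson model allows, so a practical analysis would swap in an overdispersed count distribution; the Poisson choice here only keeps the sketch short.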
Bibliographic information
This bibliographic record was provided by the National Agricultural Library