From gene expression modeling to gene network to investigate Arabidopsis thaliana genes involved in stress response
2015
Zaag, Rim | Tamby, Jean-Philippe | Guichard, Cécile | Tariq, Zakia | Aubourg, Sébastien | Brunaud, Véronique | Delannoy, Etienne | Martin-Magniette, Marie-Laure | Unité de recherche en génomique végétale (URGV) ; Institut National de la Recherche Agronomique (INRA)-Université d'Évry-Val-d'Essonne (UEVE)-Centre National de la Recherche Scientifique (CNRS) | Institut des Sciences des Plantes de Paris-Saclay (IPS2 (UMR_9213 / UMR_1403)) ; Institut National de la Recherche Agronomique (INRA)-Université Paris-Sud - Paris 11 (UP11)-Université Paris Diderot - Paris 7 (UPD7)-Université d'Évry-Val-d'Essonne (UEVE)-Centre National de la Recherche Scientifique (CNRS) | Institut de Recherche en Horticulture et Semences (IRHS) ; Université d'Angers (UA)-Institut National de la Recherche Agronomique (INRA)-AGROCAMPUS OUEST | Biologie et Amélioration des Plantes (BAP) ; Institut National de Recherche pour l’Agriculture, l’Alimentation et l’Environnement (INRAE) | Mathématiques et Informatique Appliquées (MIA-Paris) ; Institut National de la Recherche Agronomique (INRA)-AgroParisTech
International audience
The gap between the structural annotation of a genome and its functional annotation remains wide. Recent studies have estimated that 20% to 40% of the predicted genes have no assigned function in eukaryotic organisms whose genome is completely sequenced. Transcriptome data make it possible to investigate gene behavior, and co-expression studies have rapidly come to be seen as a way to identify sets of candidate gene modules. Generally, co-expression is established by analyzing correlations between all gene pairs in multiple microarray experiments collected from public repositories. Such approaches may suffer from both the heterogeneity of the data and the choice of the clustering method, which is usually based on gene pairs. To tackle these limitations, we propose an analysis based on a large and homogeneous set of transcriptome data extracted from CATdb: 387 stress conditions organized into 9 biotic and 9 abiotic stress categories. Instead of correlation analysis, model-based clustering was applied to identify clusters of co-expressed genes per stress category. Various resources were then analyzed and integrated to characterize the functions associated with genes in these clusters. Protein-protein interactions and transcription factor-target interactions were exploited to display gene networks. All the results are stored and managed in GEM2Net, a new module of CATdb (Zaag et al., 2015). We are currently demonstrating that this resource provides a valuable starting point to study stress responses and to propose a high-throughput functional annotation of the Arabidopsis thaliana genome.
This bibliographic record was provided by AgroParisTech