REPET evolutions: faster and easier
2022
WAN, Mariène | Confais, Johann | Quesneville, Hadi | Unité de Recherche Génomique Info (URGI) ; Université Paris-Saclay-Institut National de Recherche pour l’Agriculture, l’Alimentation et l’Environnement (INRAE) | IR BioInfOmics ; Institut National de Recherche pour l’Agriculture, l’Alimentation et l’Environnement (INRAE)
International audience
English. Transposable elements (TEs) are major players in the structure and evolution of eukaryotic genomes. Thanks to their ability to move around and to replicate within genomes, they are probably the most important contributors to genome plasticity. Their detection and annotation are considered essential and must be undertaken in any genome sequencing project.
The REPET package [1, 2] integrates bioinformatics pipelines dedicated to detecting, annotating and analyzing TEs in genomic sequences. The two main pipelines are (i) TEdenovo, which searches for interspersed repeats, builds consensus sequences and classifies them [3] according to TE features, and (ii) TEannot, which mines a genome with a library of TE sequences, for instance the one produced by the TEdenovo pipeline, to provide TE annotations.
The REPET package is continuously being improved for speed by parallelizing several key bottleneck steps. In addition, several strategies that reduce the time required to analyze large genomes have been tested. With the speed improvements and adapted strategies, REPET is now able to annotate and analyze genomes such as maize, a 2.3 Gb genome with more than 85% TEs [4], on a current computer cluster.
With this tool, the PlantBioinfoPF platform provides a TE annotation service. We are now able to offer automatic TE annotation of good quality through a process called "Repet-Factory". This process uses the REPET software suite with parameters optimized for TE detection specificity and computing time. It can annotate several genomes successively in batches, with the required traceability and reproducibility of the analyses.
Moreover, a Virtual Research Environment (VRE) for TE annotation and its analysis has been developed on Virtual Machines (VMs). An Ansible script instantiates VMs with all packages and tools required for a complete genome annotation with the REPET package.
This script allows the VRE to be easily re-instantiated in other infrastructures, which greatly simplifies the installation of the REPET package with all its required dependencies. We also simplified the distribution of REPET, to increase its availability and portability for users, by developing a Docker image of REPET (https://hub.docker.com/r/urgi/docker_vre_aio).
The REPET tool is a cornerstone of the platform. In addition to its use in the genome TE annotation service and its availability for download, it is also the basis of the RepetDB [5] database (https://urgi.versailles.inrae.fr/repetdb), hosted by the platform, which provides libraries of reference TE sequences for more than 50 species.
References
1. Flutre T, Duprat E, Feuillet C, Quesneville H (2011) Considering Transposable Element Diversification in De Novo Annotation Approaches. PLoS ONE 6(1): e16526. https://doi.org/10.1371/journal.pone.0016526
2. Quesneville H, Bergman CM, Andrieu O, Autard D, Nouaud D, Ashburner M, et al. (2005) Combined Evidence Annotation of Transposable Elements in Genome Sequences. PLoS Comput Biol 1(2): e22. https://doi.org/10.1371/journal.pcbi.0010022
3. Hoede C, Arnoux S, Moisset M, Chaumier T, Inizan O, Jamilloux V, et al. (2014) PASTEC: An Automatic Transposable Element Classification Tool. PLoS ONE 9(5): e91929. https://doi.org/10.1371/journal.pone.0091929
4. Jamilloux V, Daron J, Choulet F, Quesneville H (2017) De Novo Annotation of Transposable Elements: Tackling the Fat Genome Issue. Proceedings of the IEEE 105(3): 474-481. https://doi.org/10.1109/JPROC.2016.2590833
5. Amselem J, Cornut G, Choisne N, Alaux M, Alfama-Depauw F, Jamilloux V, Maumus F, Letellier T, Luyten I, Pommier C, Adam-Blondon AF, Quesneville H (2019) RepetDB: a unified resource for transposable element references. Mobile DNA 10: 6. https://doi.org/10.1186/s13100-019-0150-y
Bibliographic information
This bibliographic record has been provided by Institut national de la recherche agronomique