Article|19 Feb 2022|OPEN
TEsorter: An accurate and fast method to classify LTR-retrotransposons in plant genomes
Ren-Gang Zhang1,2,*,†, Guang-Yuan Li2,†, Xiao-Ling Wang3, Jacques Dainat4, Zhao-Xuan Wang5, Shujun Ou6,* and Yongpeng Ma1,*
1Yunnan Key Laboratory for Integrative Conservation of Plant Species with Extremely Small Populations, Kunming Institute of Botany, Chinese Academy of Sciences, Kunming 650201, China
2Department of Bioinformatics, Ori (Shandong) Gene Science and Technology Co., Ltd., Weifang, Shandong 261322, China
3BGI-Shenzhen, Shenzhen 518083, China
4Department of Medical Biochemistry and Microbiology, National Bioinformatics Infrastructure Sweden, Science for Life Laboratory, Uppsala University, Uppsala, Sweden
5Shijiazhuang People’s Medical College, Shijiazhuang, Hebei 050091, China
6Department of Ecology, Evolution, and Organismal Biology (EEOB), Iowa State University, Ames, IA 50010, USA
*Corresponding authors. E-mail: zhangrengang@ori-gene.cn, oushujun@iastate.edu, mayongpeng@mail.kib.ac.cn
†Both authors contributed equally to the study.

Horticulture Research 9,
Article number: uhac017 (2022)
doi: https://doi.org/10.1093/hr/uhac017

Received: 07 Oct 2021
Accepted: 23 Dec 2021
Published online: 19 Feb 2022

Abstract

Dear Editor,

Transposable elements (TEs) constitute the largest portion of repetitive sequences in many eukaryotic genomes, with long terminal repeat retrotransposons (LTR-RTs) being predominant in plant genomes. Various tools have been developed for the identification and classification of TEs, including RepeatModeler [1], REPET [2], LTR_retriever (https://github.com/oushujun/LTR_retriever), and TERL (https://github.com/muriloHoracio/TERL). To our knowledge, most existing software can classify TEs only to the superfamily level, in particular the LTR-RT Copia and Gypsy superfamilies in plants, leaving a significant knowledge gap. Moreover, although approaches for automated classification of LTR-RT lineages using amino acid hidden Markov models (HMMs) do exist, they are typically collections of scripts that are neither curated nor designed to be user-friendly.