Functional genome of Spondias tuberosa Arr. Cam. (umbu)
Molecular biology. Functional annotation. Genetics.
Spondias tuberosa Arr. Cam. (umbu) is a tree species of northeastern Brazil, known as an
alternative economic resource in the Brazilian semi-arid region. To provide a functional
annotation of the genome of the species, the present study aimed to
report the number of genes involved in biological processes, cellular components and
molecular functions, as well as the enzyme codes and metabolic pathways. To this end, the
DNA was sequenced with PacBio long reads and assembled. After repetitive
structures were detected with GeneMask, structural annotation was performed with ab initio
gene predictors (Augustus and GeneMark) and an artificial intelligence tool (Helixer), and
the quality of the genome assembly was assessed with BUSCO. Functional annotation was based on
data generated by the Blast2GO platform: genes were annotated with Gene Ontology (GO)
categories, as well as enzymes and metabolic pathways. BUSCO's evaluation of the assembly
generated by Helixer indicated 98% completeness. The predicted genes were
distributed across a total of 51,896 GO terms in three main categories: molecular function,
biological process and cellular component. Enzymes were divided into 7 categories and
metabolic pathways into 5 categories. Within the biological process category, the stress
response accounted for the largest number of genes in the genome of S. tuberosa. The
number of genes involved in the response to stress was also greater than in other species
of the same family.
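
As an illustration of how GO term counts such as those above can be tallied by category, the following is a minimal Python sketch. The record layout (gene identifier, GO identifier, category) and the function name count_go_terms are assumptions made for this example; they do not reproduce the Blast2GO export format or the procedure used in the study, and the records are toy data, not results.

```python
from collections import Counter

# The three top-level GO categories named in the abstract.
GO_CATEGORIES = ("molecular function", "biological process", "cellular component")


def count_go_terms(annotations):
    """Count GO term assignments per category.

    `annotations` is an iterable of (gene_id, go_id, category) tuples.
    This layout is hypothetical, chosen only for the illustration.
    """
    counts = Counter({category: 0 for category in GO_CATEGORIES})
    for _gene_id, _go_id, category in annotations:
        if category not in GO_CATEGORIES:
            raise ValueError(f"unknown GO category: {category!r}")
        counts[category] += 1
    return dict(counts)


# Toy records for illustration only; these are not data from the study.
toy = [
    ("gene1", "GO:0006950", "biological process"),  # response to stress
    ("gene1", "GO:0005524", "molecular function"),  # ATP binding
    ("gene2", "GO:0006950", "biological process"),
    ("gene3", "GO:0005634", "cellular component"),  # nucleus
]
summary = count_go_terms(toy)
```

Summing the values of the returned dictionary gives the total number of GO term assignments, which is the kind of figure reported above as a total across the three categories.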