Full-length transcriptome analysis of pecan (Carya illinoinensis) kernels

Abstract
Pecan is an important nut crop worldwide and is rich in bioactive components such as fatty acids (FAs) and flavonoids. The molecular mechanisms of phytochemical biosynthesis in pecan are therefore a focus of research. A draft genome and several transcriptomes have recently been published. However, full-length mRNA transcripts have not yet been characterized, and the regulatory mechanisms underlying the biosynthesis and accumulation of quality-related components have not been fully investigated. In this study, single-molecule long-read sequencing was used to obtain full-length transcripts of pecan kernels. In total, 37,504 isoforms of 16,702 genes were mapped to the reference genome. The numbers of known isoforms, new isoforms, and novel isoforms were 9013 (24.03%), 26,080 (69.54%), and 2411 (6.43%), respectively. Over 80% of the transcripts (30,751, 81.99%) had functional annotations. A total of 15,465 alternative splicing (AS) events and 65,761 alternative polyadenylation events were detected; among the AS events, retained intron was the predominant type (5652, 36.55%). Furthermore, 1894 long noncoding RNAs and 1643 transcription factors were predicted using bioinformatics methods. Finally, the structural genes associated with FA and flavonoid biosynthesis were characterized. A high frequency of AS occurrence (70.31%) was observed in FA synthesis-associated genes. This study provides a full-length transcriptome data set of pecan kernels, which will significantly enhance the understanding of the regulatory basis of phytochemical biosynthesis during pecan kernel maturation.
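The isoform category shares reported in the abstract can be recomputed from the raw counts. The following is a minimal Python sketch of that arithmetic; the function name `shares` and the dictionary labels are illustrative assumptions and are not part of the study's analysis pipeline.

```python
def shares(counts):
    """Return each category's percentage of the total, rounded to 2 decimals."""
    total = sum(counts.values())
    return {name: round(100 * n / total, 2) for name, n in counts.items()}


# Isoform counts as stated in the abstract (labels are illustrative).
isoform_counts = {"known": 9013, "new": 26080, "novel": 2411}

# The three categories should add up to the 37,504 mapped isoforms.
total_isoforms = sum(isoform_counts.values())

isoform_shares = shares(isoform_counts)
for name, pct in isoform_shares.items():
    print(f"{name}: {pct:.2f}% of {total_isoforms} isoforms")
```

The same helper could be applied to any other count breakdown in the text, for example the retained-intron events as a fraction of all AS events.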
Funding Information
  • Fundamental Research Funds of CAF (CAFYBB2018SY013)
  • Fundamental Research Funds (CAFYBB2017ZA004-8)