Article|02 May 2020|OPEN
A high-quality reference genome of wild Cannabis sativa
Shan Gao1, Baishi Wang1, Shanshan Xie2, Xiaoyu Xu1, Jin Zhang1, Li Pei1, Yongyi Yu2, Weifei Yang2 and Ying Zhang1
1Institute of Forensic Science, Ministry of Public Security, No. 17 South Muxidi Lane, Xicheng District, Beijing 100038, China
2Beijing Century Legend Bioscience Co., Ltd., Beijing 102300, China

Horticulture Research 7, Article number: 73 (2020)
doi: https://doi.org/10.1038/s41438-020-0295-3

Received: 07 Jan 2020
Revised: 19 Mar 2020
Accepted: 19 Mar 2020
Published online: 02 May 2020

Abstract

Cannabis sativa is a well-known plant species of great economic and ecological significance. An incomplete genome of cloned C. sativa was assembled with SOAPdenovo software in 2011. To further explore the utilization of this plant resource, we generated an updated draft genome sequence for wild-type varieties of C. sativa in China using PacBio single-molecule sequencing and Hi-C technology. Our assembled genome is approximately 808 Mb, with scaffold and contig N50 sizes of 83.00 Mb and 513.57 kb, respectively. Repetitive elements account for 74.75% of the genome. A total of 38,828 protein-coding genes were identified, 98.20% of which were functionally annotated. We provide the first comprehensive de novo genome of wild-type varieties of C. sativa distributed in Tibet, China. Due to long-term growth in the wild environment, these varieties exhibit higher heterozygosity and contain more genetic information. This genetic resource is of great value for future investigations of cannabinoid metabolic pathways and will aid in promoting the commercial production of C. sativa and the effective utilization of cannabinoids. The assembled genome is also a valuable resource for further in-depth investigation of the C. sativa genome.
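The scaffold and contig N50 values quoted in the abstract are standard assembly contiguity statistics: N50 is the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembly length. The following is a minimal sketch of that common definition, not the authors' pipeline. The function name `n50` and the toy contig lengths are assumptions made for illustration and are not taken from the paper.

```python
from typing import Iterable


def n50(lengths: Iterable[int]) -> int:
    """Return the N50 of a collection of sequence lengths.

    Sequences are visited from longest to shortest; the N50 is the
    length of the sequence at which the running total first reaches
    half of the summed length of all sequences.
    """
    sorted_lengths = sorted(lengths, reverse=True)
    if not sorted_lengths:
        raise ValueError("need at least one sequence length")

    half_total = sum(sorted_lengths) / 2
    running = 0
    for length in sorted_lengths:
        running += length
        if running >= half_total:
            return length

    # Not reachable: the full running total always reaches half_total.
    raise AssertionError("unreachable")


# Toy contig lengths in bp (hypothetical values, not data from the paper).
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_contigs))  # total = 300, half = 150; 80 + 70 reaches 150 -> 70
```

Applied to the lengths of all contigs or all scaffolds in an assembly, the same calculation gives the kind of figure reported above. A larger N50 indicates that more of the assembly sits in long, uninterrupted sequences.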