Comparative RNA-Seq analysis to understand anthocyanin biosynthesis and regulations in Curcuma alismatifolia

Open Access | May 2022

Figures & Tables

Figure 1

RNA-Seq data expression profiles in C. alismatifolia colour bracts. (A) Phenotypes of ‘Dutch Red’ at three typical development stages and ‘Chocolate’ at the blossomed stage. (B) Numbers of transcripts in the four bract samples. (C) Principal component analysis of the RNA-Seq data. (D) Venn diagram of the RNA-Seq data from the four bract samples. BF, blossomed flowering; HF, half-flowering; FPKM, fragments per kilobase of transcript per million mapped reads.
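The FPKM unit defined in the caption can be illustrated with a short calculation. The following is a minimal Python sketch of the standard FPKM formula; the function name and the example counts are hypothetical and are not taken from the study.

```python
def fpkm(fragments: float, transcript_length_bp: float, total_mapped_fragments: float) -> float:
    """Fragments per kilobase of transcript per million mapped fragments."""
    if transcript_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("transcript length and library size must be positive")
    # fragments / (length in kb) / (library size in millions)
    return fragments * 1e9 / (transcript_length_bp * total_mapped_fragments)


# Hypothetical values, for illustration only.
example = fpkm(fragments=500, transcript_length_bp=2_000, total_mapped_fragments=20_000_000)
print(example)  # 12.5
```

Dividing by transcript length and by library size makes values comparable across transcripts of different lengths and across libraries of different sequencing depth.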

Figure 2

(A) Distributions and (B) KEGG analysis of DEGs identified from pairwise comparisons between developmental stages and between varieties. The FDR value is the p-value corrected for multiple hypothesis testing and lies in the range [0, 1]; the closer it is to 0, the more significant the enrichment. The rich factor is the ratio of the number of DEGs annotated to a pathway to the total number of annotated genes in that pathway; the greater the rich factor, the greater the degree of enrichment. BF, blossomed flowering; DEGs, differentially expressed genes; FDR, false-discovery rate; HF, half-flowering; KEGG, Kyoto Encyclopedia of Genes and Genomes.
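The two enrichment statistics named in this caption can be sketched in a few lines of Python. The caption does not say which multiple-testing correction was used, so the Benjamini-Hochberg procedure below is an assumption chosen for illustration, and all function names and input numbers are hypothetical.

```python
def rich_factor(degs_in_pathway: int, annotated_genes_in_pathway: int) -> float:
    """Share of a pathway's annotated genes that are differentially expressed."""
    if annotated_genes_in_pathway <= 0:
        raise ValueError("pathway must contain at least one annotated gene")
    return degs_in_pathway / annotated_genes_in_pathway


def benjamini_hochberg(p_values: list[float]) -> list[float]:
    """Return Benjamini-Hochberg adjusted p-values in the input order.

    Assumption: the source only says the FDR is a corrected p-value;
    this particular procedure is used here as a common example.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that adjusted
    # values never increase as the raw p-value decreases.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical pathway: 12 DEGs out of 48 annotated genes.
print(rich_factor(12, 48))
print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
```

A pathway with a high rich factor and an adjusted p-value near 0 would sit among the most strongly enriched entries in a plot such as panel (B).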

Figure 3

Comparisons of DEGs. (A) Heatmap comparison of DEGs in the most enriched KEGG pathways during the three development stages. (B) Heatmap comparison of DEGs in the most enriched KEGG pathways between ‘Chocolate’ and ‘Dutch Red’. DEGs, differentially expressed genes; KEGG, Kyoto Encyclopedia of Genes and Genomes.

Figure 4

The anthocyanin biosynthesis process with its core metabolites and enzymes and the expression levels of core enzyme genes. Enzyme names and expression patterns are indicated at the side of each step. Colour boxes from left to right represent unigenes showing lower or higher expression level in the colour bracts of ‘Dutch Red’ of SF, HF, BF and YF, respectively. ANS, anthocyanidin synthase; BF, blossomed flowering; CHI, chalcone isomerase; CHS, chalcone synthase; C4H, cinnamate-4-hydroxylase; DFR, dihydroflavonol 4-reductase; F3H, flavanone-3-hydroxylase; F3′5′H, flavonoid 3′,5′-hydroxylase; HF, half-flowering; PAL, phenylalanine ammonia lyase.

Figure 5

Distribution and selection of key differentially expressed TFs associated with anthocyanin biosynthesis. (A) Distribution of unigenes among TF families. (B) Clustering heat maps of significantly enriched differentially expressed TFs, including MYB, bHLH and WD, during the three development stages and between the varieties ‘Chocolate’ and ‘Dutch Red’. TF, transcription factor.

Figure 6

qRT-PCR validation of 14 putative genes involved in anthocyanin biosynthesis and regulation. The histograms represent expression determined by qRT-PCR (left y-axis), while the lines represent expression determined by RNA-Seq in FPKM values (right y-axis). The x-axis in each chart shows SF, HF, BF and YF, in that order. For both qRT-PCR and RNA-Seq, each value is the mean of three biological replicates. BF, blossomed flowering; HF, half-flowering.

Summary of assembly results for ‘Dutch Red’ at three development stages and ‘Chocolate’ at the blossomed stage.

| Features | SF | HF | BF | YF |
|---|---|---|---|---|
| Total raw reads (Mb) | 45.24 | 48.46 | 52.98 | 45.47 |
| Total clean reads (Mb) | 44.47 | 47.55 | 52.10 | 44.63 |
| Total clean bases (Gb) | 6.59 | 7.05 | 7.75 | 6.61 |
| Clean reads Q20 (%) | 98.27 | 98.23 | 98.34 | 98.40 |
| Clean reads Q30 (%) | 94.63 | 94.56 | 94.83 | 95.01 |
| Clean reads (pair reads) | 22.24 | 23.78 | 26.05 | 22.31 |
| Mapped reads | 16,378,830 | 17,511,187 | 19,200,458 | 16,316,551 |
| Mapped ratio (%) | 73.66 | 73.66 | 73.69 | 73.13 |

| Features | Value |
|---|---|
| Total number of transcripts | 159,687 |
| Total number of unigenes | 69,453 |
| Total sequence base | 197,307,177 |
| Average length of transcripts | 1,236 |
| N50 value of transcripts | 1,830 |
| E90N50 value of transcripts | 1,921 |
| GC (%) | 42.49 |
| TransRate score | 0.30 |
| BUSCO score | 60.1% (3.5%) |
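Two of the statistics in this table are derived quantities, and a small Python sketch may help in reading them. The N50 function uses the usual definition and is run on hypothetical contig lengths. The mapped-ratio check uses the SF column and assumes that "Clean reads (pair reads)" is given in millions, which the table does not state.

```python
def n50(lengths: list[int]) -> int:
    """Length L such that sequences of length >= L hold at least half of all bases."""
    if not lengths:
        raise ValueError("no sequence lengths given")
    half_total = sum(lengths) / 2
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= half_total:
            return length
    raise AssertionError("unreachable for non-empty input")


def mapped_ratio_percent(mapped_reads: int, total_reads: float) -> float:
    """Mapped reads as a percentage of all reads offered to the aligner."""
    return 100 * mapped_reads / total_reads


# Hypothetical contig lengths, for illustration only.
print(n50([2, 2, 2, 3, 3, 4, 8, 8]))

# SF column of the table; the unit of the pair-read count (millions) is assumed.
print(round(mapped_ratio_percent(16_378_830, 22.24e6), 2))
```

Because the pair-read counts are rounded to two decimals, the recomputed ratio agrees with the tabulated 73.66% only to within rounding.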

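Differentially expressed genes in the MYB and WD40 transcription factor families.

| Unigene | Annotation | FPKM-SF | FPKM-HF | FPKM-BF | FPKM-YF | Regulation |
|---|---|---|---|---|---|---|
| DN13707_c0_g2 | WD-repeat protein | 306.8 | 283.4 | 244.7 | 226.5 | Whole flowering periods |
| DN14303_c2_g4 | WD-repeat protein | 90.3 | 97.0 | 97.2 | 101.5 | Whole flowering periods |
| DN17014_c0_g2 | MYB_superfamily | 613.5 | 1,695.1 | 531.6 | 718.1 | Whole flowering periods |
| DN20674_c0_g1 | WD-repeat protein | 121.4 | 131.1 | 136.2 | 67.3 | Whole flowering periods |
| DN15014_c1_g1 | MYB_superfamily | 66.3 | 27.2 | 13.4 | 11.8 | Early flowering periods |
| DN11279_c3_g3 | MYB_superfamily | 105.1 | 64.9 | 0.8 | 38.9 | Early flowering periods |
| DN19744_c1_g3 | MYB_superfamily | 160.6 | 132.0 | 45.9 | 53.1 | Early flowering periods |
| DN12166_c1_g8 | MYB_superfamily | 16.3 | 10.9 | 41.8 | 6.4 | Blossom periods in ‘Dutch Red’ |
| DN12860_c2_g5 | MYB_superfamily | 4.5 | 5.2 | 22.7 | 2.8 | Blossom periods in ‘Dutch Red’ |
| DN15887_c1_g3 | MYB_superfamily | 14.1 | 10.7 | 36.5 | 2.8 | Blossom periods in ‘Dutch Red’ |
| DN22635_c0_g1 | MYB_superfamily | 7.1 | 11.2 | 26.8 | 3.3 | Blossom periods in ‘Dutch Red’ |
| DN12661_c0_g1 | WD-repeat protein | 1.9 | 5.5 | 36.5 | 36.2 | Blossom periods |
| DN21970_c4_g5 | MYB_superfamily | 15.3 | 121.8 | 220.9 | 298.7 | Blossom periods |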

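Numbers of transcripts in the 12 sequenced libraries.

| seq_id | FPKM (<0.5) | FPKM (0.5–5) | FPKM (5–100) | FPKM (>100) | SUM (≥0.5) |
|---|---|---|---|---|---|
| SF_1 | 65,107 | 61,397 | 31,536 | 1,647 | 94,580 |
| SF_2 | 62,631 | 63,656 | 31,774 | 1,626 | 97,056 |
| SF_3 | 64,446 | 62,140 | 31,444 | 1,657 | 95,241 |
| HF_1 | 80,443 | 49,508 | 27,811 | 1,925 | 79,244 |
| HF_2 | 67,574 | 59,459 | 30,864 | 1,790 | 92,113 |
| HF_3 | 65,607 | 60,672 | 31,696 | 1,712 | 94,080 |
| BF_1 | 67,310 | 59,403 | 31,170 | 1,804 | 92,377 |
| BF_2 | 66,614 | 60,394 | 30,896 | 1,783 | 93,073 |
| BF_3 | 76,848 | 51,186 | 29,570 | 2,083 | 82,839 |
| YF_1 | 76,226 | 51,191 | 30,451 | 1,819 | 83,461 |
| YF_2 | 84,464 | 45,418 | 27,806 | 1,999 | 75,223 |
| YF_3 | 82,758 | 46,325 | 28,622 | 1,982 | 76,929 |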
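In each row, the final SUM column equals the three bins at or above an FPKM of 0.5 added together, for example 61,397 + 31,536 + 1,647 = 94,580 for SF_1. The Python sketch below shows one way such a binning could be computed. The function name is hypothetical, and the handling of values lying exactly on a bin edge (each edge is assigned to the higher bin) is an assumption, since the table does not specify it.

```python
from bisect import bisect_right

# Bin edges taken from the table header; edge handling is an assumption.
BIN_EDGES = [0.5, 5.0, 100.0]
BIN_LABELS = ["<0.5", "0.5-5", "5-100", ">100"]


def count_by_fpkm_bin(fpkm_values: list[float]) -> dict[str, int]:
    """Count transcripts per FPKM bin and total those with FPKM >= 0.5."""
    counts = dict.fromkeys(BIN_LABELS, 0)
    for value in fpkm_values:
        # bisect_right sends a value equal to an edge into the higher bin.
        counts[BIN_LABELS[bisect_right(BIN_EDGES, value)]] += 1
    counts["SUM (>=0.5)"] = sum(counts[label] for label in BIN_LABELS[1:])
    return counts


# Hypothetical FPKM values, for illustration only.
print(count_by_fpkm_bin([0.1, 0.5, 3.0, 5.0, 50.0, 150.0]))

# Consistency check on the SF_1 row of the table.
print(61_397 + 31_536 + 1_647 == 94_580)
```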

Primers used for qRT-PCR analysis.

| Genes | Sequences (5′-3′) | Product size |
|---|---|---|
| ANR 12990 | CAGCTGAAGCAGATGCAGAGCTCCGGATTCTCCGACATTA | 147 |
| ANR 13257 | TCACATCATCCTCACCGTGTAAGGGGCTGGAGGAGATTTA | 103 |
| CHI 10857 | ACGGAGTTCTTCCAGAGCAATTTCTGCATCATCTCCACCA | 165 |
| CHI 15898 | GCTCGAACACCTCCTTGAACCGACAAGTTCACGAGGATCA | 155 |
| CHS 16695 | TGAGGGAGAACCCGAATATGGCAGAAGACGAGGTGTGTGA | 161 |
| CHS 17923 | ATGGCCTGAAGATGGATCAGGCGTCCTCTTCATTCTCGAC | 186 |
| DFR 16691 | TGCCCGTTTAACTCATTTCCACTGTGGAGAAGGAGCAGGA | 101 |
| DFR 14193 | GTGGTGTTCACCTCCTCGATCGCTAGCTCTACCCCCTTCT | 189 |
| F3′5′H 8362 | CTCCTCCTCCGCTACCTTCTGGTACATGATGGGGCCATAG | 160 |
| F3′5′H 16064 | ATGTGGAGGTTGCAGAGCTTGCGTACGAAGGACAGGACAT | 80 |
| F3H 14172 | ACGTTGATTGGACCGAACATCTGCGAGATCTACACGGACA | 186 |
| F3H 22055 | GCCTCTTGCATAGCTTCACCGGTGATCCTGCCAACTCATT | 83 |
| F3H 22236 | AAGGAGAAGTACGCGTCCAATGTATTCCTCGTTCGCCTTC | 184 |
| MYB 11279 | TTTCTCCCATGCTGCTTTCTTCACAATGCCACGGTTAAGA | 160 |
| MYB 12860 | ACCCTATCGTCGAACCACTGTCGACCAAGAGAGCCAACTT | 174 |
| MYB 15887 | CATCTTCCTCCTTGGAGCTGTCTGGATCTGCGAGGAAAGT | 138 |
| MYB 18449 | TTCCACTTGGACTTGCACTGAAGAGCTTCACGGAGACGAA | 110 |
| MYB 12166 | GCCGTCCAAACACATCTTCTGATTTGGTCGAGCCAGAGAG | 85 |
| MYB 15014 | ATTTCTGCAGATGGCTTGCTATCAATTGGGCATCGAGAAG | 99 |
| MYB 17014 | CTACGGAGGAGGATGGATGAAAGTTGCCATACCCACAAGC | 93 |
| MYB 19744 | CGAAGAACACTGCACTGGAATATCCACTGCCTCCTCCATC | 190 |
| MYB 21970 | GCACCGGTAAGTGCGTTAATTTCGTTTCCATCCATCCATT | 159 |
| MYB 22635 | GTCGCAATAACCGACCATCTATCATCTGCAGCCTCTTCGT | 125 |
| UFGT 12119 | TCGGCGATCAAGGTAGATTCGGATGAAGCTGAGCAACTC | 137 |
| UFGT 13065 | CGGAGGTAGCCATCAACAATATCCCACATGGTGGTCTTGT | 143 |
| UFGT 13592 | CTGGAGCTAAGGAGCAAGGATACCAGGGGTAATGGGATGA | 118 |
| WD 12661 | CAGTAGCCATGGCAGCTACAATGGCGGATGAGTACCAAAG | 109 |
| WD 13707 | CCAAAGAAGTCCAGCTCCTGAATTTTTGGCAATGGCAGAG | 91 |
| WD 14303 | CAGGCGATGTGAAGAATTGAGCAACTGATGGGGGTAAAGA | |
| WD 20674 | TTCGTCCCTTTGAGTTTGCTATGGGCTGTAACACGTAGCC | |

RNA sequencing data and corresponding quality control.

| Sample | Raw reads | Raw bases | Clean reads | Clean bases | Error rate (%) | Q20 (%) | Q30 (%) | GC content (%) | Accession number |
|---|---|---|---|---|---|---|---|---|---|
| BF_1 | 54,170,030 | 8,179,674,530 | 53,271,940 | 7,920,660,194 | 0.0243 | 98.33 | 94.81 | 48.94 | SRR17783056 |
| BF_2 | 55,251,576 | 8,342,987,976 | 54,405,312 | 8,096,016,364 | 0.0241 | 98.41 | 94.99 | 48.87 | SRR17783055 |
| BF_3 | 49,533,034 | 7,479,488,134 | 48,632,006 | 7,241,216,477 | 0.0244 | 98.29 | 94.7 | 49.33 | SRR17783054 |
| HF_1 | 47,378,590 | 7,154,167,090 | 46,432,354 | 6,880,585,840 | 0.0248 | 98.14 | 94.35 | 50.69 | SRR17783059 |
| HF_2 | 47,325,446 | 7,146,142,346 | 46,485,250 | 6,893,639,691 | 0.0244 | 98.27 | 94.67 | 49.13 | SRR17783058 |
| HF_3 | 50,665,694 | 7,650,519,794 | 49,743,012 | 7,381,231,384 | 0.0244 | 98.28 | 94.67 | 48.55 | SRR17783057 |
| SF_1 | 43,599,698 | 6,583,554,398 | 42,859,044 | 6,355,814,159 | 0.0247 | 98.16 | 94.36 | 48.96 | SRR17783064 |
| SF_2 | 44,423,394 | 6,707,932,494 | 43,751,414 | 6,485,935,396 | 0.024 | 98.48 | 95.15 | 48.33 | SRR17783063 |
| SF_3 | 47,696,422 | 7,202,159,722 | 46,810,850 | 6,927,255,232 | 0.0247 | 98.16 | 94.38 | 48.97 | SRR17783060 |
| YF_1 | 45,005,912 | 6,795,892,712 | 44,227,366 | 6,543,730,785 | 0.0241 | 98.4 | 95 | 49.34 | SRR17783053 |
| YF_2 | 47,249,416 | 7,134,661,816 | 46,353,918 | 6,861,606,053 | 0.0241 | 98.41 | 95.05 | 50.28 | SRR17783062 |
| YF_3 | 44,148,530 | 6,666,428,030 | 43,297,006 | 6,412,165,282 | 0.0241 | 98.39 | 94.99 | 49.7 | SRR17783061 |
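For readers unfamiliar with the Q20 and Q30 columns, the following Python sketch shows the standard Phred relationship between a quality score and the probability of a wrong base call, and how the share of bases at or above a threshold can be counted. The function names and the per-base scores are hypothetical; the sketch does not reproduce the pipeline used in the study.

```python
def phred_to_error_probability(quality: float) -> float:
    """Phred scale: Q = -10 * log10(p), so p = 10 ** (-Q / 10)."""
    return 10 ** (-quality / 10)


def percent_at_or_above(qualities: list[int], threshold: int) -> float:
    """Percentage of bases whose quality score meets the threshold."""
    if not qualities:
        raise ValueError("no quality scores given")
    return 100 * sum(q >= threshold for q in qualities) / len(qualities)


# Q20 corresponds to a 1% error probability, Q30 to 0.1%.
print(phred_to_error_probability(20), phred_to_error_probability(30))

# Hypothetical per-base scores, for illustration only.
scores = [10, 20, 30, 40]
print(percent_at_or_above(scores, 20), percent_at_or_above(scores, 30))
```

Read this way, a Q30 value of about 95% means that roughly 95 in 100 bases carry an estimated error probability of 0.1% or less.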

Putative structural genes in anthocyanin biosynthesis identified from DEGs.

| Unigene | Annotation | FPKM-SF | FPKM-HF | FPKM-BF | Log2 Ratio (BF/SF) | FPKM-YF | Log2 Ratio (BF/YF) |
|---|---|---|---|---|---|---|---|
| DN17923_c1_g3 | CHS-like protein | 2,593.0 | 1,906.1 | 185.4 | −3.80 | 2,667.3 | −3.8 |
| DN16695_c0_g5 | CHS | 80.4 | 760.5 | 298.4 | 1.9 | 2,187.2 | −2.9 |
| DN10857_c0_g4 | Probable chalcone – flavonone isomerase 3 isoform X1 (CHI) | 501.8 | 344.7 | 38.0 | −3.7 | 561.0 | −3.9 |
| DN15898_c0_g1 | Chalcone – flavonone isomerase (CHI) | 152.8 | 139.5 | 20.0 | −2.9 | 237.0 | −3.6 |
| DN14172_c2_g1 | F3H | 976.4 | 886.8 | 232.7 | −2.1 | 1,440.4 | −2.6 |
| DN22055_c1_g3 | Flavonol synthase/F3H | 5.1 | 7.3 | 11.9 | 1.2 | 1.4 | 3.1 |
| DN22236_c1_g1 | Flavonol synthase/F3H | 296.2 | 125.1 | 3.0 | −6.6 | 257.7 | −6.4 |
| DN16064_c1_g1 | Flavonoid 3′,5′-hydroxylase 1-like (F3′5′H) | 1,422.9 | 1,389.6 | 1,138.9 | −0.3 | 64.0 | 4.2 |
| DN8362_c0_g1 | Flavonoid 3′,5′-hydroxylase 1-like (F3′5′H) | 6.6 | 14.1 | 2.3 | −1.5 | 20.0 | −3.1 |
| DN14193_c0_g5 | DFR | 7.5 | 8.8 | 2.4 | −1.6 | 18.5 | −2.9 |
| DN16691_c4_g1 | DFR | 67.8 | 259.7 | 117.8 | −0.8 | 570.8 | −2.3 |
| DN12990_c1_g6 | Anthocyanidin reductase (ANR) | 64.7 | 331.3 | 90.8 | 0.5 | 323.2 | −1.8 |
| DN13257_c0_g1 | Anthocyanidin reductase (ANR) | 27.8 | 249.5 | 85.3 | 1.6 | 277.8 | −1.7 |
| DN12119_c3_g2 | Anthocyanidin 3-O-glucosyltransferase 7-like flavonoid 3-O-glucosyltransferase (UFGT) | 191.5 | 1,222.9 | 58.1 | −1.7 | 250.3 | 0.38 |
| DN13065_c1_g1 | Anthocyanin 3′-O-beta-glucosyltransferase-like (UFGT) | 3.7 | 42.6 | 0.24 | −4.0 | 11.2 | −5.54 |
| DN13592_c1_g3 | Anthocyanidin 3-O-glucosyltransferase-like (UFGT) | 155.8 | 42.5 | 3.1 | −5.67 | 196.8 | −6.0 |
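The log2 ratio columns compare expression at the BF stage with SF and with YF. As a minimal Python sketch, assuming the ratios are plain log2 quotients of the tabulated mean FPKM values (the table does not say whether a differential-expression tool applied any further normalisation), the DN16695_c0_g5 row can be recomputed as follows. The function name is hypothetical.

```python
import math


def log2_ratio(numerator_fpkm: float, denominator_fpkm: float) -> float:
    """log2 fold change between two positive expression values."""
    if numerator_fpkm <= 0 or denominator_fpkm <= 0:
        raise ValueError("FPKM values must be positive to take a log ratio")
    return math.log2(numerator_fpkm / denominator_fpkm)


# DN16695_c0_g5 (CHS): FPKM-SF 80.4, FPKM-BF 298.4, FPKM-YF 2,187.2.
bf_vs_sf = log2_ratio(298.4, 80.4)
bf_vs_yf = log2_ratio(298.4, 2187.2)
print(round(bf_vs_sf, 1), round(bf_vs_yf, 1))
```

A positive value means higher expression at BF than in the comparison sample, and each unit corresponds to a two-fold change.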

Detailed annotation of differentially expressed MYBs and WD proteins.

| Unigene | Hit_name | Description | Identity (%) | Similarity (%) | KO | Paths | Swiss-Prot |
|---|---|---|---|---|---|---|---|
| DN13707_c0_g2 | XP_009415403.1 | WD-repeat protein | 88 | 95.3 | | | gi\|74698589\|sp\|Q9Y7K5.2\|YGI3_SCHPO |
| DN14303_c2_g4 | ONM38806.1 | WD-repeat protein | 76.8 | 85.4 | K10752 | | gi\|22096353\|sp\|O22607.3\|MSI4_ARATH |
| DN17014_c0_g2 | XP_009404081.1 | MYB_superfamily | 88 | 91.6 | | | gi\|115502385\|sp\|O80622.2\|EXP15_ARATH |
| DN20674_c0_g1 | XP_010931565.1 | WD-repeat protein | 79.7 | 91.5 | | | gi\|75318693\|sp\|O80775.2\|WDR55_ARATH |
| DN15014_c1_g1 | XP_009385043.1 | MYB_superfamily | 54.9 | 63.7 | K14491 | map04075 | gi\|75323583\|sp\|Q6H805.1\|ORR24_ORYSJ |
| DN11279_c3_g3 | XP_017696697.1 | MYB_superfamily | 95.6 | 97.8 | K09422 | | gi\|75317981\|sp\|O22264.1\|MYB12_ARATH |
| DN19744_c1_g3 | XP_009418439.1 | MYB_superfamily | 76.3 | 82.4 | | | gi\|75330977\|sp\|Q8S9H7.1\|DIV_ANTMA |
| DN12166_c1_g8 | XP_016205675.1 | MYB_superfamily | 86.2 | 92.6 | K09422 | | gi\|56749347\|sp\|Q8LPH6.1\|MYB86_ARATH |
| DN12860_c2_g5 | XP_023531843.1 | MYB_superfamily | 49.8 | 59.8 | K09422 | | gi\|75333993\|sp\|Q9FKL2.1\|MYB36_ARATH |
| DN15887_c1_g3 | XP_009408601.1 | MYB_superfamily | 44.7 | 55 | | | gi\|75338846\|sp\|Q9ZQ85.2\|EFM_ARATH |
| DN22635_c0_g1 | PSS31516.1 | MYB_superfamily | 66.9 | 74.4 | K09422 | | gi\|75335856\|sp\|Q9M2Y9.1\|RAX3_ARATH |
| DN12661_c0_g1 | XP_009389455.1 | WD-repeat protein | 72.9 | 78.9 | K14963 | | gi\|82232080\|sp\|Q5M786.1\|WDR5_XENTR |
| DN21970_c4_g5 | XP_009405310.1 | MYB_superfamily | 59.5 | 66.4 | | | gi\|75330977\|sp\|Q8S9H7.1\|DIV_ANTMA |

Functional annotation of transcripts and unigenes in databases.

| Database | Transcript number (proportion) | Unigene number (proportion) |
|---|---|---|
| NR | 104,303 (0.6532) | 34,189 (0.4923) |
| Swiss-Prot | 82,193 (0.5147) | 26,589 (0.3828) |
| Pfam | 73,262 (0.4588) | 24,101 (0.347) |
| COG | 25,088 (0.1571) | 7,537 (0.1085) |
| GO | 71,758 (0.4494) | 23,229 (0.3345) |
| KEGG | 47,453 (0.2972) | 14,838 (0.2136) |
| Total_anno | 105,993 (0.6638) | 35,183 (0.5066) |
| Total | 159,687 (1) | 69,453 (1) |
DOI: https://doi.org/10.2478/fhort-2022-0007 | Journal eISSN: 2083-5965 | Journal ISSN: 0867-1761
Language: English
Page range: 65 - 83
Submitted on: Nov 9, 2021 | Accepted on: Mar 13, 2022 | Published on: May 8, 2022
In partnership with: Paradigm Publishing Services
Publication frequency: 2 issues per year

© 2022 Yuan-Yuan Li, Xiao-Huang Chen, Hui-Wen Yu, Qi-Lin Tian, Luan-Mei Lu, published by Polish Society for Horticultural Sciences (PSHS)
This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 3.0 License.