Whole-exome sequencing reveals rare genetic variations in ovarian granulosa cell tumor

Ovarian granulosa cell tumor (OGCT) is a rare ovarian tumor that accounts for about 2-5% of all ovarian tumors. Although OGCT is a low-grade tumor, high and late recurrences are common in OGCT patients. Even though this tumor usually occurs in adult women with high estrogen levels, the cause of OGCT is still unknown. To screen for genetic variants associated with OGCT, we collected normal and matched-tumor formalin-fixed paraffin-embedded (FFPE) tissues from 11 OGCT patients and performed whole-exome sequencing (WES) on the Illumina NovaSeq 6000. A total of 1,067,219 single nucleotide polymorphisms (SNPs) and 162,155 insertions/deletions (indels) were identified from the 11 pairs of samples. Of these, we identified 44 tumor-specific SNPs in 22 genes and four tumor-specific indels in one gene that were common to all 11 patients. We used three cancer databases (TCGA, COSMIC, and ICGC) to investigate genes associated with ovarian cancers. Nine genes (SEC22B, FEZ2, ANKRD36B, GYPA, MUC3A, PRSS3, NUTM2A, OR8U1, and KRTAP10-6) associated with ovarian cancers were found in all three databases. In addition, we identified seven rare variants with MAF ≤ 0.05 in two genes (PRSS3 and MUC3A). Of these seven rare variants, the five variants in MUC3A are potentially pathogenic. Furthermore, we conducted gene enrichment analysis of 417 genes with tumor-specific SNPs and 106 genes with tumor-specific indels using Cytoscape and Metascape. In the GO analysis, these genes were highly enriched in “selective autophagy” and “regulation of anoikis.” Taken together, we suggest that MUC3A is implicated in OGCT development and could be used as a potential biomarker for OGCT diagnosis.


INTRODUCTION
Ovarian granulosa cell tumor (OGCT) is a rare sex cord-stromal tumor that Rokitansky first described in 1855 [1]. It accounts for only 2-5% of all ovarian tumors and is estimated to occur in 0.6-1.0/100,000 women annually worldwide [2,3]. The incidence of OGCT is highest in postmenopausal women, especially between the ages of 50 and 55, and juvenile GCT occurs in <5% of pre-pubescent girls and women younger than 30 years of age [4,5]. Symptoms of OGCT include vaginal bleeding, abdominal pain, abdominal distension, menstrual abnormalities, and amenorrhea [2].
About 70-80% of OGCTs are diagnosed at stage I. The 10-year survival rate is 84-95% in stage I, decreasing to 50-65% in stage II and 17-33% in stages III and IV [6]. Although OGCT is a low-grade tumor, high and late recurrences are common in patients with OGCT [7]. Recurrence occurs in about 50% of patients, and 50-80% of patients with recurrence are known to die from it [8]. When recurrence occurs, the prognosis is poor, and conventional chemotherapy is not effective [9]. Although this tumor is known to occur mainly in adult women with high estrogen levels, the cause of OGCT is still unclear [10].
To date, several studies have been conducted to elucidate the pathogenesis and treatment of OGCT. Shah et al. reported that more than 95% of OGCT patients carry a FOXL2 c.402C>G point mutation (C134W); FOXL2 is a crucial transcription factor that regulates ovarian development and function [11][12][13]. Interestingly, whole-genome sequencing (WGS) revealed that the FOXL2 (c.402C>G) mutation is specific to OGCT and is not commonly found in other cancers [12]. FOXL2 regulates crucial signaling pathways in the ovary, such as TGF-β/BMP signaling, MAP-kinase signaling, steroid signaling, and PI3K/Akt signaling, which are involved in cell proliferation and apoptosis [14].

Gene set enrichment analysis
We performed a gene ontology (GO) enrichment analysis of the genes carrying tumor-specific variants to investigate the biological relevance of the candidate genes using Metascape software (https://metascape.org/gp/index.html). The significant gene sets were classified into three classes: biological process (BP), cellular component (CC), and molecular function (MF). In addition, to determine their cancer-related biological functions and associated pathways, we performed molecular and genetic interaction network analysis using Cytoscape software (cytoscape_v3.7.0 and ClueGO_v.2.5.7). We used the Benjamini and Hochberg (BH) adjustment to correct the p-values in ClueGO.
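The BH adjustment mentioned above can be illustrated with a minimal sketch. This is not the ClueGO implementation; the function name `benjamini_hochberg` and the example p-values are hypothetical and serve only to show how the step-up correction works.

```python
def benjamini_hochberg(p_values):
    """Return BH-adjusted p-values in the same order as the input."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        candidate = p_values[idx] * m / rank
        running_min = min(running_min, candidate)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw p-values for four GO terms (illustrative only).
raw = [0.01, 0.04, 0.03, 0.20]
adjusted = benjamini_hochberg(raw)
```

Each raw p-value is scaled by the number of tests divided by its rank, and the running minimum keeps a term with a smaller raw p-value from receiving a larger adjusted value than a term ranked above it.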

Ethics statement
The study protocol was approved by the Institutional Review Board of Chungnam National University Hospital and complied with the tenets of the Declaration of Helsinki (2016-12-056).

Subjects and WES
We recruited 11 patients from Chungnam National University Hospital. The mean age was 56 (range 27-78) years. The average tumor size was 10.00 cm (Table 1). We obtained 22 FFPE samples, comprising 11 normal and 11 matched-tumor tissues from OGCT patients, for WES. WES data were generated on the Illumina NovaSeq 6000 platform with 150 bp paired-end reads, yielding an average of 128 Gb of sequence. The post-alignment average read depth of the WES was 258X and 169.7X in tumor and normal samples, respectively. The proportion of bases at Q30 was 91.5% and 92.8% in tumor and normal tissues, respectively (Table S1). Complete supplementary data

A recent study by Alexiadis et al. reported a high frequency of the TERT g. -124C>T mutation in recurrent adult GCT [15]. TERT encodes the catalytic subunit of telomerase, which is involved in oncogenesis. Mutation of the TERT promoter is a prognostic biomarker for various cancers, including hepatocellular carcinoma, chondrosarcoma, and primary glioblastoma [16][17][18]. Despite efforts to understand the development and recurrence of OGCT, understanding of its pathogenesis is still insufficient. Here, we aimed to detect genetic variants involved in OGCT development in normal and matched-tumor tissues from 11 OGCT patients by whole-exome sequencing (WES).

Patient
A total of 11 OGCT patients from Chungnam National University Hospital were included in this study. The patients were 27-78 years old and were diagnosed individually at various times between 2011 and 2017. We collected normal and matched-tumor formalin-fixed paraffin-embedded (FFPE) tissues from the 11 OGCT patients. A tumor was collected from each of the 11 patients, with an average size of about 10 cm (range 3.3-22.5 cm) (Table 1). DNA was extracted from the 22 FFPE samples using the Maxwell 16 FFPE Plus LEV DNA purification kit (Promega, USA) according to the manufacturer's instructions. The paired normal tissue was the contralateral ovarian tissue from each patient. A pathologist then made a microscopic diagnosis of normal and tumor tissue using hematoxylin and eosin-stained biopsy slides.

WES and variant calling
Capture libraries were prepared with an Agilent SureSelect Target Enrichment Kit (Agilent, USA) following the manufacturer's protocols. The libraries were sequenced on an Illumina NovaSeq 6000 with 2 × 150 bp paired-end reads. Sequencing reads were then aligned to the human reference genome using the Burrows-Wheeler Alignment tool (BWA 0.7.12) with the -M parameter. Picard (picard-tools-1.130) was used to remove PCR duplicates, and the Genome Analysis Toolkit (GATKv3.4.0) was used for variant calling with the -T and -knownSites parameters. Here, we used only variants with a coverage depth of more than 30. Functional annotation was conducted using SnpEff (SnpEff_v4.1g) with default settings.
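The depth cutoff described above can be sketched as a simple filter over VCF records. This is an illustration only, not the pipeline used in the study: it assumes that depth is read from the `DP` key of the INFO column, which the text does not specify, and the function names and example records are hypothetical.

```python
def info_depth(vcf_line):
    """Return the DP value from the INFO column of a VCF record, or None."""
    fields = vcf_line.rstrip("\n").split("\t")
    # INFO is the eighth tab-separated column of a VCF data line.
    for entry in fields[7].split(";"):
        if entry.startswith("DP="):
            return int(entry[3:])
    return None


def filter_by_depth(vcf_lines, min_depth=30):
    """Keep header lines and records whose depth exceeds min_depth."""
    kept = []
    for line in vcf_lines:
        if line.startswith("#"):
            kept.append(line)  # header lines pass through unchanged
            continue
        depth = info_depth(line)
        if depth is not None and depth > min_depth:
            kept.append(line)
    return kept


# Hypothetical records (illustrative only, not study data).
records = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr1\t1000\t.\tA\tG\t50\tPASS\tDP=45;AF=0.5",
    "chr1\t2000\t.\tC\tT\t50\tPASS\tDP=12;AF=0.5",
    "chr2\t3000\t.\tG\tA\t50\tPASS\tAF=0.5;DP=30",
]
kept = filter_by_depth(records)
```

The comparison is strict (`>`), matching the "more than 30" wording; a record at exactly 30 is dropped in this sketch.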

Cancer databases
We used the Catalogue Of Somatic Mutations In Cancer (COSMIC), the International Cancer Genome Consortium (ICGC), and The Cancer Genome Atlas (TCGA) to find ovarian cancer-related genes. These three databases contain mutational signatures in the cancer genome.

Identification of OGCT related variants
A total of 1,067,219 single nucleotide polymorphisms (SNPs) and 162,155 insertions/deletions (indels) were identified from the WES data of 22 samples. To identify OGCT-related variants, we collected 29,998 SNPs and 3,437 indels shared by the 11 patients (Table 2; Table S2). In the variant calling step, we selected only variants with a coverage depth of at least 30 to eliminate possible errors in library preparation and sequencing data production and to identify substantial variants in OGCT. As a result, variants with a minimum depth of 31.7 and an average depth of 417 were selected. Among these, we identified 7,957 SNPs and 234 indels in exonic regions (Table 2). Of these, we identified 4,110 nonsynonymous variants, including missense, nonsense, and unknown variants, and 137 frameshift indels, including nonsense and unknown variants, that could affect protein

ZKSCAN3 is used as a potential prognostic marker in hepatocellular carcinoma patients. ZKSCAN3 increases the expression of ITGB4 (integrin β4) by binding to its promoter, thereby promoting migration, invasion, and EMP progress. ITGB4 activates the AKT signaling pathway involved in cell proliferation [26].

Furthermore, we investigated whether the 22 genes were related to OGCT using three cancer-related databases (TCGA, ICGC, and COSMIC) that contain cancer-associated genes across all cancer types. We found nine genes (SEC22B, FEZ2, ANKRD36B, GYPA, MUC3A, PRSS3, NUTM2A, OR8U1, and KRTAP10-6) in all three databases, which were highly associated with ovarian cancer (Figure 2). FEZ2 is a member of the FEZ protein family, which is involved in axonal growth in Caenorhabditis elegans. FEZ proteins are involved in neuronal development, neurological disorders, viral infection, and autophagy. FEZ1 is a tumor suppressor gene and is implicated in ovarian carcinogenesis. FEZ1 was evaluated as a prognostic and diagnostic marker for ovarian neoplasia [27].
For NUTM2A (NUT family member 2A), also known as FAM22A, the YWHAE-NUTM2A fusion transcript has been reported to be associated with aggressive endometrial stromal sarcomas [28]. SEC22B, a member of the SEC22 family of vesicle-trafficking proteins, is involved in membrane fusion during vesicle trafficking between the endoplasmic reticulum and Golgi apparatus, secretory autophagy, and antigen cross-presentation [29]. Several studies have reported that SEC22B is highly related to tumorigenesis and that mutations in SEC22B are found in various cancers. Interestingly, the SEC22B-NOTCH2 fusion activates the NOTCH pathway, promoting the proliferation and survival of tumor cells in aggressive breast cancers and mantle cell lymphoma [30][31][32].
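The selection logic in this section (variants present in tumor but not in matched normal tissue, shared by every patient, followed by a gene-level intersection across the three databases) can be expressed with set operations. The sketch below is an assumption about how such filtering could be written, not the code used in the study; all function names, variant keys, and gene names are hypothetical placeholders.

```python
def tumor_specific(tumor_variants, normal_variants):
    """Variants seen in the tumor but not in the matched normal sample."""
    return set(tumor_variants) - set(normal_variants)


def shared_across_patients(per_patient_sets):
    """Variants that are tumor-specific in every patient."""
    sets = [set(s) for s in per_patient_sets]
    if not sets:
        return set()
    return set.intersection(*sets)


def genes_in_all_databases(candidate_genes, databases):
    """Candidate genes that are listed in every database."""
    result = set(candidate_genes)
    for db_genes in databases.values():
        result &= set(db_genes)
    return result


# Hypothetical variants keyed by (chrom, pos, ref, alt); not study data.
patients = {
    "P01": {
        "tumor": {("chr7", 100, "C", "T"), ("chr9", 200, "G", "A"), ("chr1", 5, "A", "G")},
        "normal": {("chr1", 5, "A", "G")},
    },
    "P02": {
        "tumor": {("chr7", 100, "C", "T"), ("chr3", 300, "T", "C")},
        "normal": {("chr3", 300, "T", "C")},
    },
}
per_patient = [tumor_specific(p["tumor"], p["normal"]) for p in patients.values()]
shared = shared_across_patients(per_patient)

# Hypothetical database contents with placeholder gene names.
databases = {
    "TCGA": {"GENE_A", "GENE_B"},
    "COSMIC": {"GENE_A", "GENE_C"},
    "ICGC": {"GENE_A", "GENE_B"},
}
in_all = genes_in_all_databases({"GENE_A", "GENE_B", "GENE_D"}, databases)
```

Requiring a variant to be tumor-specific in every patient is a strict criterion; with 11 patients it explains why only a small number of variants remain from a very large starting set.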

Pathogenic variants in OGCT
To identify pathogenic variants in OGCT, we collected variants with MAF <0.05 using the 1,000 Genomes Project database, the NHLBI Exome Sequencing Project, and the ExAC database (http://exac.broadinstitute.org). Of the 16 variants in the nine genes found in the three cancer databases, and excluding unknown variants, we identified seven nonsynonymous SNPs with MAF ≤ 0.05: five SNPs (p.Thr343Ile, p.Met357Ile, p.Glu364Ala, p.Glu364Asp, and p.Ser366Thr) in MUC3A and two SNPs (p.Ser7Asn and p.Gly8Val) in PRSS3 (Table 4). Using the public database Allele Frequency Aggregator to investigate rare functional variants, we confirmed that the seven selected variants are infrequent in the Asian population, with an average MAF of 0.0092 (Table 4) [33].
PRSS3 is a member of the trypsin family of serine proteases. Serine proteases are secreted enzymes that promote tumor growth and metastatic progression in various cancers, including lung adenocarcinoma, prostate cancer, and pancreatic cancer [34,35]. Interestingly, the expression of PRSS3 was significantly increased in epithelial ovarian cancer tissue compared to normal ovarian samples at both the mRNA and protein levels [36]. MUC3A is a member of the membrane mucin gene family, which encodes secreted and membrane-bound epithelial glycoproteins. It has also been described as a potent modifier of epidermal growth factor receptor and is known to lead to poor prognosis through upregulated and downregulated expression of programmed cell death-ligand 1 in non-small cell lung cancer [37]. In addition, we used the SIFT and PolyPhen-2 programs to analyze the seven variants for potentially deleterious effects. Of these, the two SNPs (p.Ser7Asn and p.Gly8Val) in PRSS3 were predicted to be tolerated or benign by SIFT and PolyPhen-2 (Table 4). The five SNPs in MUC3A were predicted to be "unknown" by SIFT and PolyPhen-2 (Table 4).
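The rare-variant filter can be sketched as follows. This is an illustrative assumption rather than the study's procedure: it treats a variant as rare when every database that reports a frequency gives MAF ≤ 0.05, and leaves variants without any reported frequency unclassified. The function name, database keys, and frequencies are hypothetical.

```python
def is_rare(population_mafs, threshold=0.05):
    """True if at least one MAF is reported and all reported MAFs are <= threshold."""
    observed = [maf for maf in population_mafs.values() if maf is not None]
    return bool(observed) and max(observed) <= threshold


# Hypothetical frequencies per database (None = not reported); not study data.
variants = {
    "variant_1": {"1000G": 0.010, "ESP": None, "ExAC": 0.020},
    "variant_2": {"1000G": 0.120, "ESP": 0.090, "ExAC": 0.110},
    "variant_3": {"1000G": None, "ESP": None, "ExAC": None},
}
rare = sorted(name for name, mafs in variants.items() if is_rare(mafs))
```

Whether the cutoff is inclusive and how unreported variants are handled are design choices; this sketch uses an inclusive cutoff and excludes variants with no frequency data.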

DISCUSSION
High-throughput next-generation sequencing (NGS) has made genetic testing faster and cheaper. WES, one of the NGS technologies, is widely used to investigate the genetic variations and mechanisms of rare diseases and cancers. Although exons account for only about 2% of the human genome, they contain approximately 85% of the mutations with significant effects in Mendelian disorders [38,39].
In ovarian cancer, various factors increase risk, such as aging, obesity, hormone therapy after menopause, and smoking. However, the development and causes of OGCT are still unclear. OGCT accounts for about 5% of ovarian cancers, but the prognosis is poor due to a high recurrence rate of over 50%. Thus, early diagnosis and treatment based on genetic mutations are important.
Several studies have tried to understand the mechanism of OGCT occurrence using NGS. The FOXL2 mutation (C134W) was found using whole-transcriptome paired-end RNA sequencing and whole-genome sequencing [13,40]. FOXL2 is strongly expressed in granulosa cells as one of the earliest markers of ovarian differentiation. The FOXL2 C134W mutation is a loss-of-function mutation that is prevalent in adult OGCT patients. WES and targeted sequencing suggested that two TERT promoter mutations (C228T and C250T) might be biomarkers of OGCT [15]. These two mutations are involved in telomerase activation in several cancers, including central nervous system tumors, hepatocellular carcinomas, bladder cancers, and thyroid cancers. Notably, they are hot-spot mutations found in about 15.9% of ovarian clear cell carcinomas.
Mucins protect epithelial tissues against external environments under normal physiological conditions [41]. The mucins are a family of O-glycoproteins that play an important role in epithelial cell regeneration, cell adhesion, immune response, and cell signaling. Reduced expression levels of several mucin genes, including MUC3, MUC4, and MUC5B, in patients with Crohn's disease suggest a primary or early mucosal defect of these genes [42]. Chauhan et al. showed that MUC13 is more highly expressed in malignant ovarian tumors than in benign ovarian tumors [43]. MUC16 is overexpressed in epithelial ovarian cancer and is used as a biomarker (CA125) [44][45][46]. On the other hand, the expression of MUC3 and MUC4 was significantly reduced as the cancer stage increased [47]. MUC3A plays a role in the pathogenesis and progression of cancers [48]. Abnormal overexpression of MUC3A in clear-cell renal cell carcinoma (ccRCC) and in breast, pancreatic, gastric, colorectal, appendiceal, and prostate cancer is associated with poor prognosis [49][50][51][52]. Although abnormal expression of MUC3A is highly associated with a poor prognosis in many tumor types, the roles of MUC3A in cancer development are not yet clear. In addition, hypomethylation contributes to the expression of MUC3A in cancer cells [53]. The methylation status of MUC3A is also used as an epigenetic diagnostic marker for carcinogenic risk and prognosis in cancer patients.
Gene enrichment analysis showed that genes with tumor-specific variants are highly enriched in anoikis and autophagy pathways. Anoikis resistance represents a critical and distinguishing feature underlying the aggressiveness of ovarian cancer cells. Several studies reported that enhanced anoikis resistance is closely related to activation of the Src/Akt/Erk signaling pathway, which is critical for cellular processes including aggressiveness and tumorigenicity [54,55]. In addition, autophagy has been implicated in both tumor suppression and tumor growth, and it regulates oncogenic protein substrates and angiogenesis [56,57]. Autophagy can inhibit cancer by preventing angiogenesis in prostate, breast, and colon cancer cells. Cai et al. reported that a high rate of metabolism and autophagy is associated with increased anoikis resistance, and that blocking these metabolic pathways significantly increases anoikis and inhibits tumor development in vitro and in vivo [54].

CONCLUSION
In summary, we identified five rare variants with potentially deleterious effects in MUC3A through WES. Our findings suggest that MUC3A may contribute to OGCT development, although little is known about the functional role of MUC3A in cancer pathology. They also suggest that MUC3A may be used as a potential biomarker for OGCT. Further investigation with more tumor samples is required to understand the development of OGCT.