Esophageal cancer methylation prognosis markers and application thereof
1. A marker combination for methylation-based prognosis of esophageal cancer, said marker combination comprising 2, 3 or 4 of the following methylation sites: cg02370667, cg23378365, cg06090867 and cg03244277.
2. The marker combination of claim 1 wherein the cg02370667 is more methylated in the patient, the cg23378365 is more methylated in the patient, the cg06090867 is more methylated in the patient and the cg03244277 is more methylated in the patient.
3. The marker combination of claim 1 wherein the esophageal cancer includes but is not limited to esophageal squamous carcinoma, esophageal adenocarcinoma, esophageal lymphoma, esophageal leiomyosarcoma, and esophageal metastatic cancer;
preferably, the esophageal cancer is esophageal squamous carcinoma.
4. The marker combination of claim 1 wherein the prognostic indicator includes objective remission rate, overall survival, progression free survival, time to disease progression, disease free survival, time to treatment failure, response rate, complete response, partial response.
5. The marker combination of claim 4 wherein the prognostic indicator is overall survival.
6. The marker combination of claim 5 wherein the prognostic indicator is overall survival within 100 months;
preferably, overall survival within 60 months;
more preferably, 1-year, 3-year or 5-year overall survival.
7. A kit for predicting the prognosis of a patient with esophageal cancer, said kit comprising a methylation detection reagent for detecting at least one methylation site in the combination of markers of claim 1.
8. The kit of claim 7, wherein the methylation detection reagent comprises a reagent and/or an apparatus used in any one or more of the following methylation detection methods, the methylation detection methods comprising: whole genome bisulfite sequencing, pyrosequencing, bisulfite sequencing, methylation specific polymerase chain reaction, bisulfite specific polymerase chain reaction, methylation sensitive restriction enzyme-PCR/Southern method, bisulfite-binding restriction enzyme method, digital polymerase chain reaction, restriction landmark genome scanning, CpG island microarray, methylation profiling, methylation chip detection.
9. The kit of claim 7, further comprising reagents for processing the sample;
preferably, the sample comprises: blood, cells, tissue, urine, saliva, semen, milk, cerebrospinal fluid, tears, sputum, mucus, lymph, cytosol, ascites, amniotic fluid.
10. Use of a methylation detection reagent for detecting at least one methylation site in a combination of markers according to claim 1 or a kit according to claim 7 for the manufacture of a product for predicting the prognosis of a patient with esophageal cancer.
Background
Esophageal cancer originates in the esophageal mucosal epithelium and is a common malignant tumor of the digestive system; worldwide it ranks sixth in cancer mortality and eighth in incidence (Liang et al, 2017; Siegel et al, 2016). It comprises two major histological subtypes, squamous cell carcinoma and adenocarcinoma (Rustgi and El-Serag, 2014). Approximately 80% of esophageal cancer cases occur in underdeveloped countries, and 60% of these cases occur in China (Ferlay et al, 2010). In recent years, esophageal adenocarcinoma (EAC) has been found mainly in Western countries, where its incidence is rising rapidly, while esophageal squamous cell carcinoma (ESCC) occurs mainly in some countries in East Asia and Africa (Brown et al, 2008; Zeng et al, 2016). ESCC accounts for about 80% of all esophageal cancers worldwide (Kamangar et al, 2006) and occurs mainly in China, with about 35 million people dying from this malignancy each year (Wang et al, 2014). As with many other tumors, early diagnosis of ESCC greatly improves patient prognosis. Because effective early diagnosis is lacking, most ESCC patients already have advanced disease at diagnosis, and more than half have distant metastasis, resulting in a five-year survival rate of less than 30% (Besharat et al, 2008; Enzinger and Mayer, 2003). The pathogenesis of ESCC has not yet been fully elucidated, which is a major bottleneck for its treatment.
Tumorigenesis is a highly complex pathological process that involves many factors and accumulates through multiple stages of change; it involves complex regulation at several levels, including genomic changes, epigenetic changes, transcriptomic changes and disordered signaling pathways. Various whole-genome and whole-exome sequencing studies have comprehensively summarized the genome-level changes in esophageal squamous cell carcinoma, while epigenetic studies have shown changes at multiple molecular levels, such as DNA methylation, histone acetylation and RNA editing. In addition, epidemiological studies suggest that the disease may be related to certain environmental factors and to lifestyle habits such as smoking and drinking (Islami et al, 2011; Morita et al, 2010; Toh et al, 2010; Zhang et al, 2012). Epigenetic changes are readily induced by environmental factors and in turn alter the genetic material, so they integrate the influence of both environment and heredity. In recent years, the role of epigenetic changes in the development of malignant tumors has received increasing attention. As in other tumors, the genome of esophageal squamous carcinoma contains abnormal local hypermethylated regions and extensive hypomethylated regions, and these epigenomic aberrations jointly drive the occurrence and development of esophageal squamous carcinoma, for example through epigenetic silencing of tumor suppressor genes, super-enhancer activation and RNA editing (Lin et al, 2018).
With the development of detection techniques, changes in DNA methylation in tumors are considered as potential targets for the development of effective diagnostic and prognostic biomarkers. Thus, there are currently a number of studies reporting on biomarkers based on DNA methylation, including diagnostic prediction of various tumors such as lung, colorectal and gastric cancers (Okugawa et al, 2015; Tahara and Arisawa, 2015; Walter et al, 2014), and some methylation markers have been commercialized.
Disclosure of Invention
In order to provide a more accurate and efficient marker for predicting the prognosis of esophageal cancer patients, the inventors collected medical records and samples from 91 patients, performed methylation sequencing analysis on cancer tissues and paracancerous tissues, constructed a prognostic model through univariate and multivariate Cox regression analysis, and validated the model using data from a public database.
In a first aspect, the present invention provides a combination of esophageal cancer methylation prognostic markers comprising 2, 3 or 4 of the following methylation sites: cg02370667, cg23378365, cg06090867 and cg03244277.
Specifically, cg02370667 is located at position 84029511 on chromosome 16; the gene at this site is NECAB2, and the site is classified as a CpG island in the methylation site classification.
cg23378365 is located at position 156696351 on chromosome 5; the gene at this site is CYFIP2, and the site is classified as S_Shelf (2-4 kbp downstream of a CpG island).
cg06090867 is located at position 20511782 on chromosome 1; the gene at this site is UBXN10, and the site is classified as N_Shore (0-2 kbp upstream of a CpG island).
cg03244277 is located at position 75310450 on chromosome 4; the gene at this site is AREG, and the site is classified as N_Shore (0-2 kbp upstream of a CpG island).
In particular, the higher the methylation level of cg02370667 in a patient, the worse the patient's prognosis.
The higher the methylation level of cg23378365 in a patient, the worse the patient's prognosis.
The higher the methylation level of cg06090867 in a patient, the worse the patient's prognosis.
The higher the methylation level of cg03244277 in a patient, the worse the patient's prognosis.
Preferably, the esophageal cancer includes, but is not limited to, esophageal squamous carcinoma, esophageal adenocarcinoma, esophageal lymphoma, esophageal leiomyosarcoma, and esophageal metastatic cancer.
Preferably, the esophageal cancer is esophageal squamous carcinoma.
Preferably, the combination may also include other methylation sites.
Preferably, the additional sites include sites located on structural genes and/or non-structural genes.
Preferably, the non-structural gene comprises a cis-acting element and/or a trans-acting element.
Preferably, the combination of esophageal cancer methylation prognosis markers can also be used in combination with other esophageal cancer prognosis markers.
Preferably, the indicators of prognosis include objective remission rate (ORR), overall survival (OS), progression-free survival (PFS), time to progression (TTP), disease-free survival (DFS), time to treatment failure (TTF), response rate (RR), complete response (CR) and partial response (PR).
Preferably, the indicator of prognosis is overall survival (OS).
Preferably, the indicator of prognosis is overall survival within 100 months; preferably, overall survival within 60 months; more preferably, overall survival within 50 months.
Preferably, the indicator of prognosis may also be overall survival within 1 to 5 years, specifically at 1 year, 2 years, 3 years, 4 years or 5 years.
In another aspect, the invention provides a kit for predicting the prognosis of esophageal cancer, wherein the kit comprises a methylation detection reagent for detecting at least one marker in the combination.
Preferably, the methylation detection reagent comprises a reagent used in any one or more of the following methylation detection methods: whole-genome bisulfite sequencing (WGBS), pyrosequencing, bisulfite sequencing, methylation-specific polymerase chain reaction (methylation-specific PCR), bisulfite-specific polymerase chain reaction, methylation-sensitive restriction enzyme-PCR/Southern method, combined bisulfite restriction analysis (COBRA), digital polymerase chain reaction, restriction landmark genome scanning, CpG island microarray, single-nucleotide primer extension (SNuPE), methylation profiling, and methylation chip detection.
Preferably, the kit further comprises reagents for processing the sample.
Preferably, the processing may comprise a step of extracting DNA and a step of converting cytosine to uracil.
Preferably, the reagent used in the step of converting cytosine to uracil is most commonly a bisulfite reagent.
Preferably, the bisulfite reagent includes a bisulfite buffer and a protection buffer.
Preferably, the bisulfite is selected from one or more of sodium bisulfite, sodium sulfite, ammonium bisulfite and ammonium sulfite. When DNA is treated with a bisulfite reagent, unmethylated cytosines are converted to uracil while methylated cytosines and other bases remain unchanged, so that methylated and unmethylated cytosines, for example in CpG dinucleotide sequences, can be distinguished.
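The conversion principle described above can be illustrated in silico. The following is a minimal Python sketch and not part of the kit: the function name `bisulfite_convert` and the 0-based `methylated_positions` argument are hypothetical names introduced here for illustration, and the sketch assumes that every unmethylated cytosine is converted.

```python
def bisulfite_convert(sequence, methylated_positions):
    """Simulate bisulfite treatment of a single DNA strand.

    Unmethylated cytosines (C) become uracil (U); methylated cytosines
    and all other bases are left unchanged. Positions are 0-based.
    Assumes complete conversion, which real reactions only approach.
    """
    methylated = set(methylated_positions)
    converted = []
    for index, base in enumerate(sequence.upper()):
        if base == "C" and index not in methylated:
            converted.append("U")  # unmethylated cytosine is converted
        else:
            converted.append(base)  # methylated C and other bases are kept
    return "".join(converted)


# Hypothetical fragment in which only the cytosine at index 1 is methylated.
treated = bisulfite_convert("ACGTCCG", methylated_positions={1})
```

In the treated strand, a cytosine that survives at a CpG position indicates methylation, whereas a uracil at that position indicates an unmethylated site.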
Preferably, the extraction reagent for extracting DNA may include a lysis buffer, a binding buffer, a washing buffer, and an elution buffer.
Preferably, the lysis buffer comprises a protein denaturant, a detergent, a pH buffer and a nuclease inhibitor.
Preferably, the binding buffer comprises a protein denaturant and a pH buffer.
Preferably, the detergent includes, but is not limited to, Tween20, IGEPAL CA-630, Triton X-100, NP-40, and SDS.
Preferably, the pH buffer comprises one or more of Tris, boric acid, phosphate, MES and HEPES.
Preferably, the nuclease inhibitor comprises one or more of EDTA, EGTA and DEPC.
Preferably, the sample comprises: blood, esophageal epithelial cells, tissue, urine, saliva, semen, milk, cerebrospinal fluid, tears, sputum, mucus, lymph, cytosol, ascites, amniotic fluid.
Preferably, the blood comprises plasma, serum or whole blood.
Preferably, the sample is a tissue.
Preferably, the kit can also comprise instruments and/or reagents required for detecting clinical indexes.
Preferably, the clinical index detection includes basic tests such as routine blood tests, liver and kidney function, blood glucose and electrolytes.
In another aspect, the invention provides use of a methylation detection reagent for detecting at least one methylation prognosis marker in the aforementioned combination, or of the aforementioned kit, in predicting the prognosis of esophageal cancer.
In another aspect the present invention provides a method of predicting the prognosis of esophageal cancer, said method comprising the step of detecting the degree of methylation of at least one methylation prognosis marker in the aforementioned combinations in a subject.
Preferably, the subject is a patient diagnosed with esophageal cancer, and the method can stratify the subject into high/low risk, representing the patient's prognostic status.
The term "subject" as used herein refers to any animal (e.g., a mammal), including but not limited to humans, non-human primates, rodents, etc., that will be the recipient of a particular treatment. In general, the terms "subject" and "patient" are used interchangeably herein when referring to a human subject.
Preferably, the subject is a human.
Preferably, the detection in the method is qualitative, quantitative or semi-quantitative.
Drawings
FIG. 1 shows the results of survival analysis for the model of the invention and for each methylation site in the collected patient samples and the TCGA dataset: A, analysis of the collected patient samples using the model of the invention; B, analysis of the database using the model of the invention; C, analysis of the collected patient samples using cg02370667; D, analysis of the collected patient samples using cg23378365; E, analysis of the collected patient samples using cg06090867; F, analysis of the collected patient samples using cg03244277.
FIG. 2 is a ROC plot using patient information to validate the model of the present invention.
FIG. 3 is a ROC plot using database information to validate a model of the present invention.
Detailed Description
The present invention is further described with reference to the following examples, which are illustrative only and do not limit the invention in any way. Any person skilled in the art may use the teachings disclosed above to modify the invention into equivalent embodiments. Any simple modification or equivalent change made to the following embodiments in accordance with the technical essence of the present invention, without departing from its technical spirit, falls within the scope of the present invention.
Example 1 Data collection and construction of prognostic models
Study subjects
From 2010 to 2014, we enrolled 91 patients with esophageal squamous carcinoma from the Cancer Hospital of the Chinese Academy of Medical Sciences and Zhejiang Cancer Hospital. Place of origin, ethnicity, sex, age at diagnosis, drinking, smoking, tumor location and clinical stage were obtained for all subjects from each patient's medical records. All patients in this study gave informed consent, and the ethical review committees of the Cancer Hospital of the Chinese Academy of Medical Sciences and Zhejiang Cancer Hospital approved the study.
Clinical staging of the ESCC patients followed the seventh edition of the AJCC standard. Smoking and drinking status were defined according to the following criteria: patients who smoked <1 cigarette per day for <1 year were considered non-smokers, and all others smokers; patients who drank at least 2 times per week for at least 1 year were considered drinkers, and all others non-drinkers. Survival follow-up was completed as follows: the last follow-up of the study subjects was in November 2018, and the total follow-up duration was 5 years.
The pathological type of each patient was determined from the clinical pathological diagnosis report. None of the study patients had received chemotherapy or radiotherapy before surgery. After esophageal cancer resection, we selected cancer tissue and paracancerous tissue (5 cm away from the tumor margin) from each patient for subsequent study. Samples were enrolled and screened following a strict procedure, and the distribution of the basic clinical data of the enrolled patients is shown in Table 1.
TABLE 1 Distribution of clinicopathological data of esophageal squamous carcinoma patients
Upper segment: 20-25 cm; middle segment: 25-30 cm; lower segment: 30-40 cm.
# Tumor TNM staging was assessed according to the AJCC esophageal cancer staging standard, seventh edition.
Tumor cell content identification
First, we retrieved the cancer and paracancerous tissues of the enrolled patients, which had been cryopreserved at -80 °C; then, before the tissues thawed, they were promptly embedded in frozen-section embedding medium and cryosectioned once the medium had set; next, H&E staining was performed according to the routine laboratory procedure, and the stained sections were mounted with neutral resin; finally, at least two pathologists assessed the cancer cell content, which had to satisfy the following two rules: (1) the cancer cell content of the cancer tissue is at least 70%, and (2) the paracancerous tissue contains no cancer cells.
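The two inclusion rules above can be written as a simple check. This is a minimal Python sketch under stated assumptions, not the procedure actually used: the function name `passes_tumor_content_check` is hypothetical, cell contents are expressed as fractions between 0 and 1, the pathologist requirement is read as at least two readings, and a tissue pair is accepted only if every pathologist's estimate satisfies the rule (the source does not say how disagreements were resolved).

```python
MIN_TUMOR_FRACTION = 0.70  # rule (1): cancer cell content of at least 70%


def passes_tumor_content_check(tumor_calls, adjacent_calls):
    """Return True if a cancer/paracancerous tissue pair meets both rules.

    tumor_calls and adjacent_calls hold the cancer-cell fractions (0.0-1.0)
    estimated by each pathologist for the two tissues.
    Assumption: all estimates must satisfy the corresponding rule.
    """
    if len(tumor_calls) < 2 or len(adjacent_calls) < 2:
        return False  # fewer than two independent readings
    tumor_ok = all(call >= MIN_TUMOR_FRACTION for call in tumor_calls)
    adjacent_ok = all(call == 0.0 for call in adjacent_calls)  # rule (2)
    return tumor_ok and adjacent_ok
```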
Methylation sequencing and analysis
DNA was extracted, and its suitability for subsequent DNA methylation detection was confirmed by NanoDrop 2000 measurement, Qubit measurement, gel electrophoresis and similar checks. After bisulfite conversion of the DNA samples, methylation detection was performed using the Illumina 450K methylation chip (Illumina HumanMethylation450 BeadChip).
The Illumina 450K methylation chip contains 485,512 methylation sites in total, covers 99% of coding genes, and also includes other genomic positions: (1) 96% of CpG islands; (2) sites outside CpG islands; (3) non-CpG sites present in stem cells; (4) sites that differ between normal tissue and various tumor tissues; (5) FANTOM4 promoters; (6) DNase hypersensitive sites; (7) miRNA promoter regions. The detection accuracy of the 450K chip has been independently verified by two research groups (Bibikova et al, 2011; Sandoval et al, 2011).
Data analysis
A Wilcoxon rank-sum test was used to compare DNA methylation levels between the matched cancer and paracancerous samples, identifying 35,577 differentially methylated sites (FDR < 0.05, Δβ > 0.2). To fully explore the value of DNA methylation markers in predicting the prognosis of esophageal squamous cell carcinoma, we built a strict statistical procedure for identifying prognostic markers among the differentially methylated sites:
(1) methylation sites significantly correlated with patient survival were screened by univariate Cox regression (P < 0.05);
(2) by multivariate Cox regression adjusted for age, sex, smoking, drinking and TNM stage, 4 methylation sites significantly associated with patient survival were further selected (P < 0.05); detailed information on the selected sites is shown in Table 2 below, and their coefficients, P values, hazard ratios and confidence intervals are shown in Table 3 below;
TABLE 2 Methylation site information from the multivariate Cox regression screen
TABLE 3 Coefficients, P values, hazard ratios (HR) and confidence intervals
Marker        Coefficient   P value      HR           Lower limit of 95% CI   Upper limit of 95% CI
cg06090867    0.56500274    0.03749288   1.75945261   1.03323766              2.99609046
cg03244277    0.81793722    0.00279055   2.26582113   1.32545928              3.87333315
cg23378365    0.58735626    0.03652733   1.79922543   1.03749569              3.12021745
cg02370667    0.78070752    0.00526101   2.18301624   1.26160139              3.77738956
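The columns of Table 3 are linked by standard Cox regression relationships, which can be checked numerically. The Python sketch below is illustrative only. The HR column is the exponential of the coefficient. The P value check additionally assumes that the reported intervals are Wald intervals of the form exp(coefficient ± z × SE), which the source does not state. The names `hazard_ratio` and `wald_p_value` are introduced here for illustration.

```python
import math

# (marker, coefficient, P value, HR, CI lower, CI upper), copied from Table 3.
TABLE_3 = [
    ("cg06090867", 0.56500274, 0.03749288, 1.75945261, 1.03323766, 2.99609046),
    ("cg03244277", 0.81793722, 0.00279055, 2.26582113, 1.32545928, 3.87333315),
    ("cg23378365", 0.58735626, 0.03652733, 1.79922543, 1.03749569, 3.12021745),
    ("cg02370667", 0.78070752, 0.00526101, 2.18301624, 1.26160139, 3.77738956),
]

Z_975 = 1.959964  # 97.5th percentile of the standard normal distribution


def hazard_ratio(coefficient):
    """Hazard ratio implied by a Cox regression coefficient."""
    return math.exp(coefficient)


def wald_p_value(coefficient, ci_lower, ci_upper):
    """Two-sided P value, assuming a Wald interval exp(coef +/- z * SE)."""
    standard_error = (math.log(ci_upper) - math.log(ci_lower)) / (2 * Z_975)
    z_score = coefficient / standard_error
    # 2 * (1 - Phi(|z|)) expressed with the complementary error function
    return math.erfc(abs(z_score) / math.sqrt(2))


for marker, coef, p_value, hr, lower, upper in TABLE_3:
    print(marker, round(hazard_ratio(coef), 6),
          round(wald_p_value(coef, lower, upper), 4))
```

All four hazard ratios are greater than 1, which is consistent with the statement that higher methylation at each site is associated with a worse prognosis.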
(3) the prognostic score of each patient was defined as the sum of the products of the methylation level at each site and the natural logarithm of the corresponding hazard ratio (HR), that is, the Cox regression coefficient (score = cg02370667 × 0.7807075 + cg23378365 × 0.5873563 + cg06090867 × 0.5650027 + cg03244277 × 0.8179372), and patients were stratified prognostically by this score;
(4) prognostic scores were calculated in our dataset and in the TCGA dataset (download website: http://gdac.broadinstitute.org/; 450K methylation chip data comprising 95 esophageal squamous carcinoma samples and 14 paracancerous samples in total), and survival analysis was performed; the results for the collected patient samples are shown in FIG. 1A and the results for the database are shown in FIG. 1B.
Furthermore, using the collected patient data, survival analysis was performed for each methylation site individually to explore whether each site could serve alone as a prognostic marker; the results for cg02370667 are shown in FIG. 1C, the results for cg23378365 in FIG. 1D, the results for cg06090867 in FIG. 1E, and the results for cg03244277 in FIG. 1F.
(5) The accuracy of the model of the invention in predicting survival at 1 year, 3 years and 5 years was then verified; the ROC curve obtained with the patient information is shown in FIG. 2, and the ROC curve obtained with the database information is shown in FIG. 3. These results indicate that the model provided by the invention can predict the prognosis of esophageal cancer patients.
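The scoring formula in step (3) can be sketched in Python as follows. This is a minimal illustration rather than the inventors' implementation. The coefficients are copied from the formula, and methylation levels are assumed to be beta values between 0 and 1. The source does not state which cutoff separates high-risk from low-risk patients, so the cohort median used by default here is an assumption. The function names and the patient values are hypothetical.

```python
import statistics

# Coefficients copied from the scoring formula in step (3).
COEFFICIENTS = {
    "cg02370667": 0.7807075,
    "cg23378365": 0.5873563,
    "cg06090867": 0.5650027,
    "cg03244277": 0.8179372,
}


def prognostic_score(beta_values):
    """Weighted sum of the methylation levels at the four sites."""
    return sum(COEFFICIENTS[site] * beta_values[site] for site in COEFFICIENTS)


def stratify(scores, cutoff=None):
    """Label each score as 'high' or 'low' risk.

    Assumption: the cohort median is used when no cutoff is supplied;
    the source does not state which cutoff was applied.
    """
    if cutoff is None:
        cutoff = statistics.median(scores)
    return ["high" if score > cutoff else "low" for score in scores]


# Hypothetical beta values for three patients (not real data).
patients = [
    {"cg02370667": 0.2, "cg23378365": 0.2, "cg06090867": 0.2, "cg03244277": 0.2},
    {"cg02370667": 0.5, "cg23378365": 0.5, "cg06090867": 0.5, "cg03244277": 0.5},
    {"cg02370667": 0.8, "cg23378365": 0.8, "cg06090867": 0.8, "cg03244277": 0.8},
]
scores = [prognostic_score(patient) for patient in patients]
risk_groups = stratify(scores)
```

Because every coefficient is positive, a higher methylation level at any of the four sites raises the score, so a higher score corresponds to a poorer expected prognosis.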