A prognostic predictor panel with DNA methylation biomarkers for early-stage lung adenocarcinoma in Asian and Caucasian populations
- I-Ying Kuo†1, 6,
- Jayu Jen†1, 6,
- Lien-Huei Hsu2, 3,
- Han-Shui Hsu4,
- Wu-Wei Lai5 and
- Yi-Ching Wang1, 6
© The Author(s). 2016
Received: 12 May 2016
Accepted: 18 July 2016
Published: 2 August 2016
The incidence of lung adenocarcinoma (LUAD) is increasing worldwide with different prognosis even in early-stage patients. We aimed to identify a prognostic panel with multiple DNA methylation biomarkers to predict survival in early-stage LUAD patients of different racial groups.
The methylation array, pyrosequencing methylation assay, Cox regression and Kaplan-Meier analyses were conducted to build the risk score equations of selected probes in a training cohort of 69 Asian LUAD patients. The risk score model was verified in another cohort of 299 Caucasian LUAD patients in The Cancer Genome Atlas (TCGA) database.
We performed a Cox regression analysis, in which the regression coefficients were obtained for eight probes corresponding to eight genes (AGTRL1, ALDH1A3, BDKRB1, CTSE, EFNA2, NFAM1, SEMA4A and TMEM129). The risk score was derived from the sum of each probe's methylation value multiplied by its corresponding coefficient. Patients with a risk score greater than the median value showed poorer overall survival compared with other patients (p = 0.007). The risk score also significantly identified patients with poor survival in the TCGA cohort (p = 0.036). A multivariate analysis further demonstrated that the association of the eight-probe panel with poor outcome in early-stage LUAD patients remained significant even after adjusting for different clinical variables including staging parameters (hazard ratio, 2.03; p = 0.039).
We established a proof-of-concept prognostic panel consisting of an eight-probe signature to predict survival of early-stage LUAD patients of Asian and Caucasian populations.
Lung cancer is the leading cause of cancer-related deaths, with an increasing incidence of the lung adenocarcinoma (LUAD) subtype worldwide. Prognosis may vary in patients with tumors of the same stage because cancer is characterized by genetic, epigenetic, and phenotypic changes that result in a tremendous variability in clinical behavior [2, 3]. Therefore, the development of additional molecular markers for survival prediction of LUAD is required.
DNA methylation, which usually occurs in CpG dinucleotides, is a major epigenetic modification in the mammalian genome [4–6]. High-throughput methylation arrays are now available to determine DNA methylation levels of thousands of CpG sites simultaneously [7–9]. This technology enables large-scale DNA methylation analysis to identify informative DNA methylation biomarkers in lung cancer [7, 10–16]. Many reports have demonstrated that each cancer subtype, such as lung adenocarcinoma and squamous cell carcinoma, has its own methylation signature [12, 13, 15, 16].
Therefore, in the current study we focused on the development of survival predictors in early-stage LUAD patients by performing genome-wide methylation analysis and a pyrosequencing quantitative methylation assay to select eight DNA methylation probes in a training cohort of 69 patients recruited at Taipei Veterans General Hospital (TVGH). We also included certain clinical parameters that are known to affect prognosis [2, 3, 17–19] along with the selected eight-probe panel in the Cox regression analysis. The relevance of our finding was validated in a cohort of 299 patients from The Cancer Genome Atlas (TCGA) project.
Patients and tissue samples
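Characteristics of the lung adenocarcinoma patients included in the current study
Characteristic | TVGH, N = 69 (100 %) | TCGA, N = 299 (100 %)
< 65 year-old | 25 (36.2 %) | 124 (41.5 %)
≥ 65 year-old | 44 (63.8 %) | 166 (55.5 %)
(label not preserved) | 13 (18.8 %) | 113 (37.8 %)
(label not preserved) | 42 (60.9 %) | 104 (34.8 %)
(label not preserved) | 4 (5.8 %) | 32 (10.7 %)
(label not preserved) | 10 (14.5 %) | 50 (16.7 %)
(label not preserved) | 16 (23.2 %) | 122 (40.8 %)
(label not preserved) | 51 (73.9 %) | 159 (53.2 %)
(label not preserved) | 2 (2.9 %) | 18 (6.0 %)
(label not preserved) | 56 (81.2 %) | 236 (78.9 %)
(label not preserved) | 12 (17.4 %) | 57 (19.1 %)
(label not preserved) | 69 (100 %) | 299 (100 %)
(label not preserved) | 0 (0.0 %) | 0 (0.0 %)
(label not preserved) | 60 (87.0 %) |
(label not preserved) | 8 (11.6 %) |
(label not preserved) | 1 (1.4 %) |
(label not preserved) | 51 (73.9 %) |
(label not preserved) | 15 (21.7 %) |
(label not preserved) | 62 (89.9 %) |
(label not preserved) | 6 (8.7 %) |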
Genomic DNA extraction and sodium bisulfite conversion
Genomic DNA from primary tumor tissue samples of 69 patients from TVGH was extracted using proteinase K digestion and phenol-chloroform extraction. A total of 1 μg genomic DNA was used for bisulfite conversion using the EpiTect Bisulfite kit (Qiagen, Duesseldorf, Germany) according to the manufacturer’s protocols.
The genome-wide methylation analysis platform
The Illumina Infinium HumanMethylation27 BeadChip (27,578 CpG dinucleotides for 14,495 genes) was used for DNA methylation detection according to the manufacturer’s manual. DNA methylation levels were reported as β-values by calculating the ratio of intensities between locus-specific methylated and unmethylated bead-bound probes. The β-value is a continuous variable, ranging from 0 (unmethylated) to 1 (fully methylated). The methylation array data can be viewed online under GEO accession number GSE83845.
To quantify cytosine methylation in individual CpG sites of candidate methylation probes identified by methylation array, bisulfite-converted DNA was analyzed using a pyrosequencing system (PyroMark Q24, Qiagen, Hilden, Germany). Specific pyrosequencing primers and PCR primers were designed for “target” CpG sites in the probes to be analyzed. Pyrosequencing was carried out in accordance with the manufacturer’s protocol (Qiagen). The target CpG sites were evaluated by converting the resulting pyrograms to numerical values for peak heights. Primer sequences are listed in Additional file 1: Table S1, and the genomic map of the detected CpG sites is shown in Additional file 1: Figure S1.
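As a minimal sketch of how a β-value can be derived from the two probe intensities, the snippet below uses the commonly documented Illumina convention, in which the methylated intensity is divided by the total intensity plus a small stabilizing offset (100 by default). The function name and the example intensities are illustrative assumptions and are not taken from this study.

```python
def beta_value(methylated, unmethylated, offset=100.0):
    """Return the beta-value for one CpG locus.

    Negative intensities (possible after background subtraction) are
    clamped to zero; `offset` keeps the ratio stable when both
    intensities are low.
    """
    m = max(methylated, 0.0)
    u = max(unmethylated, 0.0)
    return m / (m + u + offset)


# Hypothetical intensities, for illustration only.
print(round(beta_value(4500.0, 500.0), 3))
```

Because of the offset, the value never quite reaches 1 even for a fully methylated locus, which is one reason β-values are usually read as approximate methylation fractions.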
Data processing and statistical analysis
Receiver operating characteristic (ROC) curve analysis was performed to determine the accuracy of the established CpG panel [area under the curve (AUC), sensitivity, and specificity]. Univariate and multivariate Cox regression analyses were conducted to explore the relationship between patient survival and several explanatory variables and to estimate the hazard ratio (HR) and confidence interval (CI) for the risk of cancer death, using the Statistical Package for the Social Sciences version 17.0 (SPSS Inc., Chicago, IL, USA). Overall survival curves were calculated according to the Kaplan-Meier method. p < 0.05 was considered statistically significant.
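The analyses in this study were run in SPSS. Purely as an illustration of what the Kaplan-Meier method computes, the following sketch implements the product-limit estimator and reads off a median survival time. The function names and the toy follow-up data are assumptions made for this example.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time with at least one death.

    times  : follow-up time for each patient
    events : 1 if death was observed at that time, 0 if censored
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            survival *= 1.0 - deaths / n_at_risk
            curve.append((t, survival))
        # Deaths and censored patients both leave the risk set.
        n_at_risk -= removed
    return curve


def median_survival_time(curve):
    """First time at which the survival estimate drops to 0.5 or below."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None  # median not reached during follow-up


# Toy data (months), for illustration only.
curve = kaplan_meier([5, 8, 8, 12, 20], [1, 1, 0, 1, 0])
print(curve, median_survival_time(curve))
```

Patients censored at the same time as a death are still counted in the risk set for that time, which is the usual convention.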
Marker discovery in genome-scale DNA methylation dataset
Univariate Cox model for 34 probes in the training cohort of LUAD by pyrosequencing methylation assay
Sensitivity and specificity of the eight selected probes by ROC analysis
The risk score calculation and survival prediction of the eight-probe panel by Kaplan-Meier method
In the clinical validation phase, we first built the risk score for the eight selected methylation probes using the multivariate Cox regression analysis in the TVGH training cohort of 69 early-stage LUAD patients. The methylation value of each probe was weighted by its regression coefficient from this model. The risk score for each patient was derived from the sum of the methylation value of each probe multiplied by the corresponding coefficient, as in the following equation: risk score = AGTRL1 methylation value × (-0.015) + ALDH1A3 methylation value × (-0.023) + BDKRB1 methylation value × (-0.034) + CTSE methylation value × (0.022) + EFNA2 methylation value × (0.010) + NFAM1 methylation value × (-0.017) + SEMA4A methylation value × (-0.012) + TMEM129 methylation value × (-0.006). An example of the risk score calculation for two patients is shown in Additional file 1: Figure S2.
We further applied our risk score model to determine whether our finding could be validated in another cohort of 299 early-stage LUAD patients whose follow-up data were available in the TCGA project and whose methylation levels were also determined by the Infinium methylation array. Dichotomizing the risk score at its median value (0.47) classified the 299 TCGA patients into two groups (upper panel, Fig. 4b). Patients with a high risk score showed poorer survival, with a median survival time (MST) of 50.9 months (middle panel, Fig. 4b), and the difference was statistically significant (lower panel, Fig. 4b). These results indicated that the prognostic predictor panel consisting of the selected eight probes showed a strong predictive value in the TCGA validation cohort.
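The risk score equation above is a weighted sum, which can be sketched as follows. The coefficients are those reported for the TVGH training cohort; the patient methylation values are hypothetical, and their scale (assumed here to be a percentage, as is typical for pyrosequencing) must match the scale used when the coefficients were fitted.

```python
# Regression coefficients reported for the TVGH training cohort.
COEFFICIENTS = {
    "AGTRL1": -0.015,
    "ALDH1A3": -0.023,
    "BDKRB1": -0.034,
    "CTSE": 0.022,
    "EFNA2": 0.010,
    "NFAM1": -0.017,
    "SEMA4A": -0.012,
    "TMEM129": -0.006,
}


def risk_score(methylation):
    """Sum of each probe's methylation value times its coefficient.

    `methylation` maps gene name to methylation value; a missing probe
    raises KeyError rather than being silently treated as zero.
    """
    return sum(coef * methylation[gene] for gene, coef in COEFFICIENTS.items())


# Hypothetical patient, methylation values in percent (assumption).
patient = {
    "AGTRL1": 40.0, "ALDH1A3": 15.0, "BDKRB1": 60.0, "CTSE": 75.0,
    "EFNA2": 30.0, "NFAM1": 55.0, "SEMA4A": 20.0, "TMEM129": 10.0,
}
print(risk_score(patient))
```

Six of the eight coefficients are negative, so in this model higher methylation at those probes lowers the score, while higher methylation at CTSE and EFNA2 raises it.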
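The median split described above can be sketched in a few lines. The patient identifiers and scores are made up for illustration; following the text, only scores strictly greater than the median are assigned to the high-risk group.

```python
from statistics import median


def split_by_median(scores):
    """Assign each patient to 'high' or 'low' risk around the cohort median.

    scores : dict mapping patient ID to risk score
    Returns (cutoff, groups), where groups maps patient ID to a label.
    """
    cutoff = median(scores.values())
    groups = {
        pid: ("high" if score > cutoff else "low")
        for pid, score in scores.items()
    }
    return cutoff, groups


# Hypothetical scores, for illustration only.
cutoff, groups = split_by_median(
    {"P1": 0.20, "P2": 0.50, "P3": 0.47, "P4": 0.90, "P5": 0.10}
)
print(cutoff, groups)
```

Because the cutoff is recomputed from whichever cohort is passed in, the threshold is cohort-specific rather than a fixed universal value.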
Univariate and multivariate Cox regression analysis of the eight-probe panel
Table 3 Univariate and multivariate Cox regression analyses of risk factors for cancer-related death in early-stage LUAD patients. Columns: HR (95 % CI) for TVGH (N = 69) and TCGA (N = 299); rows include risk score < median and risk score > median (table values not preserved).
To further define the prognostic effects of the eight-probe panel in early-stage LUAD patients, univariate and multivariate Cox regression analyses were performed in the TCGA validation cohort of 299 early-stage LUAD patients. Univariate Cox regression analysis revealed that patients with a risk score greater than the median of the eight-probe panel had a poor outcome, with a relative risk of death of 1.66 (p = 0.038) (Table 3). However, the eight-probe panel showed only borderline significance in the multivariate analysis in the TCGA cohort.
The incidence of LUAD is increasing worldwide. Patients with the same stage of lung cancer may have different prognoses. Development of prognostic markers is especially important in patients with early-stage lung cancer, in whom clinical oncologists need selection factors to decide whether adjuvant therapy is necessary. In the present study, we developed a prognostic predictor panel for early-stage LUAD patients. This panel consists of eight DNA methylation probes corresponding to eight specific genes: AGTRL1, ALDH1A3, BDKRB1, CTSE, EFNA2, NFAM1, SEMA4A, and TMEM129. The risk score calculated using the eight-probe panel served as an independent prognostic biomarker in the Cox regression model and the multivariate analysis in our recruited patients. Therefore, the risk scores calculated from this eight-probe panel are valuable biomarkers for prognostic evaluation of early-stage LUAD patients and should be tested in other cohorts.
Recently, Heller et al. identified a total of 12 genes that were differentially methylated in tumors compared with surrounding tissues in stage I, II or III Caucasian non-small cell lung cancer patients. Among the 12 genes, only the methylation patterns of HOXA2 and HOXA10 were independent prognostic factors in lung squamous cell carcinoma patients. In addition, Esteller and associates used a methylation array to establish methylation profiles of stage I Caucasian non-small cell lung cancer and found that methylation of two or more of the genes HIST1H4F, PCDHGB6, NPBWR1, ALX1, and HOXA9 correlated with an increased risk of cancer recurrence. Interestingly, HOXA9 promoter methylation was associated with high risk in stage I LUAD patients of two independent cohorts in another study. To date, the studies that have attempted to find markers for clinical use have not included patients from different racial groups. In our study, the prognostic predictor panel comprising eight DNA methylation biomarkers was an independent risk factor for poor outcome in Asian LUAD patients. We further applied our risk score model to determine whether our finding could be validated in another cohort of 299 early-stage LUAD patients whose follow-up data were available in the TCGA project. The new coefficients and hazard ratios were defined according to the methylation values of the eight probes given in the TCGA database for these patients. The Kaplan-Meier overall survival analysis showed that TCGA patients with a risk score greater than the median value had a shorter MST compared with other patients (Fig. 4b). However, the result of the multivariate Cox regression was only close to significance in the Caucasian LUAD patients (Table 3). One of the limitations of the current TCGA analysis is that we were unable to acquire data on the treatment or surgery performed on the TCGA patients (Table 1).
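For readers less familiar with how a Cox coefficient maps to the reported quantities, the sketch below converts a coefficient and its standard error into a hazard ratio, a 95 % Wald confidence interval and a two-sided Wald p-value. The coefficient is chosen so that the hazard ratio equals the 1.66 quoted above; the standard error of 0.25 is a hypothetical value, not one reported in this study.

```python
import math


def hazard_ratio_summary(coef, std_err, z_crit=1.96):
    """Hazard ratio, Wald confidence interval and two-sided Wald p-value.

    coef    : Cox regression coefficient (log hazard ratio)
    std_err : standard error of the coefficient
    z_crit  : normal critical value (1.96 gives a 95 % interval)
    """
    hr = math.exp(coef)
    lower = math.exp(coef - z_crit * std_err)
    upper = math.exp(coef + z_crit * std_err)
    z = coef / std_err
    # Two-sided p-value under the standard normal: 2 * (1 - Phi(|z|)).
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return hr, (lower, upper), p_value


# log(1.66) reproduces the quoted hazard ratio; 0.25 is an assumed SE.
hr, ci, p = hazard_ratio_summary(math.log(1.66), 0.25)
print(round(hr, 2), tuple(round(x, 2) for x in ci), round(p, 3))
```

An interval whose lower bound stays above 1 corresponds to a p-value below 0.05, which is the significance threshold used in this study.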
We believe that these results could be improved after including data from more patients when they are available in TCGA dataset or by validating in other cohorts of Caucasian LUAD patients.
The identification of the eight probes that can predict the clinical outcome in patients may reveal causes of cancer development and tumorigenesis. For example, Angiotensin II receptor-like 1 (AGTRL1) and Bradykinin receptor B1 (BDKRB1) are G-protein-coupled receptors (GPCRs). GPCRs, which represent by far the largest family of cell-surface molecules involved in signal transduction, have recently emerged as crucial players in tumor growth and metastasis. AGTRL1 is the apelin receptor. Apelin is an angiogenic factor secreted by tumor cells to promote the formation of new vessels necessary for tumor growth. In addition, crosstalk between BDKRB1 and EGFR has been shown to maintain tumor growth in breast cancer. Aldehyde dehydrogenase 1 family, member A3 (ALDH1A3) is a retinoic acid biosynthesis enzyme and plays a major role in the detoxification of aldehydes generated by alcohol metabolism and lipid peroxidation. Promoter hypermethylation of ALDH1A3 has been reported to be a prognostic marker for lung cancer, gastric cancer, and invasive bladder cancer [27–30]. Cathepsin E (CTSE) prevents tumor growth and metastasis by catalyzing the proteolytic release of soluble TRAIL from the tumor cell surface. Ephrin A2 (EFNA2), which belongs to the ephrin family, regulates cell adhesion, motility, survival, proliferation, and differentiation. Semaphorin 4A (SEMA4A) suppresses endothelial cell migration and proliferation in vitro and angiogenesis in vivo mediated by vascular endothelial growth factor. Further characterization of the probes validated in our panel could help to dissect the mechanism of LUAD tumorigenesis and progression.
The advantages of our prognostic predictor panel are as follows. First, the methylation level of the eight probes could be analyzed by DNA methylation array or pyrosequencing in patients. Second, the stepwise multivariate Cox regression analysis, in which the coefficients were obtained for the selected eight probes, could generate the risk score equations specifically for the cohort of patients to be tested. Third, any newly recruited patients could be assigned into risk groups once the risk score equations are determined. Therefore, the prognostic predictor panel could be used to calculate the risk score not only in Asian but also in Caucasian LUAD patients. However, technical factors such as sample collection and preprocessing, as well as the experimental procedures of the DNA methylation array or pyrosequencing assay, need to be controlled to avoid batch effects. In addition, clinical variables such as adjuvant therapy and surgical methods may affect outcome prediction. Large-scale, multicenter and prospective studies are necessary to validate our risk score model in early-stage LUAD patients.
Our study provides a proof-of-concept prognostic prediction panel consisting of eight methylated probes that are closely associated with survival in early-stage LUAD patients. This prediction panel could be useful for stratifying patients according to the Cox model and risk score before further treatment, identifying early-stage LUAD patients who are in dire need of intensive care.
AGTRL1, Angiotensin II receptor-like 1; ALDH1A3, Aldehyde dehydrogenase 1 family member A3; BDKRB1, Bradykinin receptor B1; CI, confidence interval; CTSE, Cathepsin E; EFNA2, Ephrin A2; HR, hazard ratio; LUAD, lung adenocarcinoma; MST, median survival time; NFAM1, NFAT activating protein with ITAM motif 1; SEMA4A, Semaphorin 4A; Superpc, supervised principal components; TCGA, The Cancer Genome Atlas; TMEM129, Transmembrane protein 129; TVGH, Taipei Veterans General Hospital
The authors thank Ms. Ching-Hsi Lin, Mr. Chi-Huei Hsiung, and Mr. Chien-Hsun Lin for technical support.
This study was supported by Taiwan National Science Council (98-3112-B-006-014-CC1, 99-3112-B-006-013-CC1), Taiwan Ministry of Science and Technology (103-2627-B-006-007), and Taiwan Ministry of Health and Welfare (105-TDU-B-211-124-003).
Availability of data and materials
Data and materials related to this work are available upon request.
IYK performed the experiments. IYK and JJ did the data analysis in this study. LHH, HSH, WWL provided clinical samples. IYK, JJ and YCW wrote the paper. All authors read and approved the manuscript. YCW and WWL obtained funding.
The authors declare that they have no competing interests.
Consent for publication
All authors approve the manuscript for publication.
Ethics approval and consent to participate
Surgically resected LUAD patients were recruited from TVGH, after obtaining appropriate institutional review board permission (#98-03-18A) and informed consent from the patients.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
- Siegel R, Naishadham D, Jemal A. Cancer statistics, 2013. CA Cancer J Clin. 2013;63:11–30.
- Mok TS, Wu YL, Thongprasert S, Yang CH, Chu DT, Saijo N, et al. Gefitinib or carboplatin-paclitaxel in pulmonary adenocarcinoma. N Engl J Med. 2009;361:947–57.
- Rosell R, Carcereny E, Gervais R, Vergnenegre A, Massuti B, Felip E, et al. Erlotinib versus standard chemotherapy as first-line treatment for European patients with advanced EGFR mutation-positive non-small-cell lung cancer (EURTAC): a multicentre, open-label, randomised phase 3 trial. Lancet Oncol. 2012;13:239–46.
- Herman JG, Baylin SB. Gene silencing in cancer in association with promoter hypermethylation. N Engl J Med. 2003;349:2042–54.
- Belinsky SA. Gene-promoter hypermethylation as a biomarker in lung cancer. Nat Rev Cancer. 2004;4:707–17.
- Costello JF, Frühwald MC, Smiraglia DJ, Rush LJ, Robertson GP, Gao X, et al. Aberrant CpG-island methylation has non-random and tumour-type-specific patterns. Nat Genet. 2000;24:132–8.
- Bibikova M, Lin Z, Zhou L, Chudin E, Garcia EW, Wu B, et al. High-throughput DNA methylation profiling using universal bead arrays. Genome Res. 2006;16:383–93.
- Jones PA. At the tipping point for epigenetic therapies in cancer. J Clin Invest. 2014;124:14–6.
- Ma X, Wang YW, Zhang MQ, Gazdar AF. DNA methylation data analysis and its application to cancer research. Epigenomics. 2013;5:301–16.
- Dai Z, Lakshmanan RR, Zhu WG, Smiraglia DJ, Rush LJ, Frühwald MC, et al. Global methylation profiling of lung cancer identifies novel methylated genes. Neoplasia. 2001;3:314–23.
- Christensen BC, Marsit CJ, Houseman EA, Godleski JJ, Longacker JL, Zheng S, et al. Differentiation of lung adenocarcinoma, pleural mesothelioma, and nonmalignant pulmonary tissues using DNA methylation profiles. Cancer Res. 2009;69:6315–21.
- Son JW, Jeong KJ, Jean WS, Park SY, Jheon S, Cho HM, et al. Genome-wide combination profiling of DNA copy number and methylation for deciphering biomarkers in non-small cell lung cancer patients. Cancer Lett. 2011;311:29–37.
- Kwon YJ, Lee SJ, Koh JS, Kim SH, Lee HW, Kang MC, et al. Genome-wide analysis of DNA methylation and the gene expression change in lung cancer. J Thorac Oncol. 2012;7:20–33.
- Park JY, Kim D, Yang M, Park HY, Lee SH, Rincon M, et al. Gene silencing of SLC5A8 identified by genome-wide methylation profiling in lung cancer. Lung Cancer. 2013;79:198–204.
- Heller G, Babinsky VN, Ziegler B, Weinzierl M, Noll C, Altenberger C, et al. Genome-wide CpG island methylation analyses in non-small cell lung cancer patients. Carcinogenesis. 2013;34:513–21.
- Sandoval J, Mendez-Gonzalez J, Nadal E, Chen G, Carmona FJ, Sayols S, et al. A prognostic DNA methylation signature for stage I non-small-cell lung cancer. J Clin Oncol. 2013;31:4140–7.
- Vaissiere T, Hung RJ, Zaridze D, Moukeria A, Cuenin C, Fasolo V, et al. Quantitative analysis of DNA methylation profiles in lung cancer identifies aberrant DNA methylation of specific genes and its association with gender and cancer risk factors. Cancer Res. 2009;69:243–52.
- Zhang R, Chu M, Zhao Y, Wu C, Guo H, Shi Y, et al. A genome-wide gene-environment interaction analysis for tobacco smoke and lung cancer susceptibility. Carcinogenesis. 2014;35:1528–35.
- Forrest LF, Adams J, White M, Rubin G. Factors associated with timeliness of post-primary care referral, diagnosis and treatment for lung cancer: population-based, data-linkage study. Br J Cancer. 2014;111:1843–51.
- Bair E, Tibshirani R. Semi-supervised methods to predict patient survival from gene expression data. PLoS Biol. 2004;2:E108.
- Vasiljević N, Wu K, Brentnall AR, Kim DC, Thorat MA, Kudahetti SC, et al. Absolute quantitation of DNA methylation of 28 candidate genes in prostate cancer using pyrosequencing. Dis Markers. 2011;30:151–61.
- Deppermann KM. Lung cancer screening--where we are in 2004 (take home messages). Lung Cancer. 2004;45:S39–42.
- Robles AI, Arai E, Mathé EA, Okayama H, Schetter AJ, Brown D, et al. An integrated prognostic classifier for stage I lung adenocarcinoma based on mRNA, microRNA, and DNA methylation biomarkers. J Thorac Oncol. 2015;10:1037–48.
- Dorsam RT, Gutkind JS. G-protein-coupled receptors and cancer. Nat Rev Cancer. 2007;7:79–94.
- Sorli SC, Le Gonidec S, Knibiehler B, Audigier Y. Apelin is a potent activator of tumour neoangiogenesis. Oncogene. 2007;26:7692–9.
- Molina L, Matus CE, Astroza A, Pavicic F, Tapia E, Toledo C, et al. Stimulation of the bradykinin B(1) receptor induces the proliferation of estrogen-sensitive breast cancer cells and activates the ERK1/2 signaling pathway. Breast Cancer Res Treat. 2009;118:499–510.
- Kim YJ, Yoon HY, Kim JS, Kang HW, Min BD, Kim SK, et al. HOXA9, ISL1 and ALDH1A3 methylation patterns as prognostic markers for nonmuscle invasive bladder cancer: Array-based DNA methylation and expression profiling. Int J Cancer. 2013;133:1135–42.
- Marcato P, Dean CA, Pan D, Araslanova R, Gillis M, Joshi M, et al. Aldehyde dehydrogenase activity of breast cancer stem cells is primarily due to isoform ALDH1A3 and its expression is predictive of metastasis. Stem Cells. 2011;29:32–45.
- Shames DS, Girard L, Gao B, Sato M, Lewis CM, Shivapurkar N, et al. A genome-wide screen for promoter methylation in lung cancer identifies novel methylation markers for multiple malignancies. PLoS Med. 2006;3:e486.
- Yamashita S, Tsujino Y, Moriguchi K, Tatematsu M, Ushijima T. Chemical genomic screening for methylation-silenced genes in gastric cancer cell lines using 5-aza-2'-deoxycytidine treatment and oligonucleotide microarray. Cancer Sci. 2006;97:64–71.
- Kawakubo T, Okamoto K, Iwata J, Shin M, Okamoto Y, Yasukochi A, et al. Cathepsin E prevents tumor growth and metastasis by catalyzing the proteolytic release of soluble TRAIL from tumor cell surface. Cancer Res. 2007;67:10869–78.
- Toyofuku T, Yabuki M, Kamei J, Kamei M, Makino N, Kumanogoh A, et al. Semaphorin-4A, an activator for T-cell-mediated immunity, suppresses angiogenesis via Plexin-D1. EMBO J. 2007;26:1373–84.