Genome wide association study of response to interval and continuous exercise training: the Predict-HIIT study

Background Low cardiorespiratory fitness (V̇O2peak) is highly associated with chronic disease and mortality from all causes. Whilst exercise training is recommended in health guidelines to improve V̇O2peak, there is considerable inter-individual variability in the V̇O2peak response to the same dose of exercise. Understanding how genetic factors contribute to V̇O2peak training response may improve personalisation of exercise programs. The aim of this study was to identify genetic variants that are associated with the magnitude of V̇O2peak response following exercise training. Methods Participant change in objectively measured V̇O2peak from 18 different interventions was obtained from a multi-centre study (Predict-HIIT). A genome-wide association study was completed (n = 507), and a polygenic predictor score (PPS) was developed using alleles from single nucleotide polymorphisms (SNPs) significantly associated (P < 1 × 10–5) with the magnitude of V̇O2peak response. Findings were tested in an independent validation study (n = 39) and compared to previous research. Results No variants at the genome-wide significance level were found after adjusting for key covariates (baseline V̇O2peak, individual study, principal components which were significantly associated with the trait). A Quantile–Quantile plot indicates there was minor inflation in the study. Twelve novel loci showed a trend of association with V̇O2peak response that reached suggestive significance (P < 1 × 10–5). The strongest association was found near the membrane associated guanylate kinase, WW and PDZ domain containing 2 (MAGI2) gene (rs6959961, P = 2.61 × 10–7). A PPS created from the 12 lead SNPs was unable to predict V̇O2peak response in a tenfold cross validation, or in an independent (n = 39) validation study (P > 0.1). Significant correlations were found for beta coefficients of variants in the Predict-HIIT (P < 1 × 10–4) and the validation study (P <  × 10–6), indicating that general effects of the loci exist, and that with a higher statistical power, more significant genetic associations may become apparent. Conclusions Ongoing research and validation of current and previous findings is needed to determine if genetics does play a large role in V̇O2peak response variance, and whether genomic predictors for V̇O2peak response trainability can inform evidence-based clinical practice. Trial registration Australian New Zealand Clinical Trials Registry (ANZCTR), Trial Id: ACTRN12618000501246, Date Registered: 06/04/2018, http://www.anzctr.org.au/Trial/Registration/TrialReview.aspx?id=374601&isReview=true.


Background
Cardiorespiratory fitness (CRF) is measured by peak oxygen uptake (VȮ 2 peak) during a graded exercise test, and is strongly associated with a reduced risk of cardiometabolic diseases and mortality [1]. Improving VȮ 2 peak can generally be achieved by regular endurance exercise training, in a dose-dependent manner [2]. Data typically supports the notion that a higher dose of exercise (volume and intensity) will elicit greater VȮ 2 peak gains [3][4][5][6][7]. Interval training, such as sprint interval training (SIT) and high-intensity interval training (HIIT) have shown comparable [8] and greater [9][10][11][12][13] group mean VȮ 2 peak changes, respectively, compared with moderate-intensity continuous training (MICT). However, there is considerable inter-individual variability in observed VȮ 2 peak improvements following apparently similar exercise training [7,14]. Identifying the genetic and environmental determinants that can predict exercise response may pave the way to personalised exercise programs that can maximise health outcomes.
An early genome wide association study (GWAS) using data from the HEalth, RIsk factors, exercise Training And GEnetics (HERITAGE) Family Study reported that 21 variants contributed to 49% of the variance in VȮ 2 peak response [15]. However, very few of these variants have been replicated in further testing or other studies suggesting that the variants identified in the HERITAGE study were overfitted to the specific population. In a recent systematic review, we identified 35 studies describing 15 cohorts that found 97 possible variants associated with VȮ 2 peak training response [16]. Only 13 genetic variants were replicated by more than two authors [15,[17][18][19][20][21][22][23][24][25], and none reached genome-wide significance. A lack of replication and significance in previous research is mostly likely due to underpowered studies that have predominantly been candidate-gene focused [26,27]. Furthermore, a comparator arm is necessary to discriminate true inter-individual variability from random and technical variability, yet very few studies included such a group, nor did they investigate or control for population stratification. This evidence to-date questions the validity of using currently available commercial genetic tests to prescribe exercise interventions.
Larger sample sizes are needed to build upon current research and to overcome random error in VȮ 2 peak measurement at the individual level. Greater collaboration between research centres using a discovery driven approach free from pre-existing bias is warranted [26]. VȮ 2 peak response between different population groups and training interventions along with assessing how individual factors modulate response, should also be explored [28]. The aim of this study was to use one of the largest cohorts to-date (multi-centre Predict-HIIT [7] study) to complete a GWAS to investigate genetic variants associated with VȮ 2 peak response following exercise training interventions. In addition, we attempted to replicate candidate variants from previous studies, and aimed to build and validate a genetic prediction model for VȮ 2 peak response (polygenic predictor score, PPS) based on the genetic data.

Discovery cohort-'Predict HIIT'
Predict-HIIT participant characteristics, recruiting and study intervention details have been previously outlined [7]. Ethical approval was obtained from the Bellberry ethical committee at the University of Queensland (#2016-02-062-A-1), and from all the institutions involved. Participant data was collated from 18 exercise training interventions across eight universities from three continents. As outlined in our previous paper [7], participant change in objectively measured VȮ 2 peak (indirect calorimetry from a graded exercise test to volitional fatigue on a treadmill or cycle ergometer) was obtained following high-volume HIIT (sessions contained ≥ 15 min of highintensity efforts in total, n = 225), low-volume HIIT/SIT (sessions contained < 15 min of high-intensity efforts in total, n = 76), or MICT (sessions contained 30 + minutes of continuous exercise at 64-76% maximum heart rate, n = 206). The characteristics of the 507 participants from predominantly European descent used in our GWAS are outlined in Table 1 (24% female, age 55.9 ± 16.9 years, 83% with pathologies and/or elderly). The deoxyribonucleic acid (DNA) extraction, preparation and genotyping are outlined below. These details varied based on the study site where the sample was collected, and whether DNA extraction and/or genotyping had already been completed prior to this study. Our quality control measures have limited bias associated with different DNA preparation, extraction methods and genotyping.

Validation cohort-'Improve-HIIT'
For replication of our results, we utilised the unpublished findings from an independent study recently performed in our laboratory (Improve-HIIT). The 'Improve-HIIT' study examined the response to highvolume HIIT by randomly allocating 40 sedentary (< 1 h of structured exercise each week) but apparently healthy Caucasian adults (age 18-50) to one of two groups: (i) 6 weeks of supervised high-volume HIIT (5 min warm up, 4 min 90-95% heart rate maximum followed by 3 min recovery repeated 4 times, 3×/ week) + prebiotic fibre (oligofructose-enriched inulin) supplementation (12 g/day) or (ii) 6 weeks of supervised high-volume HIIT (3×/week) + placebo (maltodextrin) supplementation (12 g/day). There was no difference in the average VȮ 2 peak response, or the inter-individual variability in VȮ 2 peak response between study groups; as such, this study was deemed appropriate for validating findings from the Predict-HIIT GWAS. Ethical approval was obtained from the Institutional Human Research Ethics Approval committee at the University of Queensland (#2018000398).
Each participant completed a series of tests and several measures were collated before and after the intervention. Tests relevant to this analysis included the completion of an incremental VȮ 2 peak test to exhaustion on a treadmill (Ramped Bruce Protocol) using indirect calorimetry (Parvo Medica True One 2400 System, Parvo Medics, Inc., Sandy, UT, USA) before and after the intervention period, and provision of a saliva sample for genetic analysis (Oragene DNA collection kit, DNA Genotek, Ontario, Canada).
Genotyping, imputation and quality control were completed with the same protocol as for the Predict-HIIT cohort. One sample was removed due to high missing genotyping rate, leaving 39 samples for further analysis. VȮ 2 peak response (post intervention VȮ 2 peak-pre intervention VȮ 2 peak) was calculated for each participant. Fibre/placebo supplement, age, sex, body fat percentage and baseline VȮ 2 peak were not correlated with response and were not included as covariates for analysis. Using PLINK, the top ranked loci (P < 1 × 10 −5 ) from the Predict-HIIT study were compared in the Improve-HIIT study. Lower ranking Table 1 Genome-wide association study participant characteristics. Mean ± standard deviation TEMs were slightly different for each training intervention and have been outlined in Table 3 Technical error of measurement (TEM) = multiplying mean VȮ 2peak value by a previously published coefficient of variation for VȮ 2peak of 5.6%, Minimal Clinically Important Difference (MCID) = 3.5 mL/kg/min, Polygenic Predictor Score (PPS). *Significant difference between high-volume HIIT & low-volume HIIT/SIT (P < 0.05), ** Significant difference between high-volume HIIT, MICT & low-volume HIIT/SIT (P < 0.05) loci (P < 1 × 10 −4 ) were also examined between cohorts (see Table 2 for study characteristics).

DNA preparation DNA extraction from whole blood
Genomic DNA from 58 whole blood samples [29] was extracted using a QIAamp DNA blood midi kit (Qiagen, Hilden, Germany) according to the manufacturer's instructions. The DNA samples were quantified using a Qubit fluorometer 3.0 and all samples were diluted to 100 ng/µL for genotyping.

DNA extraction from buffy coat
DNA from 93 buffy coats [30] was extracted using a QIAsymphony DSP DNA Mini Kit according to manufacturer's instructions [31]. The purified genomic DNA was stored at −20 degrees until genotyped.

Data quality control
Genotypes at individual SNPs from all cohorts were merged according to the manifest and plink files. Quality control was completed separately on individual cohorts, and included assessment of missingness by individual (threshold < 5%), missingness by genotype (threshold < 5%), Hardy-Weinberg equilibrium in controls (P < 1 × 10 −6 ), extreme heterozygosity (threshold > 3 standard deviations from mean) and identity by descent threshold at 0.2 of PI_HAT score (n = 13 excluded individuals). For each pair of related samples (PI_ HAT > threshold), the sample with the higher missingness rate was removed (n = 3 excluded individuals). Along with quantitative GWAS analysis, we further defined groups based on their relative change in VȮ 2 peak (mL/ kg/min) for additional comparisons. Samples were classified as a 'likely-responder' , 'likely non-responder/adverse responder' and 'uncertain responder' based on their relative change in VȮ 2 peak (mL/kg/min) following training. As outlined in Williams et al. [7], a likely responder achieved a VȮ 2peak response above one minimal clinically important difference (3.5 mL/kg/min) associated with a 10-25% improvement in survival over a 10-year period, plus one technical error of measurement (average baseline VȮ 2 peak multiplied by coefficient variation of 5.6%; calculated for each study). This high threshold for response was used to increase the confidence in the 'likely responder'/'likely non-responder' classification. The thresholds are provided in Table 3.
SNPs with Minor Allele Frequency (MAF) > 0.05 were then used to perform principal component analysis (PCA) for ethnicity identification using SHELLFISH [45]. Ethnic and ancestry outliers (more than 6 standard deviations from the mean on either of the two first principal components (PCs)) were excluded (n = 10). Then, data was imputed with the Haplotype Reference Consortium (HRC) reference panel 1.1 [45] using the Sanger imputation server. SNPs with low imputation quality (INFO score ≤ 0.6) were excluded from further analysis. In total, 26 samples were removed due to large ethnicity deviations from the group, leaving 507 samples for association testing (Table 1). Genomic inflation factor λ and quantile-quantile (Q-Q) plots were used to compare the genome-wide distribution of the test statistic with the expected null distribution. The genomic inflation factor λ is defined as the median of the observed chi-squared test statistic divided by the expected median of the corresponding chi-squared distribution. A λ close to 1 reflects no evidence of inflation, while values up to 1.10 are generally considered acceptable for a GWAS. Baseline VȮ 2 peak, the individual study and PC6 (the 6th principal components from the PCA analysis, which was significantly associated with the phenotype) were included as covariates.

Statistical analysis VȮ 2 peak response
Normality for VȮ 2 peak was assessed using the Shapiro-Wilk test. An analysis of variance was used to compare average group VȮ 2 peak response between training interventions (high-volume HIIT, MICT, low-volume HIIT/ SIT). Variability in response was measured by the range of responses for each intervention. A chi-squared test was used to compare the proportion of likely responders, likely non-responders and those participants classified as uncertain between training groups.

Association testing of independent VȮ 2peak responses
Similar to previous studies in this area [15] investigating polygenic phenotypes (i.e. VȮ 2 peak trainability), we used a quantitative approach rather than a case-control analysis to identify variants associated with VȮ 2 peak response. Association testing was conducted in PLINK [46], using a linear regression. Baseline VȮ 2 peak, the individual study and the PC6 from the principal component analysis were found to be significantly associated with VȮ 2 peak response and were included as covariates in analysis. Age and sex were not associated with the trait. Our findings did not change when age and sex were also included in the association analysis. Thus, we included covariates based on a posteriori instead of a priori knowledge. Association analyses of imputed SNPs was assessed with PLINK best-guess genotypes. Genome-wide significance was set at the standard GWAS threshold of P < 5 × 10 −8 and suggestive significance was set at P < 1 × 10 −5 . The single most significant SNP (the lead SNP) was used to represent each of the loci. The cluster plots of the genotyped lead SNP, or supported genotyped SNPs of imputed lead SNP, were checked manually to eliminate poor signals. An analysis of covariance was used to compare the average VȮ 2 peak response for each genotype of the top-ranked SNPs, including baseline VȮ 2 peak, the individual study and PC6 as covariates. Statistical analysis was completed using SPSS (version 23.0, SPSS Inc., Chicago, IL, USA).

Polygenic predictor score
A polygenic predictor score (PPS) was calculated for each participant using the beta coefficient of the selected SNPs. The PPS was an extension of the 'summary predictor score' outlined by Bouchard et al. [15] using data from the HERITAGE study. In our study, we sought to improve on this model to ensure the 'high response training alleles'/'effect' alleles were weighted by the effect size (beta coefficient) derived from our GWAS (see Eq. 1).
where i is the index of the SNP in k selected SNPs used to calculate the PPS. β i is the effect size (beta coefficient of linear regression) of SNPi in the PPS model, n i is the number of effect alleles of SNPi.
Scores were then added across k SNPs to yield a final PPS and a comparison was made between likely responders, likely non-responders and those deemed uncertain (as described earlier). To avoid over-training (inability of model to be generalised to new data), we did a tenfold cross-validation (using MultiBLUP [47]) with the discovery cohort samples and merged the results of 10 test folds for the analysis. The tenfold cross validation was to test the PPS model's ability to predict VȮ 2 peak response in new data not related to the development of the PPS model internally.

Replication of candidate loci
The 97 loci identified as candidate loci for VȮ 2 peak response in our recent systematic review [16] were analysed and compared with the top-ranking loci (α < 1 × 10 −5 ) from the Predict-HIIT study. Lead SNPs from all associated loci were used to calculate the PPS, as well as the 97 genetic variants found previously, were mapped to the nearest gene and submitted as a batch query to the ToppGene pathway analysis software [48]. Biological processes and pathways that appeared in both groups were selected. Genetic variants were also submitted to the GTEx Portal to identify if any SNPs were expressive quantitative trait loci (eQTL) [49].
No SNPs reached the typical threshold for genomewide significance (P < 5 × 10 -8 ). The Q-Q plot and a genomic inflation factor of 1.002 indicated there was very minor inflation in the study (i.e. population stratification or DNA sample quality), and minor overdispersions of test statistics when compared to the null distribution (Fig. 2). Twelve loci were associated with VȮ 2 peak response at suggestive significance (P < 1 × 10 -5 , Fig. 3 and Table 4). The most significant association was found for rs6959961 near the membrane associated guanylate kinase, WW and PDZ domain containing 2 (MAGI2) gene (P = 2.61 × 10 -7 ). Homozygotes for the response allele (TT, n = 93) had a 2.4 mL/kg/min greater (P = 2.8 × 10 -7 ) average VȮ 2 peak response than those homozygote for the non-response allele (CC, n = 152) and a 1.3 mL/ kg/min greater (P = 0.002) average VȮ 2 peak response than heterozygotes (TC, n = 262). The second most significant association (P = 2.75 × 10 -7 ) was found for rs730747755 near the Unc-80 Homolog, NALCN Channel Complex Subunit (UNC80) gene. Homozygotes for the response allele (AA, n = 66) had a 2.6 mL/kg/min greater (P = 1.2 × 10 -7 ) average VȮ 2 peak response than those homozygote for the non-response allele (GG, n = 229), and a 1.8 mL/kg/min greater (P = 2.5 × 10 -4 ) average VȮ 2 peak response than heterozygotes (AG, n = 212). A tenfold cross validation found the Pearson correlation coefficient between subject polygenic predictor score (PPS) and VȮ 2 peak response (likely responder, likely non responder or uncertain) was not significant (R 2 = 0.027, P-value = 0.76, see Fig. 4). Similarly, the PPS was not able to predict VȮ 2 peak training response in the validation (Improve-HIIT) cohort (R 2 = 0.001, P = 0.8). None of the 12 lead SNPs from our GWAS had a P-value < 0.05 in the Improve-HIIT study. Furthermore, from the 992 variants with a P-value < 1 × 10 -4 in the Predict-HIIT cohort, a correlation of beta coefficients in the discovery (Predict-HIIT) cohort and the Improve-HIIT cohort was found to be significant (R 2 = 0.156, P-value = 7.62 × 10 -7 ). This suggests these variants in the Improve-HIIT cohort have a significant similar trend of effect as they do in the Predict-HIIT cohort.
Whilst none of the 12 lead SNPs from our GWAS validated SNPs found in previous research, several of our lead 12 SNPs were found near genes that are in similar biological pathways and processes to predictor genes found in previous research (Table 5). Additionally, we were able to validate a number of SNPs from previous research at a nominal level (4 SNPs at P-value < 0.05, see Table 6). Furthermore, we found several SNPs to be eQTL in tissues that may influence training adaptations. For example, rs11647343, is an eQTL of zinc finger DHHC-type palmitoyltransferase 7(ZDHHC7) in whole blood (P = 1.8 × 10 -5 ). The SNP, rs2657147, is The GCTA power calculator found a cohort of 2960 samples would have 80% power to detect a quantitative trait with a true heritability of 30%.

Discussion
To our knowledge this is one of the largest multi-centre GWAS to investigate CRF response following exercise training. Compared to previous genetic studies in this field of research, we were able to user newer technology and methodologies that increased the validity and accuracy of our results. Across the 507 participants and irrespective of the intervention completed, there was large variability in individual VȮ 2 peak response to highvolume HIIT, MICT and low-volume HIIT/SIT. We were unable to identify genetic variants at a genome-wide significant level that explained this variability in response to each training intervention. However, 12 SNPs were found at a suggestive level of significance and warranted further investigation. Several of our lead SNPs seemed possible candidate genes for predicting VȮ 2 peak response due to their association with previously identified predictor genes, and related biological pathways and processes that may influence training adaptations.
The most significantly associated SNP, MAGI2, can influence neuronal cell activin-mediated signalling, and may supress AKT Serine/Threoine Kinase 1 (AKT1) activation [51]. AKT1 is a VȮ 2 peak response predictor gene identified from previous research, and is one of three genes from the protein kinase B family that can influence growth, differentiation and metabolism [52]. The SNP, rs1130214, found near AKT1, was significantly associated (P < 0.05) with VȮ 2 peak response in previous research [52] and was found at a nominal level in our Predict-HIIT cohort (P = 0.06). One of our lead SNPs, rs79687662, is found near the IQ Motif Containing GTPase Activating Protein 1 (IQGAP1) gene. IQGAP1 and AKT1 genes are both found in the E-cadherin signalling in the nascent adherens junction biological pathway, and together with Transcription Factor Hypoxia-Inducible Factor-1 (HIF1A) and Neuropilin 2 (NRP2) (predictor genes from previous research), are found in the signalling events mediated by the Vascular Endothelial Growth Factor Receptor 1 (VEGFR1) and Vascular Endothelial Growth Factor Receptor 2 (VEGFR2) biological pathway. Furthermore, a rat model A G 0.14 found the catenin (cadherin-associated protein) gene was upregulated in higher responders to HIIT, which helps to regulate angiogenesis, neurogenesis and tissue development [53]. Another recent rodent study found loss of Iqgap1 may lead to defective AKT and Extracellular Signal-Regulated Kinase 1/2 (ERK1/2) signalling and impaired cardiomyocyte hypertrophy [54]. Our second strongest associated lead SNP, rs730747755, is found near the Unc-80 Homolog, NALCN Channel Complex Subunit (UNC80) gene. UNC80 is a gene that contributes to a large ion channel complex (the 'NALCN channelosome"), which includes the Sodium Leak Channel, Non Selective (NALCN) gene [55]. NALCN is a VȮ 2 peak response predictor gene found in previous research [18], and similar to the UNC80 gene, may influence the resting membrane potential of neuronal cells [55]. There is evidence that genes encoding the NALCN channelosome may contribute to the susceptibility for several diseases, including cardiac diseases, some cancers and psychiatric disorders [56].
Two of our associated lead SNPs were found near genes related to peroxisome proliferator-activated (PPAR) activity. The SNP rs14932370 is found near the ASXL Transcriptional Regulator 1 (ASXL1) gene. Overexpression of ASXL1 may reduce adipogenesis by decreasing Peroxisome Proliferator-Activated Receptor y (PPARy) activity [57]. The SNP, rs2236368, is found near the Transcription Factor EB (TFEB) gene. TFEB may regulate mitophagy, and in addition to Peroxisome Proliferator-Activated Receptor Gamma, Coactivator 1 alpha (PGC-1α), is considered important for mitochondrial biogenic regulation [58]. TFEB may also regulate Fig. 4 Tenfold cross validation-no correlation between Polygenic Predictor Score (PPS) and VȮ 2 peak response (R 2 = 0.027, P = 0.76). Red, green and blue dots represent likely non-responders, likely responders and uncertain responders, respectively insulin sensitivity, glucose homeostasis and lipid oxidation [59]. Overexpression of TFEB may increase mitochondrial biogenesis and ATP production in skeletal muscle, independently from PCG-1α [59]. A study also found PGC-1α expression can be increased through the dephosphorylation and nuclear translocation of TFEB [60]. With these points in mind, a recent study found completing two high-intensity exercise sessions within a short time frame (2 h) increased the nuclear abundance of TFEB and the transcription of PCG-1α Table 5 Gene interactions and common biological pathways and processes between this study and previous findings Bolded genes from input from the Predict-HIIT cohort. All other genes from previous research

Biological pathways P-value Genes from input
Signaling events mediated by VEGFR1 and VEGFR2 2.31 × 10 -4 HIF1A, IQGAP1, NRP2, AKT1 Cell adhesion molecules (CAMs) 3.76 × 10 -3 CLDN14, NLGN1, ITGB8 CD6 E-cadherin signaling in the nascent adherens junction 1.05 × 10 -2 IQGAP1, AKT1  in 8 healthy young men [61]. Furthermore, overexpression of PGC-1α has been associated with improved VȮ 2 peak at baseline and following endurance training in several studies [62,63]. Several of our other associated SNPs are found in the same biological processes and pathways to variants identified in previous research [16]. The SNP rs73193458 is found near the Claudin 14 (CLDN14) gene, and together with previously identified VȮ 2peak response predictor genes (Neuroligin 1 (NLGN1), Integrin Subunit Beta 8 (ITGB8) and Cluster of Differentiation 6 (CD6)) is involved in the cell adhesion molecules (CAMs) biological pathway. The SNP rs11874598 found near the Piezo Type Mechanosensitive Ion Channel Component 2 (PEIZO2) gene (which is a mechanosensitive ion channel involved in touch, proprioception, and respiratory function [64]); and rs11647343 found near the ATPase Secretory Pathway Ca2 + Transporting 2 (ATP2C2) gene (which is related to nucleotide binding and calcium transporting ATPase activity and cardiac conduction [64]); along with several other genes identified from previous studies, are involved in cation transmembrane transport biological processes. We also found rs11647343 and rs2657147 to be eQTLs of genes associated with whole blood (ZDHHC7) and subcutaneous tissue (TRIM68), respectively [49]. In mice, ZDHHC7 plays a role in glucose transporter type 4 (Glut4) palmitoylation, contributing to glucose homeostasis [65], possibly contributing to metabolic adaptions required for VȮ 2 peak improvements. TRIM68 variants have been associated with early onset obesity [66] and is upregulated following aerobic exercise [67]. TRIM68 is associated with ubiquitination [67] and may potentially play a role in proteolytic activity and exercise induced muscle damage. Body composition has been associated with exercise capacity, including maximal workload and oxygen uptake [68,69]. However, more work is needed to go beyond association and to identify causal variants/genes. Future functional studies are needed.

Validation
We created a PPS from the top-12 associated loci from the discovery cohort to identify who was more likely to be a responder or non-responder to different forms of training. It was hypothesised that those identified as a lower responder may need a greater training dose than reported in our study to elicit a clinically meaningful response, or other environmental influences may need to be considered. For example, Montero and Lundby [4] have shown non-responders can become responders by increasing the dose of exercise training. Despite many of the suggestively associated SNPs showing a strong connection to previously identified genes, processes and pathways that may influence training adaptions to exercise, there was no significant correlation between the PPS score and VȮ 2 peak response following the tenfold crossvalidation. The variants and model could not accurately explain the variance in VȮ 2 peak response or predict who may be a lower or higher responder to each of the training interventions. Likewise, an independent cohort validation, from the Improve-HIIT study, did not support an association of the lead 12 SNPs with variance in VȮ 2 peak response when considered individually, or in the PPS model. This may be due to a relatively small sample size, or in fact that genetics plays a smaller role than previous research has alluded to. Our power calculation found we need at least 2960 samples to detect signals of common variants with a heritability of 30%.
Additionally, we were unable to replicate variants (VȮ 2 peak response predictor genes) identified from previous research [16] at a genome wide or suggestive level of significance. However, we were able to replicate several genetic variants from previous research at a nominal significance level within the Predict-HIIT cohort, including: rs10751308, rs10921078, rs1535628 and rs2003298 (P < 0.05). Two of these SNPs (rs10921078 and rs2003298) had the same 'response' allele in the Predict-HIIT cohort and previous research [15]. These SNPs warrant investigation in future studies. Furthermore, a significant Pearson correlation coefficient was found for the beta coefficient of variants with a P-value < 1 × 10 -4 in the Predict-HIIT cohort the Improve-HIIT cohort. This indicates general effects of the loci (as a group) exist, and a larger sample size may detect many of these effects as statistically significant.

Limitations
Several limitations may have prevented the finding of more significant associations and validating the proposed PPS model. Firstly, VȮ 2 peak response is considered a complex trait that may result from multiple interactions between genes (epistasis) and epigenetic changes that can affect gene expression [27]. This was made evident with several of the lead Predict-HIIT SNPs sharing common biological pathways and processes to predictor genes identified from previous studies. Larger sample sizes than reported in our study (tens of thousands) are often needed to investigate these gene interactions via a GWAS, and to identify rare variants that may be contributing to overall response [70,71]. A lack of detail in previous publications prevented some select variants from being replicated. Previous studies have predominantly been candidate-gene focused, and similar to our study, have lacked the necessary statistical power [16]. The validation study also lacked statistical power and the population studied was different to the Predict-HIIT study.
The Predict-HIIT study included a mix of healthy, young, older and clinical European population groups from studies with a variety of exercise doses; whereas the validation study was a high volume HIIT intervention on young, healthy but inactive predominantly Caucasian adults, and included a nutrition intervention. Previous studies have predominantly investigated endurance interventions, including participants from a mix of nationalities, and mainly inactive but healthy populations [16]. Moreover, there may be differences in the accuracy of findings between studies based on participant compliance to study protocols. These factors may have influenced the gene expression and the significance of variants discovered in previous research, the validation study and the Predict-HIIT study. If our study had a larger sample size, we could have stratified our analysis to see if associations were different when clustered according to healthy and clinical populations, and training doses. We tried to combat this by including significant covariates in our GWAS model, including the individual study.
Despite limited research, the declining cost of genetic testing has created an abundance of direct-to-consumer (DTC) DNA testing companies [71]. These testing companies often base recommendations on single or very few genes. For example, Alpha-actinin-3 (ACTN3) is a common 'fitness gene' found in many DTC tests, whereby consumers are encouraged to modify the intensity, volume or frequency of their exercise training to suit their ACTN3 genotype. Potential ACTN3 'genotype-based training protocols' for strength and endurance training improvements have been outlined in previous research [72]. For VȮ 2 peak improvement, the authors suggest RR allele homozygotes and RX allele heterozygotes are resistant to muscle damage and better suited to HIIT; whereas XX allele homozygotes have lower skeletal muscle function and poorer recovery, and subsequently are better suited to MICT [72]. Our analysis and other research has shown that exercise-related phenotypes, such as change in VȮ 2 peak response, is a polygenic trait where multiple genes influence various cellular pathways [3,14] and each gene may contribute only a small percentage to the overall change [16,26,73]. We have established that the significance of these genes and associated variants remain uncertain, questioning the importance of genetics in predicting individual response and the validity of commercial tests reliant on limited variants used for personalised exercise prescription.
An even larger study with more participants is needed to advance this field of research. In other areas of genetic research, this is achieved by combining datasets and completing a meta-analysis of many genome-wide association studies. As outlined by Zeggini and Ionnidis [74], combining many GWAS datasets would require a consortium with various institutions and laboratories combining to develop a robust protocol that addresses selection bias, quality control, heterogeneity of populations studied and the replication of biologically plausible previous findings. Based on our findings, we have calculated that at least 2960 participants would need to be included in well-controlled exercise interventions to measure VȮ 2 peak response. Future research should also focus on more than just the genome by using epigenomics, transcriptomics and metagenomics. Having large datasets with this information may help to identify with greater confidence the gene and pathway interactions, and epigenetic changes resulting from environmental influences [75]. Analysis of how exercise dose and quantitative traits including diet, sleep, recovery between training sessions, clinical conditions (e.g. coronary artery disease, type 2 diabetes) and how the microbiome may affect epistasis could also be explored. The Athlome Project Consortium is a collaborative initiative between several institutions to find genetic variants associated with athletic performance [76]. A similar concept could be developed specifically for finding genetic variants associated with VȮ 2 peak response in non-athletes to aerobic training interventions.

Conclusions
In conclusion, we found 12 novel genetic variants associated with VȮ 2 peak response in the Predict-HIIT study group. These SNPs have common biological pathways and processes to previous research findings , but could not be replicated in a small independent study. Furthermore, cross-validation found the PPS created from the top-associated SNPS did not show significant correlation with whom was likely to be a responder or nonresponder to exercise training. Heterogeneity and a lack of power in the discovery (Predict-HIIT) and validation (Improve-HIIT) cohorts may have prevented lead SNPs from being reproduced between studies. Our results highlight the possible risks associated with predictive scores for complex traits. Larger sample sizes with wellprescribed, controlled and accurately measured exercise interventions are required to identify rare variants, gene interactions and epigenetic changes that may influence gene expression and VȮ 2 peak response, and to find the ideal exercise dose to negate non-response. Ongoing research and validation of current and previous findings is needed to confirm if genetics does play a large role in VȮ 2 peak response variance, and whether genomic predictors for VȮ 2 peak response trainability can inform evidence-based clinical practice.