We conducted imputation to the 1000 Genomes Project of four genome-wide association studies of lung cancer in populations of European ancestry (11,348 cases and 15,861 controls) and genotyped an additional 10,246 cases and 38,295 controls for follow-up. We identified large-effect genome-wide associations for squamous lung cancer with the rare variants BRCA2 p.Lys3326X (rs11571833, odds ratio (OR) = 2.47, P = 4.74 × 10(-20)) and CHEK2 p.Ile157Thr (rs17879961, OR = 0.38, P = 1.27 × 10(-13)). We also showed an association between common variation at 3q28 (TP63, rs13314271, OR = 1.13, P = 7.22 × 10(-10)) and lung adenocarcinoma that had been previously reported only in Asians. These findings provide further evidence for inherited genetic susceptibility to lung cancer and its biological basis. Additionally, our analysis demonstrates that imputation can identify rare disease-causing variants with substantive effects on cancer risk from preexisting genome-wide association study data.
Genome-wide association studies have identified hundreds of genetic variants associated with specific cancers. A few of these risk regions have been associated with more than one cancer site; however, a systematic evaluation of the associations between risk variants for other cancers and lung cancer risk has yet to be performed.
The analysis of gene-environment (G × E) interactions remains one of the greatest challenges in the post-genome-wide association study (GWAS) era. Recent methods constitute a compromise between the robust but underpowered case-control and powerful case-only methods. Inferences of the latter are biased when the assumption of gene-environment (G-E) independence in controls fails. We propose a novel empirical hierarchical Bayes approach to G × E interaction (EHB-GE), which benefits from greater rank power while accounting for population-based G-E correlation. Building on Lewinger et al.'s (Genet Epidemiol 31:871-882) hierarchical Bayes prioritization approach, the method first obtains posterior G-E correlation estimates in controls for each marker, borrowing strength from G-E information across the genome. These posterior estimates are then subtracted from the corresponding case-only G × E estimates. We compared EHB-GE with rival methods using simulation. EHB-GE has similar or greater rank power to detect G × E interactions in the presence of large numbers of G-E correlations with weak to strong effects or only a low number of such correlations with large effect. When there are no or only a few weak G-E correlations, Murcray et al.'s method (Am J Epidemiol 169:219-226) identifies markers with low G × E interaction effects better. We applied EHB-GE and competing methods to four lung cancer case-control GWAS from the Interdisciplinary Research in Cancer of the Lung/International Lung Cancer Consortium with smoking as the environmental factor. A number of genes worth investigating were identified by the EHB-GE approach.
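The two-step idea described above (shrink each marker's control G-E estimate toward a genome-wide prior, then subtract it from the case-only estimate) can be illustrated with a minimal sketch. It assumes a simple normal-normal empirical Bayes model with a method-of-moments prior variance, which is a simplification chosen for illustration and not necessarily the hierarchical model used by the authors. The function names and all numbers are hypothetical.

```python
from statistics import mean


def shrink_control_estimates(theta, se):
    """Shrink per-marker control G-E log odds ratios toward their genome-wide mean.

    Assumption: normal-normal empirical Bayes with a method-of-moments
    prior variance (illustrative only, not the published EHB-GE model).
    """
    mu = mean(theta)
    # Observed spread of the estimates around the genome-wide mean.
    total_var = sum((t - mu) ** 2 for t in theta) / (len(theta) - 1)
    # Prior variance: observed spread minus average sampling variance, floored at 0.
    tau2 = max(0.0, total_var - mean([s ** 2 for s in se]))
    posterior = []
    for t, s in zip(theta, se):
        weight = tau2 / (tau2 + s ** 2)  # 0 = full shrinkage, 1 = no shrinkage
        posterior.append(mu + weight * (t - mu))
    return posterior


def corrected_interaction(case_only, control_theta, control_se):
    """Subtract the shrunken control G-E estimate from each case-only estimate."""
    posterior = shrink_control_estimates(control_theta, control_se)
    return [c - p for c, p in zip(case_only, posterior)]


# Hypothetical log odds ratios for five markers.
control_theta = [0.02, -0.01, 0.30, 0.00, -0.03]
control_se = [0.05, 0.05, 0.05, 0.05, 0.05]
case_only = [0.10, 0.05, 0.45, -0.02, 0.01]

adjusted = corrected_interaction(case_only, control_theta, control_se)
for marker, value in enumerate(adjusted, start=1):
    print(f"marker {marker}: adjusted G x E estimate = {value:.3f}")
```

When the control estimates show little spread beyond sampling noise, the prior variance collapses toward zero and every marker receives nearly the same correction. A marker with a strong G-E correlation in controls (the third one here) keeps most of its own estimate.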
Epidemiological studies of underground miners suggested that occupational exposure to radon causes lung cancer with squamous cell carcinoma (SCC) as the predominant histological type. However, the genetic determinants for susceptibility of radon-induced SCC in miners are unclear. Double-strand breaks induced by radioactive radon daughters are repaired primarily by non-homologous end joining (NHEJ) that is accompanied by the dynamic changes in surrounding chromatin, including nucleosome repositioning and histone modifications. Thus, a molecular epidemiological study was conducted to assess whether genetic variation in 16 genes involved in NHEJ and related histone modification affected susceptibility for SCC in radon-exposed former miners (267 SCC cases and 383 controls) from the Colorado plateau. A global association between genetic variation in the haplotype block where SIRT1 resides and the risk for SCC in miners (P = 0.003) was identified. Haplotype alleles tagged by the A allele of SIRT1 rs7097008 were associated with increased risk for SCC (odds ratio = 1.69, P = 8.2 × 10(-5)) and greater survival in SCC cases (hazard ratio = 0.79, P = 0.03) in miners. Functional validation of rs7097008 demonstrated that the A allele was associated with reduced gene expression in bronchial epithelial cells and compromised DNA repair capacity in peripheral lymphocytes. Together, these findings substantiate genetic variation in SIRT1 as a risk modifier for developing SCC in miners and suggest that SIRT1 may also play a tumor suppressor role in radon-induced cancer in miners.
Neuronal nicotinic acetylcholine receptor (nAChR) genes (CHRNA5/CHRNA3/CHRNB4) have been reproducibly associated with nicotine dependence, smoking behaviors, and lung cancer risk. Of the few reports that have focused on early smoking behaviors, association results have been mixed. This meta-analysis examines early smoking phenotypes and SNPs in the gene cluster to determine: (1) whether the most robust association signal in this region (rs16969968) for other smoking behaviors is also associated with early behaviors, and/or (2) if additional statistically independent signals are important in early smoking. We focused on two phenotypes: age of tobacco initiation (AOI) and age of first regular tobacco use (AOS). This study included 56,034 subjects (41 groups) spanning nine countries and evaluated five SNPs including rs1948, rs16969968, rs578776, rs588765, and rs684513. Each dataset was analyzed using a centrally generated script. Meta-analyses were conducted from summary statistics. AOS yielded significant associations with SNPs rs578776 (beta = 0.02, P = 0.004), rs1948 (beta = 0.023, P = 0.018), and rs684513 (beta = 0.032, P = 0.017), indicating protective effects. There were no significant associations for the AOI phenotype. Importantly, rs16969968, the most replicated signal in this region for nicotine dependence, cigarettes per day, and cotinine levels, was not associated with AOI (P = 0.59) or AOS (P = 0.92). These results provide important insight into the complexity of smoking behavior phenotypes, and suggest that association signals in the CHRNA5/A3/B4 gene cluster affecting early smoking behaviors may be different from those affecting the mature nicotine dependence phenotype.
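Pooling per-dataset summary statistics, as described above, is commonly done with an inverse-variance weighted estimator. The sketch below assumes a fixed-effect model; the abstract does not state which model was used, and the function name and input values are hypothetical.

```python
import math


def fixed_effect_meta(betas, ses):
    """Inverse-variance weighted fixed-effect meta-analysis.

    betas: per-dataset effect estimates; ses: their standard errors.
    Returns the pooled estimate, its standard error, the z statistic,
    and a two-sided p-value from the normal approximation.
    """
    weights = [1.0 / se ** 2 for se in ses]
    pooled = sum(w * b for w, b in zip(weights, betas)) / sum(weights)
    pooled_se = math.sqrt(1.0 / sum(weights))
    z = pooled / pooled_se
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return pooled, pooled_se, z, p_value


# Hypothetical per-dataset betas and standard errors for one SNP.
betas = [0.025, 0.010, 0.030, 0.018]
ses = [0.012, 0.020, 0.015, 0.009]

pooled, pooled_se, z, p_value = fixed_effect_meta(betas, ses)
print(f"pooled beta = {pooled:.4f}, SE = {pooled_se:.4f}, P = {p_value:.3g}")
```

Datasets with smaller standard errors receive more weight, so the pooled estimate is dominated by the largest or most precise studies.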
The detection of tumor suppressor gene promoter methylation in sputum-derived exfoliated cells predicts early lung cancer. Here, we identified genetic determinants for this epigenetic process and examined their biologic effects on gene regulation. A two-stage approach involving discovery and replication was used to assess the association between promoter hypermethylation of a 12-gene panel and common variation in 40 genes involved in carcinogen metabolism, regulation of methylation, and DNA damage response in members of the Lovelace Smokers Cohort (N = 1,434). Molecular validation of three identified variants was conducted using primary bronchial epithelial cells. Association of study-wide significance (P < 8.2 × 10(-5)) was identified for rs1641511, rs3730859, and rs1883264 in TP53, LIG1, and BIK, respectively. These single-nucleotide polymorphisms (SNP) were significantly associated with altered expression of the corresponding genes in primary bronchial epithelial cells. In addition, rs3730859 in LIG1 was also moderately associated with increased risk for lung cancer among Caucasian smokers. Together, our findings suggest that genetic variation in DNA replication and apoptosis pathways impacts the propensity for gene promoter hypermethylation in the aerodigestive tract of smokers. The incorporation of genetic biomarkers for gene promoter hypermethylation with clinical and somatic markers may improve risk assessment models for lung cancer.
Genetic variants located at 15q25, including those in the cholinergic receptor nicotinic cluster (CHRNA5), have been implicated in both lung cancer risk and nicotine dependence in recent genome-wide association studies. Among these variants, a 22-bp insertion/deletion, rs3841324, showed the strongest association with CHRNA5 mRNA expression levels. However, the influence of rs3841324 on lung cancer risk has not been studied in depth.
We performed a multistage genome-wide association study of melanoma. In a discovery cohort of 1804 melanoma cases and 1026 controls, we identified loci at chromosomes 15q13.1 (HERC2/OCA2 region) and 16q24.3 (MC1R) regions that reached genome-wide significance within this study and also found strong evidence for genetic effects on susceptibility to melanoma from markers on chromosome 9p21.3 in the p16/ARF region and on chromosome 1q21.3 (ARNT/LASS2/ANXA9 region). The most significant single-nucleotide polymorphisms (SNPs) in the 15q13.1 locus (rs1129038 and rs12913832) lie within a genomic region that has profound effects on eye and skin color; notably, 50% of variability in eye color is associated with variation in the SNP rs12913832. Because eye and skin colors vary across European populations, we further evaluated the associations of the significant SNPs after carefully adjusting for European substructure. We also evaluated the top 10 most significant SNPs by using data from three other genome-wide scans. Additional in silico data provided replication of the findings from the most significant region on chromosome 1q21.3 rs7412746 (P = 6 × 10(-10)). Together, these data identified several candidate genes for additional studies to identify causal variants predisposing to increased risk for developing melanoma.
Published genome-wide association studies (GWASs) have identified few variants in the known biological pathways involved in lung cancer etiology. To mine the possibly hidden causal single nucleotide polymorphisms (SNPs), we explored all SNPs in the extrinsic apoptosis pathway from our published GWAS dataset for 1154 lung cancer cases and 1137 cancer-free controls. In an initial association analysis of 611 tagSNPs in 41 apoptosis-related genes, we identified only 10 tagSNPs associated with lung cancer risk with a P value<10(-2), including four tagSNPs in DAPK1 and three tagSNPs in TNFSF8. Unlike DAPK1 SNPs, TNFSF8 rs2181033 tagged four other predicted functional but untyped SNPs (rs776576, rs776577, rs31813148 and rs2075533) in the promoter region. Therefore, we further tested binding affinity of these four SNPs by performing the electrophoretic mobility shift assay. We found that only the rs2075533 T allele modified levels of nuclear proteins bound to DNA, leading to significantly decreased expression of luciferase reporter constructs by 5- to 10-fold in H1299, HeLa and HCT116 cell lines compared with the C allele. We also performed a replication study of the untyped rs2075533 in an independent Texas population but did not confirm the protective effect. We further performed a mini meta-analysis for SNPs of TNFSF8 obtained from four other published lung cancer GWASs with 12 214 cases and 47 721 controls, and we found that only rs3181366 (r2=0.69 with the untyped rs2075533) was associated with lung cancer risk (P=0.008). Our findings suggest a possible role of novel TNFSF8 variants in susceptibility to lung cancer.
DNA repair genes are important for maintaining genomic stability and limiting carcinogenesis. We analyzed all single nucleotide polymorphisms (SNPs) of 125 DNA repair genes covered by the Illumina HumanHap300 (v1.1) BeadChips in a previously conducted genome-wide association study (GWAS) of 1154 lung cancer cases and 1137 controls and replicated the top-hits of XRCC4 SNPs in an independent set of 597 cases and 611 controls in Texas populations. We found that six of 20 XRCC4 SNPs were associated with a decreased risk of lung cancer with a P-value of 0.01 or lower in the discovery dataset, of which the most significant SNP was rs10040363 (P for allelic test=4.89 x 10??). Moreover, the data in this region allowed us to impute a potentially functional SNP rs2075685 (imputed P for allelic test=1.3 x 10(-3)). A luciferase reporter assay demonstrated that the rs2075685G>T change in the XRCC4 promoter increased expression of the gene. In the replication study of rs10040363, rs1478486, rs9293329, and rs2075685, however, only rs10040363 achieved a borderline association with a decreased risk of lung cancer in a dominant model (adjusted OR=0.80, 95% CI=0.62-1.03 and P=0.079). In the final combined analysis of both the Texas GWAS discovery and replication datasets, the strength of the association was increased for rs10040363 (adjusted OR=0.77, 95% CI=0.66-0.89, P(dominant)=5 x 10?? and P for trend=5 x 10??) and rs1478486 (adjusted OR=0.82, 95% CI=0.71-0.94, P(dominant)=6 x 10(-3) and P for trend=3.5 x 10(-3)). Finally, we conducted a meta-analysis of these XRCC4 SNPs with available data from published GWA studies of lung cancer with a total of 12,312 cases and 47,921 controls, in which none of these XRCC4 SNPs was associated with lung cancer risk. It appeared that rs2075685, although associated with increased expression of a reporter gene and lung cancer risk in the Texas populations, did not have an effect on lung cancer risk in other populations. 
This study underscores the importance of replication using published data in larger populations.
Previous studies have shown that MDM2 SNP309 and p53 codon 72 have modifier effects on germline P53 mutations, but those studies relied on case-only studies with small sample sizes. The impact of MDM4 polymorphism on tumor onset in germline mutation carriers has not previously been studied.
Genome-wide case-control studies have been widely used to identify genetic variants that predispose to human diseases. Such studies are powerful in detecting common genetic variants with moderate effects, but quickly lose power as allele frequency and genotype relative risk decrease. Because patients with one or more affected relatives are more likely to inherit disease-predisposing alleles of a genetic disease than patients without family histories of the disease, sampling patients with affected relatives almost always increases the frequency of disease predisposing alleles in cases and improves the power of case-control association studies. This paper evaluates the power of case-control studies that select cases and/or controls according to their family histories of disease. Our results showed that this study design can dramatically increase the power of a case-control association study for a wide range of disease types. Because each additional affected relative of a patient reduces the required sample size roughly by a pair of case and control, inclusion of cases with affected relatives can dramatically decrease the required sample size and thus the cost of such studies.
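The loss of power at low allele frequency and low genotype relative risk can be illustrated with a rough calculation. The sketch below assumes a multiplicative per-allele risk model, Hardy-Weinberg proportions, a rare disease (so the control allele frequency approximates the population frequency), and a two-sided normal-approximation test comparing allele frequencies. It does not model enrichment through family history, and the function names and parameter values are hypothetical rather than taken from the paper.

```python
import math
from statistics import NormalDist


def case_allele_freq(p, grr):
    """Risk-allele frequency in cases under a multiplicative per-allele model."""
    return grr * p / (grr * p + (1.0 - p))


def allelic_power(p, grr, n_cases, n_controls, alpha=5e-8):
    """Approximate power of a two-sided allelic test (normal approximation).

    p: population (control) risk-allele frequency; grr: per-allele
    genotype relative risk. Each subject contributes two alleles.
    """
    p_case = case_allele_freq(p, grr)
    se = math.sqrt(p_case * (1.0 - p_case) / (2 * n_cases)
                   + p * (1.0 - p) / (2 * n_controls))
    z_crit = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return NormalDist().cdf(abs(p_case - p) / se - z_crit)


# Hypothetical design: 2000 cases and 2000 controls, genome-wide alpha.
for p in (0.30, 0.10, 0.05):
    for grr in (1.5, 1.3, 1.1):
        power = allelic_power(p, grr, 2000, 2000)
        print(f"freq={p:.2f} GRR={grr:.1f} power={power:.3f}")
```

Any sampling scheme that raises the risk-allele frequency among cases, such as selecting cases with affected relatives, widens the case-control frequency difference that drives this calculation.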
Recently, genetic association findings for nicotine dependence, smoking behavior, and smoking-related diseases converged to implicate the chromosome 15q25.1 region, which includes the CHRNA5-CHRNA3-CHRNB4 cholinergic nicotinic receptor subunit genes. In particular, association with the nonsynonymous CHRNA5 SNP rs16969968 and correlates has been replicated in several independent studies. Extensive genotyping of this region has suggested additional statistically distinct signals for nicotine dependence, tagged by rs578776 and rs588765. One goal of the Consortium for the Genetic Analysis of Smoking Phenotypes (CGASP) is to elucidate the associations among these markers and dichotomous smoking quantity (heavy versus light smoking), lung cancer, and chronic obstructive pulmonary disease (COPD). We performed a meta-analysis across 34 datasets of European-ancestry subjects, including 38,617 smokers who were assessed for cigarettes-per-day, 7,700 lung cancer cases and 5,914 lung-cancer-free controls (all smokers), and 2,614 COPD cases and 3,568 COPD-free controls (all smokers). We demonstrate statistically independent associations of rs16969968 and rs588765 with smoking (mutually adjusted p-values<10(-35) and <10(-8) respectively). Because the risk alleles at these loci are negatively correlated, their association with smoking is stronger in the joint model than when each SNP is analyzed alone. Rs578776 also demonstrates association with smoking after adjustment for rs16969968 (p<10(-6)). In models adjusting for cigarettes-per-day, we confirm the association between rs16969968 and lung cancer (p<10(-20)) and observe a nominally significant association with COPD (p = 0.01); the other loci are not significantly associated with either lung cancer or COPD after adjusting for rs16969968. This study provides strong evidence that multiple statistically distinct loci in this region affect smoking behavior. 
This study is also the first report of association between rs588765 (and correlates) and smoking that achieves genome-wide significance; these SNPs have previously been associated with mRNA levels of CHRNA5 in brain and lung tissue.
We genotyped individuals with primary biliary cirrhosis and unaffected controls for suggestive risk loci (genome-wide association P < 1 x 10(-4)) identified in a previous genome-wide association study. Combined analysis of the genome-wide association and replication datasets identified IRF5-TNPO3 (combined P = 8.66 x 10(-13)), 17q12-21 (combined P = 3.50 x 10(-13)) and MMEL1 (combined P = 3.15 x 10(-8)) as new primary biliary cirrhosis susceptibility loci. Fine-mapping studies showed that a single variant accounts for the IRF5-TNPO3 association. As these loci are implicated in other autoimmune conditions, these findings confirm genetic overlap among such diseases.
Systemic sclerosis (SSc) is an autoimmune disease characterized by fibrosis of the skin and internal organs that leads to profound disability and premature death. To identify new SSc susceptibility loci, we conducted the first genome-wide association study in a population of European ancestry including a total of 2,296 individuals with SSc and 5,171 controls. Analysis of 279,621 autosomal SNPs followed by replication testing in an independent case-control set of European ancestry (2,753 individuals with SSc (cases) and 4,569 controls) identified a new susceptibility locus for systemic sclerosis at CD247 (1q22-23, rs2056626, P = 2.09 x 10(-7) in the discovery samples, P = 3.39 x 10(-9) in the combined analysis). Additionally, we confirm and firmly establish the role of the MHC (P = 2.31 x 10(-18)), IRF5 (P = 1.86 x 10(-13)) and STAT4 (P = 3.37 x 10(-9)) gene regions as SSc genetic risk factors.
Recent genome-wide association studies (GWASs) have identified common genetic variants at 5p15.33, 6p21-6p22 and 15q25.1 associated with lung cancer risk. Several other genetic regions including variants of CHEK2 (22q12), TP53BP1 (15q15) and RAD52 (12p13) have been demonstrated to influence lung cancer risk in candidate- or pathway-based analyses. To identify novel risk variants for lung cancer, we performed a meta-analysis of 16 GWASs, totaling 14 900 cases and 29 485 controls of European descent. Our data provided increased support for previously identified risk loci at 5p15 (P = 7.2 × 10(-16)), 6p21 (P = 2.3 × 10(-14)) and 15q25 (P = 2.2 × 10(-63)). Furthermore, we demonstrated histology-specific effects for 5p15, 6p21 and 12p13 loci but not for the 15q25 region. Subgroup analysis also identified a novel disease locus for squamous cell carcinoma at 9p21 (CDKN2A/p16(INK4A)/p14(ARF)/CDKN2B/p15(INK4B)/ANRIL; rs1333040, P = 3.0 × 10(-7)) which was replicated in a series of 5415 Han Chinese (P = 0.03; combined analysis, P = 2.3 × 10(-8)). This large analysis provides additional evidence for the role of inherited genetic susceptibility to lung cancer and insight into biological differences in the development of the different histological types of lung cancer.
Asbestos exposure is a known risk factor for lung cancer. Although recent genome-wide association studies (GWASs) have identified some novel loci for lung cancer risk, few addressed genome-wide gene-environment interactions. To determine gene-asbestos interactions in lung cancer risk, we conducted genome-wide gene-environment interaction analyses at levels of single nucleotide polymorphisms (SNPs), genes and pathways, using our published Texas lung cancer GWAS dataset. This dataset included 317 498 SNPs from 1154 lung cancer cases and 1137 cancer-free controls. The initial SNP-level P-values for interactions between genetic variants and self-reported asbestos exposure were estimated by unconditional logistic regression models with adjustment for age, sex, smoking status and pack-years. The P-value for the most significant SNP, rs13383928, was 2.17×10(-6), which did not reach genome-wide statistical significance. Using a versatile gene-based test approach, we found that the top significant gene was C7orf54, located on 7q32.1 (P = 8.90×10(-5)). Interestingly, most of the other significant genes were located on 11q13. When we used an improved gene-set-enrichment analysis approach, we found that the Fas signaling pathway and the antigen processing and presentation pathway were most significant (nominal P < 0.001; false discovery rate < 0.05) among 250 pathways containing 17 572 genes. We believe that our analysis is a pilot study that first describes the gene-asbestos interaction in lung cancer risk at levels of SNPs, genes and pathways. Our findings suggest that immune function regulation-related pathways may be mechanistically involved in asbestos-associated lung cancer risk.
Recent meta-analyses of European ancestry subjects show strong evidence for association between smoking quantity and multiple genetic variants on chromosome 15q25. This meta-analysis extends the examination of association between distinct genes in the CHRNA5-CHRNA3-CHRNB4 region and smoking quantity to Asian and African American populations to confirm and refine specific reported associations. Association results for a dichotomized cigarettes smoked per day phenotype in 27 datasets (European ancestry (N = 14,786), Asian (N = 6,889), and African American (N = 10,912) for a total of 32,587 smokers) were meta-analyzed by population and results were compared across all three populations. We demonstrate association between smoking quantity and markers in the chromosome 15q25 region across all three populations, and narrow the region of association. Of the variants tested, only rs16969968 is associated with smoking (P < 0.01) in each of these three populations (odds ratio [OR] = 1.33, 95% CI = 1.25-1.42, P = 1.1 × 10(-17) in meta-analysis across all population samples). Additional variants displayed a consistent signal in both European ancestry and Asian datasets, but not in African Americans. The observed consistent association of rs16969968 with heavy smoking across multiple populations, combined with its known biological significance, suggests rs16969968 is most likely a functional variant that alters risk for heavy smoking. We interpret additional association results that differ across populations as providing evidence for additional functional variants, but we are unable to further localize the source of this association. Using the cross-population study paradigm provides valuable insights to narrow regions of interest and inform future biological experiments.
Genome-wide association studies have identified variants on chromosome 15q25.1 that increase the risks of both lung cancer and nicotine dependence and associated smoking behavior. However, there remains debate as to whether the association with lung cancer is direct or is mediated by pathways related to smoking behavior. Here, the authors apply a novel method for mediation analysis, allowing for gene-environment interaction, to a lung cancer case-control study (1992-2004) conducted at Massachusetts General Hospital using 2 single nucleotide polymorphisms, rs8034191 and rs1051730, on 15q25.1. The results are validated using data from 3 other lung cancer studies. Tests for additive interaction (P = 2 × 10(-10) and P = 1 × 10(-9)) and multiplicative interaction (P = 0.01 and P = 0.01) were significant. Pooled analyses yielded a direct-effect odds ratio of 1.26 (95% confidence interval (CI): 1.19, 1.33; P = 2 × 10(-15)) for rs8034191 and an indirect-effect odds ratio of 1.01 (95% CI: 1.00, 1.01; P = 0.09); the proportion of increased risk mediated by smoking was 3.2%. For rs1051730, direct- and indirect-effect odds ratios were 1.26 (95% CI: 1.19, 1.33; P = 1 × 10(-15)) and 1.00 (95% CI: 0.99, 1.01; P = 0.22), respectively, with a proportion mediated of 2.3%. Adjustment for measurement error in smoking behavior allowing up to 75% measurement error increased the proportions mediated to 12.5% and 9.2%, respectively. These analyses indicate that the association of the variants with lung cancer operates primarily through other pathways.
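The relation between direct-effect and indirect-effect odds ratios and the proportion mediated can be sketched with a commonly used approximation on the risk-difference scale, PM = OR_direct × (OR_indirect − 1) / (OR_direct × OR_indirect − 1). This is an assumption for illustration: the abstract does not give the authors' exact estimator, and because it reports odds ratios rounded to two decimals, plugging them in will not reproduce the published 3.2% and 2.3% exactly.

```python
def proportion_mediated(or_direct, or_indirect):
    """Approximate proportion of the total effect that is mediated.

    Uses the risk-difference-scale approximation based on direct and
    indirect odds ratios (an assumed formula, not confirmed by the abstract).
    """
    total = or_direct * or_indirect
    return or_direct * (or_indirect - 1.0) / (total - 1.0)


# Rounded odds ratios as reported in the abstract for rs8034191.
pm = proportion_mediated(1.26, 1.01)
print(f"approximate proportion mediated: {pm:.1%}")
```

An indirect-effect odds ratio close to 1 yields a small proportion mediated regardless of the size of the direct effect, which matches the abstract's conclusion that the association operates primarily through pathways other than smoking behavior.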