Ancestry-Shift Refinement Mapping of the C6orf97-ESR1 Breast Cancer Susceptibility Locus

Abstract
We used an approach that we term ancestry-shift refinement mapping to investigate an association, originally discovered in a GWAS of a Chinese population, between rs2046210[T] and breast cancer susceptibility. The locus is on 6q25.1 in proximity to the C6orf97 and estrogen receptor α (ESR1) genes. We identified a panel of SNPs that are correlated with rs2046210 in Chinese, but not necessarily so in other ancestral populations, and genotyped them in breast cancer case-control samples of Asian, European, and African origin, a total of 10,176 cases and 13,286 controls. We found that rs2046210[T] does not confer substantial risk of breast cancer in Europeans and Africans (OR = 1.04, P = 0.099, and OR = 0.98, P = 0.77, respectively). Rather, in those ancestries, an association signal arises from a group of less common SNPs typified by rs9397435. The rs9397435[G] allele was found to confer risk of breast cancer in European (OR = 1.15, P = 1.2×10⁻³), African (OR = 1.35, P = 0.014), and Asian (OR = 1.23, P = 2.9×10⁻⁴) population samples. Combined over all ancestries, the OR was 1.19 (P = 3.9×10⁻⁷), with no significant heterogeneity between ancestries (P_het = 0.36), and the SNP fully accounted for the association signal in each ancestry. Haplotypes bearing rs9397435[G] are well tagged by rs2046210[T] only in Asians. The rs9397435[G] allele showed associations with both estrogen receptor positive and estrogen receptor negative breast cancer. Using early-draft data from the 1000 Genomes Project, we found that the risk allele of a novel SNP (rs77275268), which is closely correlated with rs9397435, disrupts a partially methylated CpG sequence within a known CTCF binding site. These studies demonstrate that shifting the analysis among ancestral populations can provide valuable resolution in association mapping.
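The way per-ancestry odds ratios can be pooled and tested for heterogeneity may be easier to follow with a small worked sketch. The abstract does not state which pooling method was used, so the code below assumes a fixed-effect, inverse-variance meta-analysis with Cochran's Q, which is one common choice. The standard errors are not given in the text; they are reconstructed here from the rounded rs9397435[G] ORs and P values under the assumption that each P value comes from a two-sided Wald test. All function and variable names are illustrative.

```python
from math import exp, log, sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()

# Rounded per-ancestry results for rs9397435[G] as quoted in the abstract:
# ancestry -> (odds ratio, two-sided P value).
PER_ANCESTRY = {
    "European": (1.15, 1.2e-3),
    "African": (1.35, 0.014),
    "Asian": (1.23, 2.9e-4),
}


def se_from_or_and_p(odds_ratio, p_two_sided):
    """Back out SE of ln(OR), ASSUMING the P value is from a two-sided Wald test."""
    z = _STD_NORMAL.inv_cdf(1.0 - p_two_sided / 2.0)
    return abs(log(odds_ratio)) / z


def fixed_effect_meta(estimates):
    """Inverse-variance pooling of ln(OR); returns (pooled OR, pooled P, Cochran's Q)."""
    betas = [log(odds_ratio) for odds_ratio, _ in estimates]
    weights = [1.0 / se_from_or_and_p(odds_ratio, p) ** 2 for odds_ratio, p in estimates]
    total_weight = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    pooled_se = 1.0 / sqrt(total_weight)
    z = pooled_beta / pooled_se
    pooled_p = 2.0 * _STD_NORMAL.cdf(-abs(z))  # two-sided, computed in the lower tail
    q = sum(w * (b - pooled_beta) ** 2 for w, b in zip(weights, betas))
    return exp(pooled_beta), pooled_p, q


def chi2_sf_df2(q):
    """Chi-square survival function for exactly 2 degrees of freedom (3 strata)."""
    return exp(-q / 2.0)


if __name__ == "__main__":
    pooled_or, pooled_p, q = fixed_effect_meta(list(PER_ANCESTRY.values()))
    print(f"pooled OR = {pooled_or:.2f}, P = {pooled_p:.1e}, P_het = {chi2_sf_df2(q):.2f}")
```

Because the inputs are rounded summary statistics and the pooling method is an assumption, the output should be read as a rough consistency check against the reported combined OR and P_het, not as a reproduction of the original analysis.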
In genome-wide association studies of disease susceptibility, there is no particular expectation that a genotyped SNP showing an association is itself a pathogenic variant. Rather, it is more likely that a SNP giving a signal does so because it is in linkage disequilibrium (LD) with a pathogenic variant. When the analysis is shifted to a population of another ancestry, the tagging relationship between the genotyped SNP and the pathogenic variant may be disrupted, due to differing patterns of LD between populations. Thus, it is not straightforward to determine whether a susceptibility locus identified in one ancestral population is also associated with risk in another. Moreover, the differing patterns of LD between ancestral populations can be used to gain resolution in genetic mapping. We refer to this approach as ancestry-shift refinement mapping. Here, we apply it to a breast cancer risk variant near the estrogen receptor α gene that was initially described in a Chinese population. We show that the tagging relationship between the originally described SNP rs2046210 and the pathogenic variant(s) is not maintained in Europeans and Africans. We identify a SNP, rs9397435, that is associated with breast cancer risk in populations of Asian, European, and African ancestry.
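The idea that a tagging relationship can hold in one population and break down in another can be illustrated with the standard pairwise LD measure r². The sketch below computes r² from haplotype and allele frequencies for a tag SNP and a putative pathogenic variant. The frequencies are invented purely for illustration and are not taken from this study; the population labels and function name are likewise hypothetical.

```python
def ld_r2(p_ab, p_a, p_b):
    """Pairwise LD r^2 between two biallelic sites.

    p_ab: frequency of the haplotype carrying both alleles of interest
    p_a:  frequency of the tag-SNP allele
    p_b:  frequency of the pathogenic-variant allele
    """
    if not (0.0 < p_a < 1.0 and 0.0 < p_b < 1.0):
        raise ValueError("both sites must be polymorphic")
    d = p_ab - p_a * p_b  # coefficient of linkage disequilibrium D
    return d * d / (p_a * (1.0 - p_a) * p_b * (1.0 - p_b))


# HYPOTHETICAL frequencies, chosen only to contrast two LD patterns.
# "Population X": the two alleles nearly always travel on the same haplotype.
# "Population Y": the pathogenic allele is less common and sits on only a
# small subset of the haplotypes that carry the tag allele.
TOY_POPULATIONS = {
    "Population X": {"p_ab": 0.295, "p_a": 0.32, "p_b": 0.30},
    "Population Y": {"p_ab": 0.05, "p_a": 0.35, "p_b": 0.06},
}

if __name__ == "__main__":
    for name, freqs in TOY_POPULATIONS.items():
        print(f"{name}: r^2 = {ld_r2(**freqs):.2f}")
```

In the first toy population the tag SNP is a good proxy for the pathogenic variant, so an association at the tag SNP would be expected; in the second, r² is low and the tag SNP would show little or no signal even though the pathogenic variant is present. That contrast is what ancestry-shift refinement mapping exploits.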