Age at menopause (AOM) has a substantial impact on fertility and disease risk. While many loci with variants that associate with AOM have been identified through genome-wide association studies (GWAS) under an additive model, other genetic models are rarely considered. Here, through a GWAS meta-analysis under the recessive model of 174,329 postmenopausal women from Iceland, Denmark, the United Kingdom (UK; UK Biobank) and Norway, we study low-frequency variants with a large effect on AOM. We discovered that women homozygous for the stop-gain variant rs117316434(A) in CCDC201 (p.(Arg162Ter), minor allele frequency ~1%) reached menopause 9 years earlier than other women (P = 1.3 × 10−15). The genotype is present in one in 10,000 northern European women and leads to primary ovarian insufficiency in close to half of them. Consequently, homozygotes have fewer children, and the age at last childbirth is 5 years earlier (P = 3.8 × 10−5). The CCDC201 gene was only found in humans in 2022 and is highly expressed in oocytes. Homozygosity for CCDC201 loss-of-function has a substantial impact on female reproductive health, and homozygotes would benefit from reproductive counseling and treatment for symptoms of early menopause.
Menopause is caused by the depletion of the primordial follicle pool. There is broad variation in the age at menopause (AOM), and early menopause (EM) impacts health, quality of life (https://www.menopausemandate.com/) and fertility potential. It is estimated that natural fertility ends on average 10 years before menopause. At the extreme end of the AOM distribution is primary ovarian insufficiency (POI), with cessation of menses before the age of 40 years, which occurs in 1–4% of women. EM and POI are well-known causes of infertility, which is increasingly relevant as women in many populations are choosing to have children later in life.
Through genome-wide association studies (GWAS), we and others have reported associations of rare and low-frequency variants with variation in AOM, mostly under an additive model. Rare variants in several genes have also been reported to cause Mendelian forms of POI, although many are only reported in a small number of cases or in single families. Despite advances in understanding the genetic causes of EM and POI, genetic screening has mainly been focused on Turner syndrome, which has a prevalence of 1 in 2,000, and the FMR1 premutation, found in 1 in 8,000 women.
We performed a GWAS meta-analysis for AOM (natural menopause not affected by surgical procedures, such as hysterectomy and/or oophorectomy) under the recessive model as well as the additive one, on 174,329 postmenopausal women from Iceland, the United Kingdom (UK), Denmark and Norway (nIceland = 27,281, nUK = 137,906, nDenmark = 5,978 and nNorway = 3,161; Supplementary Tables). We tested 39.3 million sequence variants for associations with AOM.
Homozygosity (n = 27 women) for the low-frequency stop-gain variant p.(Arg162Ter) (rs117316434(A), chr7: 45863165; minor allele frequency (MAF) ~1%) in CCDC201 is associated with an AOM 9 years earlier than in heterozygotes and noncarriers (recessive effect = −1.59 s.d.; 95% confidence interval (CI): −1.98, −1.20; recessive P = 1.3 × 10−15). The effect of the variant did not differ between the four groups (Phet = 0.28). The association was genome-wide significant in the UK, the largest of the four groups (P = 3.6 × 10−13), and was also significant in the remaining three sample sets combined (P = 2.4 × 10−4). The effect of p.(Arg162Ter) in CCDC201 on AOM deviates from the additive model and is limited to homozygotes. We did not detect an association with AOM under the additive model (additive effect = 0.029 s.d. (95% CI: −0.0094, 0.066), P = 0.16). We did not find a significant association of the p.(Arg162Ter) variant with any case–control or quantitative traits under the additive model.
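The difference between the two genetic models comes down to how each genotype is coded as a predictor. The sketch below is only an illustration, not the authors' pipeline: it shows the two codings, the homozygote frequency expected at a MAF of 1% if Hardy-Weinberg proportions are assumed, and an approximate two-sided P value recovered from the reported recessive effect and its 95% CI under a normal approximation. All function names are hypothetical.

```python
import math


def additive_code(minor_allele_count: int) -> int:
    """Additive model: the predictor is the number of minor alleles (0, 1 or 2)."""
    return minor_allele_count


def recessive_code(minor_allele_count: int) -> int:
    """Recessive model: only homozygotes for the minor allele are coded 1."""
    return 1 if minor_allele_count == 2 else 0


def two_sided_p_from_ci(effect: float, ci_low: float, ci_high: float) -> float:
    """Approximate two-sided P value from an effect and its 95% CI.

    Assumes the estimate is normally distributed, so the CI half-width
    equals 1.96 standard errors.
    """
    se = (ci_high - ci_low) / (2 * 1.96)
    z = effect / se
    return math.erfc(abs(z) / math.sqrt(2))


# Expected homozygote frequency at MAF ~1%, assuming Hardy-Weinberg proportions.
maf = 0.01
expected_homozygote_freq = maf ** 2  # about 1 in 10,000

# Recessive effect and 95% CI as reported in the text (s.d. units).
p_recessive = two_sided_p_from_ci(-1.59, -1.98, -1.20)

print([recessive_code(g) for g in (0, 1, 2)])
print(expected_homozygote_freq, p_recessive)
```

Under the recessive coding, heterozygotes and noncarriers fall in the same group, which is why a variant whose effect is limited to homozygotes can be missed by an additive scan.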
AOM was derived for individuals who were considered to have undergone natural menopause not affected by surgical procedures, such as hysterectomy and/or oophorectomy.
In Iceland, we used data on AOM obtained from the Icelandic Cancer Society’s Cancer Registry (n = 9,794) and from questionnaires from various genetic programs at deCODE genetics (n = 21,390), the majority gathered through deCODE’s osteoporosis project and the deCODE Health study, from women who had also been genotyped. The Cancer Society’s data were collected from a questionnaire in the years 1964–1994, and deCODE genetics data from 1999 to 2022. All Icelandic data were collected through studies approved by the National Bioethics Committee (NBC; approvals VSN-15-198 and VSN-15-214) following review by the Icelandic Data Protection Authority. Participants donated blood or buccal samples after signing a broad informed consent allowing the use of their samples and data in all projects at deCODE genetics approved by the NBC. All personal identifiers of the participants’ data were encrypted by a third-party system, approved and monitored by the Icelandic Data Protection Authority.
The UK Biobank study is a large prospective cohort study of ~500,000 individuals in the age range of 40–69 years from across the UK. AOM (Data Field 3581) was collected from a touchscreen questionnaire at the UK Biobank assessment centers from 140,688 genotyped females who indicated that their periods had stopped (Data Field 2724). Only British individuals of European ancestry were included in the study. The UK Biobank data were obtained under application 56270. All phenotype and genotype data were collected following informed consent obtained from all participants. The North West Research Ethics Committee reviewed and approved the UK Biobank’s scientific protocol and operational procedures (REC reference: 06/MRE08/65).
Data on menopause status from Denmark were provided by the Danish Blood Donor Study (DBDS). Around 51% of participants were females, aged 18–70 years at inclusion. The data were obtained from a paper questionnaire (v1) on self-reported health status and lifestyle sent to all participants in the DBDS (n = 110,000) from 2010 to mid-2015. Around 85,000 participants responded to it. In the end, AOM from 8,037 chip-typed females was used in the analysis. All participants signed an informed consent statement, and the DBDS genetic study was approved by the Danish National Committee on Health Research Ethics (NVK-1700407) and by the Danish Capital Region Data Protection Office (P-2019-99).
Data on female infertility from Denmark were provided by the Copenhagen Hospital Biobank (CHB) Reproduction Study, which involves a targeted selection of patients with reproductive phenotypes from the CHB, a biobank based on patient blood samples drawn in Danish hospitals.
The AOM data from Norway were provided by the Hordaland Health Studies (HUSK). The HUSK surveys are a collaborative project between the University of Bergen, the Norwegian Health Screening Service (SHUS) and the Municipal Health Service in Hordaland aimed at gathering information so that disease ultimately can be prevented. In the first phase of the studies (HUSK1), in 1992–1993, around 18,000 residents of Hordaland County born in 1925–1952 participated in the study. In 1997–1999 (HUSK2), previous participants born in 1950–1951 and 1925–1927 were re-invited, in addition to all residents in Hordaland County born in 1953–1957. In total, approximately 36,000 individuals participated in the study (18,000 in 1992–1993 and 26,000 in 1997–1999), with some participating at both times. Age at last menstruation (proxy for menopause) was collected from questionnaires sent to participants in both HUSK1 and HUSK2. All participants signed an informed consent statement, and the HUSK study was approved by the Regional Committee for Medical Research Ethics Western Norway (REK Vest 10279 (2018/915)). In the end, AOM from 3,161 genotyped females was used in the analysis.
For all strata, in the case of repeat measurements, the mean age of menopause or the mean age at the last period was used to represent each individual’s AOM.
Rounding tendency in reported age of menopause
It has been observed that when women are asked to recall their AOM, they tend to report values ending in 0 or 5. Thus, we need to take into account the possibility that some women who reported menopause at the age of 40 years reached it earlier and were not counted as POI cases because of this tendency, which could lead to an underestimation of the risk of POI in our study. Of the 27 homozygotes for p.(Arg162Ter) with AOM information, nine reported AOM before the age of 40, while seven reported experiencing menopause exactly at the age of 40. Assuming an equal probability of rounding reported AOM up or down to 40, we estimated the penetrance of POI among homozygotes as 46% ((9 + 3.5)/27). Likewise, for noncarriers and heterozygotes, the estimated penetrance of POI is 3.7% ((4,678 + 1,728.5)/174,302).
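The rounding adjustment can be reproduced with a few lines of arithmetic. In this minimal sketch, half of the women who reported menopause at exactly 40 are counted as POI cases, following the equal-probability assumption stated above. The function name is hypothetical, and the count of 3,457 noncarriers and heterozygotes reporting exactly 40 is inferred here as twice the 1,728.5 given in the text.

```python
def rounding_adjusted_penetrance(n_before_40: float,
                                 n_exactly_40: float,
                                 n_total: float) -> float:
    """Share of women with POI, counting half of those who reported
    menopause at exactly 40 as having reached it before 40."""
    return (n_before_40 + 0.5 * n_exactly_40) / n_total


# Homozygotes for p.(Arg162Ter): 9 before 40, 7 at exactly 40, 27 in total.
penetrance_homozygotes = rounding_adjusted_penetrance(9, 7, 27)

# Noncarriers and heterozygotes: 4,678 before 40; 3,457 at exactly 40
# (inferred as 2 x 1,728.5); 174,302 in total.
penetrance_others = rounding_adjusted_penetrance(4678, 3457, 174302)

print(f"homozygotes: {penetrance_homozygotes:.1%}")
print(f"others:      {penetrance_others:.1%}")
```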
Estimating the proportion of POI explained by p.(Arg162Ter) homozygosity
Using AOM data to define POI as AOM before the age of 40 years, we observe nine homozygotes among the 4,687 females with AOM before the age of 40 years. Thus, we estimate that the proportion of all POI cases caused by p.(Arg162Ter) homozygosity is around 0.19% (that is, 1 of 521). Similarly, taking into account rounding bias, the proportion of all POI cases estimated to be caused by homozygosity is also 0.19% (that is, 1 of 513, or (9 + 3.5)/(4,687 + 1,728.5)).
In the UK Biobank 500k WGS set, one homozygote was observed among the 571 females with the ICD-10 diagnostic code E283, indicative of POI. Thus, the incidence of p.(Arg162Ter) homozygosity is 1 of 571 among POI cases.
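As a quick check, the three estimates of how often POI cases are homozygous can be recomputed from the counts given in the text. This is a sketch with hypothetical variable names, not part of the original analysis.

```python
def one_in(n_homozygous_cases: float, n_all_cases: float) -> float:
    """Express a proportion of cases as '1 of N'."""
    return n_all_cases / n_homozygous_cases


# AOM-defined POI (AOM < 40): 9 homozygotes among 4,687 women.
unadjusted = one_in(9, 4687)

# With the rounding adjustment: (9 + 3.5) among (4,687 + 1,728.5).
adjusted = one_in(9 + 3.5, 4687 + 1728.5)

# UK Biobank WGS set, ICD-10 code E283: 1 homozygote among 571 women.
icd_based = one_in(1, 571)

for label, n in [("unadjusted", unadjusted),
                 ("rounding-adjusted", adjusted),
                 ("ICD-10 based", icd_based)]:
    print(f"{label}: 1 of {n:.0f} ({1 / n:.2%})")
```

The self-reported and diagnosis-based estimates land in the same range, roughly 1 homozygote per 500 to 600 POI cases.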
In Iceland, 34,453,001 sequence variants identified in WGS data from 63,460 Icelanders participating in various disease projects at deCODE genetics were tested. The samples were sequenced using standard TruSeq (Illumina) methodology to an average genome-wide coverage of 40×. SNPs and insertions and deletions (InDels) were identified, and their genotypes were called using joint calling with Graphtyper. Variant Effect Predictor from RefSeq was used to annotate the effects of sequence variants on protein-coding genes. We chip-typed 173,025 Icelanders (around 50% of the population) using Illumina SNP arrays, and the chip-typed individuals were long-range phased. The variants identified in the whole-genome sequencing of Icelanders were imputed into the chip-typed individuals. In addition, based on Icelandic genealogy, the genotype probabilities for 292,636 untyped close relatives of chip-typed individuals were calculated.
From the UK Biobank, we used data from around 428k WGS individuals who were of British/Irish ancestry. The WGS was performed using Illumina standard TruSeq methodology (mean depth of 32×) in a collaborative work between deCODE genetics in Iceland and The Wellcome Sanger Institute in the UK. Sequence variants from the WGS were identified and called jointly using Graphtyper. Phasing from previous chip-typing of the same sample was used as the basis to assign haplotypes.
From Denmark and Norway, we chip-typed 464,016 and 254,304 individuals, respectively. The samples were chip-typed by deCODE genetics using both Omni microarrays (Illumina) and Global Screening Array (Illumina). Graphtyper was used to identify SNPs and InDels and jointly call their genotypes. Using the identified variants, the samples were then phased (using SHAPEIT4) along with an international set of 1,041,174 genotyped individuals from 49 countries (including Denmark and Norway), chip-typed at deCODE genetics. For variant imputation, we compiled an international reference panel from 50,839 WGS individuals from 14 countries, including 10,985 from Denmark and 3,467 from Norway. The identified variants from WGS were subsequently imputed into the chip-typed individuals.
We performed a meta-analysis on GWAS on 180,564 females from Iceland, the UK, Denmark and Norway with self-reported AOM or age at last menstruation. We tested a total of 39,281,741 sequence variants (imputation info >0.80 and MAFIce > 0.02%, MAFUK > 0.01%, MAFDen > 0.1%, MAFNor > 0.2%), identified in the WGS, for association with AOM. The quantitative traits were transformed to a standard normal distribution. For the quantitative traits, the year of birth was included as a covariate in the analysis, with additional adjusting for the first 20 principal components in the UK, for population stratification. For each population, the quantitative traits were tested using a linear mixed model implemented in BOLT-LMM. For the meta-analysis, we used a fixed-effects inverse variance method based on effect estimates and s.e. from each population. For each study, we used linkage disequilibrium (LD) score regression to account for distribution inflation in the dataset due to cryptic relatedness and population stratification. Using a set of about 1.1 million sequence variants with available LD scores, we regressed the χ2 statistics from our GWAS scan against the LD score and used the intercept as a correction factor. The estimated correction factor for AOM, based on LD score regression, was 0.97 for the recessive model in the Icelandic sample, 1.01 in the UK, 1.01 in Denmark and 1.02 in Norway.
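A fixed-effects inverse-variance meta-analysis weights each population's effect estimate by the inverse of its squared standard error. The following is a minimal sketch of that calculation together with Cochran's Q as a heterogeneity statistic. The per-cohort numbers are hypothetical placeholders, not values from the study, and the function is not the software the authors used.

```python
import math
from typing import List, Tuple


def fixed_effects_meta(estimates: List[Tuple[float, float]]):
    """Combine (effect, standard error) pairs by inverse-variance weighting.

    Returns the pooled effect, its standard error, a two-sided P value
    (normal approximation) and Cochran's Q heterogeneity statistic.
    """
    weights = [1.0 / se ** 2 for _, se in estimates]
    total_weight = sum(weights)
    pooled = sum(w * b for w, (b, _) in zip(weights, estimates)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    z = pooled / pooled_se
    p_value = math.erfc(abs(z) / math.sqrt(2))
    # Q is compared against a chi-square distribution with k - 1 degrees of freedom.
    q = sum(w * (b - pooled) ** 2 for w, (b, _) in zip(weights, estimates))
    return pooled, pooled_se, p_value, q


# Hypothetical per-cohort recessive effects in s.d. units (NOT from the paper).
cohorts = [(-1.7, 0.25), (-1.3, 0.45), (-1.5, 0.90), (-1.2, 1.10)]
pooled, pooled_se, p_value, q = fixed_effects_meta(cohorts)
print(pooled, pooled_se, p_value, q)
```

Because the weights are inverse variances, the largest cohort dominates the pooled estimate, and the pooled standard error is always smaller than that of any single cohort.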
We report the effect estimates for POI and EM phenotypes against population controls and as a categorical trait among women who reported AOM (AOM < 40 versus AOM ≥ 40; AOM < 45 versus AOM ≥ 45). The effect estimates from the two methods do not differ significantly and lead to the same conclusion (Phet > 0.25).
We applied genome-wide significance thresholds corrected for multiple testing using an adjusted Bonferroni procedure weighted for variant classes and predicted functional impact. With 39,281,741 sequence variants being tested in the meta-analysis, the class weights were rescaled to control the family-wise error rate. The adjusted significance thresholds are 2.0 × 10−7 for variants with high impact (n = 9,910), 4.0 × 10−8 for variants with moderate impact (n = 202,465), 3.7 × 10−9 for low-impact variants (n = 3,244,032), 1.8 × 10−9 for other variants in DNase I hypersensitivity sites (n = 5,001,568) and 6.1 × 10−10 for all other variants (n = 30,823,766).
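In a weighted Bonferroni scheme, the per-class thresholds multiplied by the number of variants in each class should sum to the overall family-wise error rate. The sketch below checks this with the counts and thresholds listed above. The target of 0.05 is an assumption made here, since the text does not state the family-wise error rate explicitly.

```python
# (number of variants, significance threshold) per class, as listed in the text.
variant_classes = {
    "high impact": (9_910, 2.0e-7),
    "moderate impact": (202_465, 4.0e-8),
    "low impact": (3_244_032, 3.7e-9),
    "other, DNase I hypersensitivity sites": (5_001_568, 1.8e-9),
    "all other": (30_823_766, 6.1e-10),
}

total_variants = sum(n for n, _ in variant_classes.values())

# Sum of n x threshold over classes: the Bonferroni bound on the
# family-wise error rate implied by these thresholds.
family_wise_error = sum(n * t for n, t in variant_classes.values())

print(total_variants)               # matches the 39,281,741 variants tested
print(round(family_wise_error, 4))  # close to 0.05 (assumed target)
```

The class counts add up to the total number of variants tested, and the thresholds given to two significant figures imply a family-wise error rate just under 0.05.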
UK Biobank participants were first grouped by birth country. We then defined regional ancestry groupings with the aim that the groups be representative of the region’s current population, be homogeneous by genetic ancestry and have at least 200 individuals (for accurate estimation of variant frequencies).
In some cases, we split off ancestry-based groupings representing distinct populations or unrepresentative migrant communities (for example, South Asian ancestry born in Africa and West Asia) to achieve homogeneous birthplace-based groupings. Groups depicted on the maps in Supplementary are those best representing the current demographic majority. If countries had fewer than 200 participant birthplaces, we merged them with neighboring countries with similar assessed ancestry profiles. Map geometries were obtained via R package maps and manipulated. The maps in Supplementary are sourced from Natural Earth (https://www.naturalearthdata.com/about/terms-of-use/).