Despite the breadth of the evidence of hormonal, non-hormonal and energy-based interventions for GSM, few studies evaluated identical combinations of interventions, comparators, and outcomes and there were often important limitations in study design and outcome reporting, leading to low or very low COE for most conclusions. In general, vaginal estrogen, vaginal DHEA, oral ospemifene, and vaginal moisturizers may all improve at least some GSM symptoms, though the effects versus placebo appear to be generally modest. However, the evidence does not currently support the efficacy of CO2 laser, Er:YAG laser, vaginal or systemic testosterone, vaginal oxytocin, or oral raloxifene or bazedoxifene for any GSM symptoms. There was not a strong signal for frequent serious adverse effects in short-term, relatively small studies, but lack of long-term data on endometrial safety of hormonal interventions represents a critical gap. Evidence supports current practice to try low-cost over-the-counter vaginal moisturizers and lubricants first for most GSM symptoms. Future studies would be strengthened by a standard definition and uniform diagnostic criteria for GSM, a common set of validated outcome measures and reporting standards, and attention to clinically relevant patient populations and intervention comparisons.
From 11,993 unique search results, we identified 172 eligible articles, consisting of 151 unique studies. Most eligible studies were of hormonal interventions and addressed vulvovaginal atrophy symptoms or sexual function. We identified zero studies addressing KQ1, 155 addressing KQ2, and 120 addressing KQ3. Zero studies directly addressed KQ4 and KQ5, so indirect evidence for those questions was reviewed in the KQ2 and KQ3 studies, respectively.
For hormonal and energy-based interventions, and vaginal moisturizers, we provided GRADE certainty of evidence (COE) ratings for 8 patient-centered outcomes identified by the Core Outcomes in Menopause (COMMA) review as most important to patients and clinicians: (1) pain with sex, (2) vulvovaginal dryness, (3) vulvovaginal discomfort/irritation, (4) dysuria, (5) change in Most Bothersome Symptom (MBS), (6) distress, bother, or interference of genitourinary symptoms, (7) satisfaction with treatment, and (8) treatment side effects. We derived COE from statistical rather than clinical significance or effect magnitude, in part because validated measures of clinically meaningful differences do not exist for most outcomes. We rated COE as Very Low, Low, Moderate, or High.
| COE rating (downgrade footnotes) | Finding |
|---|---|
| Low (a,b) | Improves |
| Very Low (a,b,c) | Uncertain |
| Low (a,c) | No difference |
| Low (a,b) | No difference |
| Low (a,c) | Improves |
| Moderate (d) | No difference |
| Low (a,d) | Higher |
| Low (a,c) | No difference |
| Low (a,d) | Improves |
| Low (a,d) | No difference |
| Low (a,d) | No difference |
| Low (a,d) | Improves |
| Low (a,d) | No difference |
| Very Low (a,c,d) | Uncertain |
| Low (a,d) | No difference |
| Low (a,d) | No difference |
| Moderate (a) | Higher |
| Low (a,e) | Improves |
| Very Low (a,f) | Uncertain |
| Low (a,e) | Improves |
| Very Low (a,b,e) | Uncertain |
| Low (a,d) | Improves |
| Low (a,c) | More |
| Very Low (e,f) | Uncertain |
| Very Low (c,f) | Uncertain |
| Very Low (a,b,d) | Uncertain |
| Very Low (a,f) | Uncertain |
| Very Low (a,b,d) | Uncertain |
| Moderate (d) | No difference |
| Low (c,d) | No difference |
| Low (a,b) | Improves |
| High | No difference |
| Low (a,b) | Improves |
| High | Higher |
| Low (a,c) | No difference |
| Very Low (a,f) | Uncertain |
| Low (f) | No difference |
| Very Low (a,c,d) | Uncertain |
| Low (a,d) | Improves less |
| Very Low (a,c,f) | Uncertain |
| Very Low (d,f) | Uncertain |
| Very Low (d,f) | Uncertain |
| Very Low (c,f) | Uncertain |
| Low (c,d) | No difference |
| Low (f) | No difference |
| Very Low (a,c,d) | Uncertain |
| Low (a,c) | Improves |
| Very Low (a,b,c) | Uncertain |
| Low (a,d) | No difference |
| Very Low (a,c,d) | Uncertain |
| Moderate (a) | No difference |
| Very Low (a,b,d) | Uncertain |
| Very Low (a,b,d) | Uncertain |
| Low (a,d) | No difference |
| Very Low (a,b,d) | Uncertain |
| Low (a,d) | No difference |
| Low (a,d) | No difference |
| Very Low (a,c,d) | Uncertain |
| Very Low (a,d,g) | Uncertain |
| Low (c,d) | No difference |
| Low (f) | No difference |
| Low (f) | No difference |
| Low (c,d) | No difference |
| Low (f) | No difference |
| Very Low (c,f) | Uncertain |
| Very Low (f,g) | Uncertain |
| Very Low (a,f) | Uncertain |
| Very Low (a,f) | Uncertain |
| Very Low (a,f) | Uncertain |
| Very Low (a,f) | Uncertain |
| Very Low (a,c,f) | Uncertain |
| Very Low (a,c,d) | Uncertain |
| Low (f) | No difference |
| Very Low (f,g) | Uncertain |
| Very Low (a,c,f) | Uncertain |
| Very Low (a,c,f) | Uncertain |
| Very Low (a,c,f) | Uncertain |
| Very Low (a,c,f) | Uncertain |
| Very Low (a,f,g) | Uncertain |
| Very Low (a,f) | Uncertain |
Explanations for downgrading:
- a. Downgraded one level for study limitations (one or more trials assessed as "some concerns" for RoB)
- b. Downgraded one level for inconsistency (effect varied across trials)
- c. Downgraded one level for study limitations (one or more trials did not provide adequate statistical reporting)
- d. Downgraded one level for imprecision (total sample size less than OIS of 400)
- e. Downgraded one level for imprecision (SD crosses no-effect threshold)
- f. Downgraded two levels for imprecision (total sample size much less than OIS of 400)
- g. Downgraded one level for imprecision (no or few events)
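The arithmetic behind these ratings can be sketched in a few lines of Python. This is an illustration only, not the review's own tooling. It assumes that the footnote letters a through g follow the order of the explanations listed above, that each body of evidence starts at High, and that ratings cannot fall below Very Low. The names `LEVELS`, `DOWNGRADE`, and `coe_rating` are hypothetical.

```python
# Certainty levels ordered from lowest to highest.
LEVELS = ["Very Low", "Low", "Moderate", "High"]

# Assumed mapping of footnote letter -> number of levels downgraded,
# inferred from the order of the explanations above (f is the only
# two-level downgrade).
DOWNGRADE = {"a": 1, "b": 1, "c": 1, "d": 1, "e": 1, "f": 2, "g": 1}


def coe_rating(footnotes: str, start: str = "High") -> str:
    """Apply the downgrades for a comma-separated footnote string."""
    letters = [x.strip() for x in footnotes.split(",") if x.strip()]
    total = sum(DOWNGRADE[x] for x in letters)
    # Floor at the lowest level: a rating cannot drop below Very Low.
    index = max(LEVELS.index(start) - total, 0)
    return LEVELS[index]


print(coe_rating("a,b"))  # two one-level downgrades
print(coe_rating("a,f"))  # one-level plus two-level downgrade
```

Under these assumptions, every rating and footnote combination in the table above is reproduced; for example, footnotes "a,b" give Low and footnote "d" alone gives Moderate.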
Key Questions
After discussion with key informants and our team's content and methods experts, we chose to interpret the term "screening" in KQ1 as identifying underreported, symptom-based conditions (similar to screening for anxiety and depression), rather than "screening" for an asymptomatic condition. Based on input from public commenters, key informants, and members of a Technical Expert Panel, we drafted the following key questions.
Key Question 1: What is the effectiveness and harms of screening strategies to identify GSM in postmenopausal women? Does screening impact patient reported symptoms or improve QoL?
Key Question 2: What is the effectiveness and comparative effectiveness of hormonal, non-hormonal, and energy-based interventions when used alone or in combination for treatment of GSM symptoms? Which treatments show improvement for which symptoms?
Key Question 3: What are the harms (and comparative harms) of hormonal, non-hormonal, and energy-based interventions for GSM symptoms?
Key Question 4: What is the appropriate follow-up interval to assess improvement, sustained improvement, or regression of symptoms of GSM in women treated with hormonal, non-hormonal, and energy-based interventions?
Key Question 5: What is the effectiveness, comparative effectiveness, and harms of endometrial surveillance among women who have a uterus and are using hormonal therapy for GSM?
For all key questions, how do the findings vary for women with a history of breast cancer or other hormone-related cancers, a high risk of cancer, or conditions such as primary ovarian insufficiency, women experiencing surgical menopause, gender diverse individuals, and within subgroups defined by severity of GSM symptoms, and patient characteristics (i.e., by age, race, socioeconomic status, etc.)?
Broader Context
The diversity and volume of the included literature reflect the evolving GSM nomenclature as well as the ongoing deliberation in the field about the definition of GSM as a syndrome (versus a collection of one or more symptoms), the number and type of symptoms and/or physical signs required to make a diagnosis of GSM, and the causal relationship between menopause and GSM symptoms. Relatively few studies focused on patients with sexual (k=6) or urinary (k=3) symptoms only, while the majority included patients with vulvovaginal or some combination of symptoms. Only 21% of included studies required a prior clinical diagnosis related to GSM or VVA (or associated symptoms) for participant inclusion, with an additional 13% verifying eligibility via physical exam or questionnaire. Two-thirds of studies relied on self-reported GSM symptoms, perhaps more reflective of clinical practice, with inconsistent inclusion requirements for an elevated vaginal pH or vaginal atrophy on epithelial maturation evaluation. The broad definition of GSM applied to this review increased the scope of applicable studies and may be consistent with how GSM is defined by clinicians in practice. However, it likely also increased heterogeneity of studies and limited strength of evidence for synthesis of findings.
The large number of non-hormonal therapies included in our evidence map (46 unique interventions) may reflect a broader interest in complementary and alternative therapies. Enthusiasm for hormonal menopausal therapies has waxed and waned historically, with increased interest described recently in the lay press.
Implications for Clinical Practice
The findings detailed in this systematic review summarize and evaluate the existing literature about GSM screening and treatment.
For Key Question (KQ) 1 (screening for GSM), we found no RCTs or prospective observational studies with a concurrent control group that evaluated the potential effectiveness and harms of screening for GSM. Though screening (or case-finding) is generally thought to be low-risk, it may have trade-offs to consider, including the medicalization of a natural process,9 the costs of treatment (psychological and monetary), and the potential side-effects of treatment. That said, many women report being bothered by GSM symptoms but not discussing them with their clinicians; it is possible that asking women about the presence of symptoms may lead to symptom improvement for some women using a relatively low-risk, low-cost local treatment.
For KQ2 and KQ3 (efficacy, comparative effectiveness, and harms of interventions to treat GSM symptoms), vaginal estrogen, vaginal DHEA, ospemifene, and vaginal moisturizers may all improve at least some GSM symptoms. However, evidence does not clearly demonstrate the efficacy of vaginal oxytocin, vaginal or systemic testosterone, oral raloxifene or bazedoxifene, or energy-based therapies such as CO2 or Er:YAG laser for any GSM symptoms. Harms reporting for most interventions was limited, in part, by studies not being sufficiently powered to evaluate infrequent but serious harms, though most studies did not report frequent serious harms. As noted earlier, the populations enrolled in these studies, and the severity of their symptoms, may not fully reflect patients seen in many clinical settings, especially those in primary care. Clinicians may choose to tailor treatment based on side effects, personal risk factors (e.g., cancer history), insurance coverage or cost, availability of treatments, and patient preference for route or type of therapy. Evidence supports current practice to try low-cost over-the-counter vaginal lubricants and moisturizers first for most GSM symptoms. A broad review of efficacy and harms of non-hormonal treatments other than moisturizers, particularly herbal and botanical supplements, was hindered by study limitations.
For KQ4 (timing of evaluations), we found no studies that directly addressed appropriate follow-up intervals, but limited evidence suggests that symptoms began improving within 1-2 months for effective treatments and continued to improve through 12 weeks (the average length of study follow-up). The frequent evaluations and the timing selected in trials may not be practical for many patients and clinicians. Few studies evaluated outcomes beyond 6 months, so there is little empirical evidence to guide clinicians on prolonged therapy.
For KQ5 (endometrial surveillance), we found no studies that directly addressed the effectiveness and harms of endometrial surveillance with respect to patient-centered outcomes. In more than half of trials that used active or passive surveillance, vaginal estrogen was associated with cases of vaginal bleeding, a nominal increase in endometrial thickness, proliferative endometrium, and one case of endometrial hyperplasia in a polyp. Importantly, no studies performed transvaginal ultrasound or endometrial biopsy in women receiving vaginal estrogen for more than 12 weeks. Ospemifene was associated with thickened endometrial lining, proliferative endometrial histology, and one case of endometrial hyperplasia in studies up to 1 year. Limited evidence suggests that vaginal DHEA, vaginal oxytocin, oral bazedoxifene and raloxifene, and vaginal testosterone are not associated with clinically relevant endometrial stimulation in primarily short-term studies and three 1-year studies. Notably, all the trials that used transvaginal ultrasound or endometrial biopsy for active endometrial surveillance also excluded patients with any baseline abnormalities on ultrasound or biopsy. However, it is not standard clinical practice to perform a screening transvaginal ultrasound or endometrial biopsy to rule out endometrial abnormalities prior to initiating hormonal GSM treatments, so it is unclear how these trial findings would generalize to an unscreened clinical population. There is little evidence to guide clinicians on endometrial surveillance in clinical practice.
This review does not provide cost information.
Limitations of the Evidence Base
The main strength of this evidence base is the breadth of US-available interventions that have been tested for a wide array of GSM symptoms over the past 40 years. We identified higher quality (e.g., low or some concerns RoB) evidence examining the use of multiple formulations of vaginal estrogen, 6 different non-estrogen hormonal therapies, vaginal moisturizers, and 3 different energy-based therapies. Forty-five percent of included studies were industry-sponsored, and 15% did not report the source of funding. We identified limitations of the studies that underwent full data extraction and of the non-hormonal intervention studies related to populations studied, interventions and comparators evaluated, and outcomes reported, as described below. We also noted a lack of evidence related to screening for GSM, and only indirect evidence related to appropriate follow-up intervals and endometrial surveillance.
Populations. Overall, the literature focused on postmenopausal, predominantly white women. Though GSM symptoms tend to be more prevalent with increasing age, most studies enrolled women shortly after menopause, in their 50s and early 60s. Most women had moderate to severe baseline symptoms. Women were actively recruited, screened for study eligibility, and monitored for treatment adherence. Therefore, the applicability of the evidence to older women, racial and ethnic minorities, women with less severe symptoms, and women who may be less adherent to treatment is limited. Treatments with the potential for serious systemic adverse effects (such as venous thromboembolism) were generally not tested in higher-risk populations, such as those with a history of or risk factors for these conditions, or in older patients. Obese women, who are at higher risk for endometrial cancer and venous thromboembolism, were excluded from many studies. During topic refinement, our key informants and technical expert panel indicated interest in GSM treatment for diverse patient populations, including those from non-white racial and ethnic backgrounds, transgender patients, women who underwent surgical menopause, and women with a history of breast cancer. However, over half of studies did not report racial or ethnic backgrounds of included study participants. All remaining studies included at least 80% white participants and over a quarter of total studies included at least 90% white participants. Fewer than half of studies reported inclusion of women with a hysterectomy, while a quarter excluded women with past hysterectomy; the remainder were unclear. Studies inconsistently reported whether past hysterectomy included removal of the ovaries (inducing surgical menopause) or uterus alone (reducing the risk of endometrial hyperplasia), and only one-third of studies reported the proportion of participants per study arm without a uterus.
Studies did not commonly report on differences in outcomes for participants with surgical menopause compared with natural menopause. We found no studies in transgender populations, or among postpartum or lactating women, who may experience GSM symptoms due to a hypoestrogenic state. Over half of studies excluded women with a history of cancer or at high risk for cancer, while only 6 studies specifically recruited patients with active or treated cancer (2 energy-based, 1 vaginal DHEA, 1 vaginal moisturizer, 1 vaginal testosterone compared with placebo, and 1 vaginal testosterone compared with vaginal estrogen). Most studies were small, with median n=47 participants per treatment arm. Fewer than one-third of studies included over 100 participants per arm (only 6 studies had n=200-380 participants per arm). Trials of herbal and botanical supplements were almost all small, with nearly 90% of studies enrolling fewer than 100 participants. Finally, many studies, especially studies of energy-based interventions, recruited women from specialty care rather than primary care patient populations, and it was often unclear how many treatments women had previously tried for their symptoms.
Interventions. There was broad variability in dosing, formulations, and routes of delivery for study interventions. Most treatments were delivered vaginally, with the rest administered orally or transdermally. Even within intervention categories, variation in treatment dosing and route of delivery contributed to heterogeneity that largely precluded meta-analysis. For example, among the non-estrogen hormonal interventions, there were studies of vaginal and oral DHEA, and oral, transdermal, and vaginal testosterone. Several interventions were evaluated in combination with concurrent treatment with either estrogen or as-needed vaginal lubricant. Each of these treatment differences limited our ability to synthesize studies and draw conclusions with a higher COE. Vaginal moisturizers are a heterogeneous group of substances intended to replace or mimic vaginal secretions typical of a premenopausal state or alter the vaginal pH to better support a typical premenopausal vaginal flora. Though dozens of formulations of vaginal moisturizers are commercially available, we found only 4 with published studies that met inclusion criteria for review (many other moisturizer publications were excluded due to ineligible study population, lack of control arm, or inadequate length of follow-up). Finally, trials examining herbal and botanical supplements studied a broad variety of formulations and combination therapies that made synthesis challenging.
Comparators. The description of interventions used for control arms was highly variable and sometimes missing or incomplete. Many studies reported providing a "matched" placebo tablet, capsule, gel, or cream; others did not define the nature of the control or the composition of the placebo treatment. Many studies reported symptom improvement in participants who received lubricating vaginal placebo creams or gels. For this reason, studies with an oral placebo tablet, such as ospemifene trials, or no-treatment control groups, may have been more likely to demonstrate a difference between intervention and control groups. For a condition that is often treated symptomatically with vaginal lubricants, "placebo" creams and gels may represent an effective therapy, and the content of those creams or gels may be relevant for understanding the relative efficacy of the intervention under investigation.
Outcomes. In consultation with partners, key informants, and technical expert panel members, we focused on studies with patient-centered outcomes. Patient-centered outcomes are most relevant for clinicians and patients seeking symptom relief, and multiple studies have found that presence and severity of physical exam findings do not directly correlate with self-reported GSM symptoms. However, patient-centered outcomes are inherently subjective and the validity of many study-specific or visual analogue measures used remains unknown. GSM is defined broadly to include a multitude of vulvovaginal, sexual, and urinary symptoms, and there are no outcome measures that holistically assess GSM severity. Studies included a large number of outcomes assessed using a diverse array of measurement tools. Though the COMMA review has defined a set of core outcomes,49 it has yet to identify validated or preferred measures for assessing each outcome. Our ability to synthesize findings through meta-analysis was limited by the heterogeneity of measurement and reporting of outcomes. Among the measures frequently used, several use categorical/ordinal scales, which makes it difficult to interpret a change in scores, and few have an established threshold for minimal clinically important (noticeable) differences (MCID). Even when MCIDs are established, thresholds may vary by population, symptom severity, intervention type, and follow-up duration. For this reason, we derived our GRADE COE using a "noncontextualized" approach based on statistical measures of significance rather than a partially or fully contextualized approach that includes effect size magnitude or clinically meaningful changes. One of the most commonly used outcomes in GSM studies, MBS severity, was recommended by the FDA as a co-primary endpoint for vaginal estrogen studies in 2003.
However, we found that this outcome was reported in different ways throughout the literature: some studies included women with any one of several GSM symptoms as their MBS, and assessed change after treatment, while other studies limited inclusion criteria for participants or outcome assessment to one or two specific symptoms. Though studies within each intervention type generally used a common set of outcome measures, non-hormonal studies (described in the evidence map) used far more heterogeneous tools: 64 unique trials used 44 different outcome measures; 38 measures were used in only 1 or 2 publications. Additionally, though GSM treatments are often prescribed or recommended long-term, only three studies (all non-estrogen hormonal interventions) had a treatment duration of 1 year, while one additional study of CO2 laser included 12 months of follow-up. The remaining studies all followed participants for less than a year, with most providing just 12 weeks of follow-up. We evaluated only studies with at least 8 weeks of follow-up (eliminating 64 publications with shorter follow-up), but outcomes were still assessed at different timepoints in different studies, which made comparison and synthesis more difficult. We also identified variation and important limitations in data analyses and presentation of findings. Appropriate analyses should compare the change from baseline to follow-up in an outcome measure for participants in the intervention arm with the change from baseline to follow-up in that outcome measure for participants in the comparator arm (difference of the difference). However, many studies reported only the change from baseline to follow-up in the intervention arm (lacking a control) or reported only a comparison of the intervention and comparator arms at follow-up (a single time-point does not account for potential baseline differences). Such reporting can bias assessment of treatment effectiveness.
Finally, many studies did not provide any measures of statistical significance or confidence intervals around the point estimates of effect. Such omissions represent incomplete or inaccurate reporting in the individual studies and are a particular obstacle for readers without the time or experience to perform the calculations themselves. Incomplete reporting was particularly noticeable for harms; definitions and reporting of harms/adverse effects also varied widely.
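The "difference of the difference" comparison described above can be illustrated with a short Python sketch. The scores below are invented for illustration (a hypothetical 0-3 symptom severity scale with four participants per arm) and do not come from any included study. The function names are likewise hypothetical.

```python
from statistics import mean


def mean_change(baseline, followup):
    """Mean within-arm change from baseline to follow-up (paired scores)."""
    return mean(f - b for b, f in zip(baseline, followup))


def difference_of_differences(int_base, int_follow, ctl_base, ctl_follow):
    """Change in the intervention arm minus change in the comparator arm."""
    return mean_change(int_base, int_follow) - mean_change(ctl_base, ctl_follow)


# Hypothetical severity scores (0 = none, 3 = severe); lower is better.
int_base, int_follow = [3, 2, 3, 2], [1, 1, 2, 0]
ctl_base, ctl_follow = [2, 2, 3, 1], [1, 2, 2, 1]

# Between-arm comparison of within-arm changes.
did = difference_of_differences(int_base, int_follow, ctl_base, ctl_follow)

# Follow-up-only comparison, which ignores baseline differences.
followup_only = mean(int_follow) - mean(ctl_follow)

print(did, followup_only)
```

In this made-up example the intervention arm starts with worse symptoms, so comparing the arms only at follow-up gives a smaller apparent benefit than the difference of the differences. Reporting only the within-arm change for the intervention would err in the other direction, because it ignores the improvement also seen in the comparator arm.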
Strengths and Limitations of the Review Process
Our review process had several strengths. We applied a comprehensive and sensitive search filter in 3 large databases (MEDLINE, EMBASE, and CINAHL) in addition to searching reference lists of relevant systematic reviews. In every step of the review after abstract triage, at least two independent reviewers were involved. Discrepancies were discussed and resolved within the research team.
Our review process was also subject to several limitations. We limited our review to studies published in English (introducing potential language bias) and those evaluating interventions available in the US (limiting applicability to clinicians and patients outside the US). This restriction resulted in excluding studies of estriol (the 16-hydroxylated metabolite of estradiol), promestriene (a synthetic estrogen), and tibolone (which has estrogenic, progestogenic, and androgenic activity), all of which are available in many countries outside the US and have contributed to the evidence base for prior systematic reviews. Recent literature suggests that clinicians consider tailoring local estrogen and hormonal therapy based on specific effects and potential adverse effects of individual formulations. We also limited systemic estrogen studies to those comparing a GSM-specific treatment to systemic estrogen (rather than studies comparing routes/doses of systemic estrogen or systemic estrogen vs. placebo), which may have caused us to miss some studies that evaluated GSM symptoms as secondary outcomes (and symptoms for other indications as primary outcomes). We did not separate out the high-risk subgroups of interest (e.g., breast cancer survivors) in our analysis or GRADE. We highlighted which studies included these populations of interest, but GRADE assessments were organized by intervention, not population. We focused on intervention studies, but observational studies could provide additional insight into some patient-centered outcomes. We excluded some studies and prepared an evidence map of others to focus review scope. We based these decisions on discussion with our operational partners and TEP members and believe that our final group of included/analyzed studies represents those most clinically relevant and methodologically rigorous.
Implications for Future Research
A fundamental question for future research is whether common genitourinary symptoms after menopause represent a unified syndrome that can be cohesively diagnosed, studied, and managed. Women treated clinically for individual symptoms such as postmenopausal vulvovaginal dryness or dyspareunia may be different from those who receive a diagnosis of GSM for a clinical trial. Additionally, some genitourinary symptoms in older women are likely related to aging, not postmenopausal hormonal changes, and many postmenopausal women with urinary incontinence or dyspareunia may have experienced these symptoms prior to menopause. Clarifying the diagnostic criteria for GSM has important implications for labeling normal physiologic aging as a disease, and for the associated potential benefits and harms of health care interventions. Additional research is needed to inform the identification and management of GSM in clinical practice. Several professional organizations recommend screening all postmenopausal women for GSM, though there is no standard protocol for how to screen and we found no studies that directly assessed the potential benefits and harms of screening (KQ1).
Future research should directly address the treatment of GSM among subpopulations of interest, especially women with a history of breast cancer, women who have received or are receiving breast or urogenital cancer treatment, and women at high risk for cancer. Evaluating interventions among older and more racially and ethnically diverse women would also enhance our understanding of treatment effectiveness and harms. Long-term follow-up for efficacy, tolerability, and safety represents a critical gap needed to guide treatment longer than one year. Future studies could improve the reporting of adverse effects by reporting reasons for study drop out, assessing whether there is a statistical difference in AE severity and frequency between treatment arms, and assessing tissue-level changes in those with subjective AEs.
With the breadth of interventions studied, several treatment comparisons were notably missing. For example, we found no studies directly comparing the effectiveness of vaginal estrogen with vaginal moisturizers, vaginal DHEA, or ospemifene, and no studies of hybrid lasers or radiofrequency versus sham treatment. For energy-based treatments, future studies should evaluate the combination of laser treatments with other interventions (such as moisturizer), and study different dosing protocols and schedules. We identified 8 studies of various phytoestrogens compared with vaginal estrogen (see non-hormonal evidence map). A future synthesis comparing the effectiveness and harms of phytoestrogens to estrogens could be useful.
We identified a large and heterogeneous literature of studies evaluating non-hormonal interventions for symptoms of GSM. Studies varied extensively in populations enrolled, interventions and comparators assessed, and outcomes reported. Among those evaluating similar interventions, all used different compounds or doses and reported different symptom outcomes, making synthesis impossible. Therefore, we used an evidence map approach to summarize the existing literature by intervention type, according to the National Center for Complementary and Integrative Health (NCCIH) framework.
| Category | Characteristics | Educational Programs (n=5) * | Mind and Body Practices (n=8) * | Natural Products (n=45) * | Pharmaceutical (n=6) * | Total (n=64) |
|---|---|---|---|---|---|---|
| Sample Size | ≤60 | 2 | 2 | 16 | 2 | 22 |
| | 61-99 | 3 | 3 | 24 | 2 | 32 |
| | 100-199 | 0 | 2 | 3 | 2 | 7 |
| | 200-499 | 0 | 1 | 2 | 0 | 3 |
| Route of Administration | Oral | 0 | 0 | 22 | 5 | 27 |
| | Vaginal | 0 | 0 | 22 | 1 | 23 |
| | Other | 5 | 7 | 1 | 0 | 13 |
| Special Populations | Breast or gynecologic cancer | 3 | 3 | 2 | 1 | 9 |
| Country | Africa | 0 | 1 | 0 | 0 | 1 |
| | Asia (other) | 0 | 2 | 9 | 1 | 12 |
| | Asia (Iran) | 2 | 1 | 21 | 0 | 24 |
| | Australia | 1 | 0 | 0 | 0 | 1 |
| | Europe | 0 | 3 | 6 | 0 | 9 |
| | North America | 2 | 0 | 2 | 5 | 9 |
| | South America | 0 | 1 | 7 | 0 | 8 |
| Outcomes Reported** | Genital or vulvovaginal symptoms | 2 | 1 | 27 | 2 | 32 |
| | Urinary symptoms | 1 | 5 | 13 | 4 | 23 |
| | Sexual symptoms | 4 | 6 | 34 | 3 | 47 |
| | Psychological symptoms | 3 | 4 | 3 | 1 | 11 |
| | QoL | 3 | 6 | 7 | 3 | 19 |
| | Adverse effects | 1 | 2 | 31 | 5 | 39 |
- One hundred seven publications of hormonal interventions, non-hormonal moisturizers, and energy-based interventions for genitourinary syndrome of menopause (GSM) were assessed for study quality and further synthesis; 66 additional publications of non-hormonal interventions were described in an evidence map without quality assessments.
- Vaginal estrogen, dehydroepiandrosterone (DHEA), and moisturizers as well as oral ospemifene may improve at least some GSM symptoms, primarily vulvovaginal dryness and, to a lesser extent, dyspareunia. Evidence does not demonstrate the efficacy of vaginal or systemic testosterone, vaginal oxytocin, oral raloxifene or bazedoxifene, or energy-based therapies. No treatment significantly improved vaginal discomfort/irritation or dysuria. Placebo effect was high, particularly in studies using a lubricating vaginal gel or cream placebo.
- A broad array of non-hormonal interventions other than moisturizers, including natural products, mind/body practices, and educational interventions, have been tested for various GSM symptoms (particularly sexual function outcomes) in mostly small non-U.S. trials.
- Studies differed in GSM definitions and diagnosis, enrollment criteria, and outcomes assessed for a broad range of vulvovaginal, sexual, and urinary symptoms using heterogeneous and poorly validated patient-reported measurement tools.
- Harms reporting was limited by sample sizes and short followup duration, though most studies did not find frequent serious harms. Most studies of hormonal interventions followed participants for 12 weeks; the longest followup period for any intervention was 12 months.
- Few studies enrolled women with a history of breast or gynecologic cancers.
- No studies evaluated the benefits or harms of screening for GSM, the timing of evaluation for response to treatment, or the benefits or harms of endometrial surveillance for patients using hormonal therapies for GSM.
Objectives. To conduct a systematic review of evidence regarding genitourinary syndrome of menopause (GSM) screening, treatment, and surveillance.
Data sources. Ovid/Medline®, Embase®, and EBSCOhost/CINAHL® from database inception through December 11, 2023.
Review methods. We employed methods consistent with the Agency for Healthcare Research and Quality Evidence-based Practice Center Program Methods Guidance to identify studies and synthesize findings for Key Questions related to screening for GSM, effectiveness and harms of U.S.-available interventions for GSM, appropriate followup intervals for patients using GSM treatments, and endometrial surveillance for patients using hormonal GSM treatments. For vaginal estrogen and vaginal or systemic non-estrogen hormonal interventions, energy-based interventions, and vaginal moisturizers, we first assessed study quality and then, for moderate or high-quality studies, reviewed outcomes related to GSM symptoms, treatment satisfaction, and adverse effects. For low-quality studies, we described limited study characteristics only. For studies of other non-hormonal interventions, we created an evidence map describing study characteristics without assessing study quality.
Results. After assessing 107 publications for risk of bias (RoB), we extracted and synthesized effectiveness and/or harms outcomes from 68 publications describing trials or prospective, controlled observational studies that were rated low, some concerns, or moderate RoB (24 estrogen publications, 35 non-estrogen, 11 energy-based, and 4 moisturizers). Of 39 high, serious, or critical RoB publications, we extracted long-term harms from only 15 uncontrolled studies of energy-based interventions (all serious or critical RoB due to confounding). An additional 66 publications evaluating 46 non-hormonal interventions, including natural products, mind/body practices, and educational interventions, were described in an evidence map. Across all 172 publications, studies differed in GSM definitions, diagnosis, enrollment criteria, and outcomes assessed. Few studies enrolled women with a history of breast or gynecologic cancers. Overall, we found that vaginal estrogen, vaginal dehydroepiandrosterone (DHEA), vaginal moisturizers, and oral ospemifene may all improve at least some GSM symptoms, while evidence does not demonstrate the efficacy of energy-based therapies, vaginal or systemic testosterone, vaginal oxytocin, or oral raloxifene or bazedoxifene for any GSM symptoms. Harms reporting was limited, in part, by studies not being sufficiently powered to evaluate infrequent but serious harms, though most studies did not report frequent serious harms. Common non-serious adverse effects varied by treatment and dose. No studies evaluated GSM screening or directly addressed appropriate followup intervals or the effectiveness and harms of endometrial surveillance among women with a uterus receiving hormonal therapy for GSM. The longest followup period for active endometrial surveillance in an included trial was 12 weeks (vaginal estrogen) or 1 year (non-estrogen hormonal interventions).
Conclusions. This systematic review provides comprehensive, up-to-date information to guide patients, clinicians, and policymakers regarding GSM. Despite the breadth of included studies, findings were limited by several factors, including heterogeneity in intervention-comparator-outcome combinations. Future studies would be strengthened by a standard definition and uniform diagnostic criteria for GSM, a common set of validated outcome measures and reporting standards, and attention to clinically relevant populations and intervention comparisons. Lack of long-term data assessing efficacy, tolerability, and safety of GSM treatments leaves postmenopausal women and clinicians without evidence to guide treatment longer than 1 year.
This evidence review was funded by the Patient-Centered Outcomes Research Institute under Contract No. 75Q80120D00008 from the Agency for Healthcare Research and Quality, U.S. Department of Health and Human Services, through a Memorandum of Agreement Amendment, number 20-603M-23.
Danan ER, Sowerby C, Ullman KE, et al. Hormonal treatments and vaginal moisturizers for genitourinary syndrome of menopause: a systematic review. Ann Intern Med. 2024 Sep 10. DOI: 10.7326/ANNALS-24-00610
Ullman KE, Diem S, Forte ML, et al. Complementary and alternative therapies for genitourinary syndrome of menopause: an evidence map. Ann Intern Med. 2024 Sep 10. DOI: 10.7326/ANNALS-24-00603
Danan ER, Diem S, Sowerby C, Ullman K, Ensrud K, Landsteiner A, Greer N, Zerzan N, Anthony M, Kalinowski C, Forte M, Abdi HI, Friedman JK, Nardos R, Fok C, Dahm P, Butler M, Wilt TJ. Genitourinary Syndrome of Menopause. Comparative Effectiveness Review No. 272. (Prepared by the Minnesota Evidence-based Practice Center under Contract No. 75Q80120D00008.) AHRQ Publication No. 24-EHC022. PCORI® Publication No. 2024-SR-02. Rockville, MD: Agency for Healthcare Research and Quality; July 2024. DOI: https://doi.org/10.23970/AHRQEPCCER272. Posted final reports are located on the Effective Health Care Program search page.