Premature progesterone and freeze all cycles: do we really need randomized controlled trials?

"Consider This"



Micah J. Hill, D.O., Mae Wu Healy, D.O.

Consider This:

Over the past 6 years, evidence has increased that premature progesterone elevation decreases live birth in IVF cycles. The putative mechanism is advancement of the endometrium, leading to endometrial-embryo asynchrony. This is supported by evidence suggesting that premature progesterone elevation affects the endometrium at several detectable levels: gene expression, siRNA, implantation markers, histologic features, and ultrasonographic characteristics (1). This negative effect is summarized in a meta-analysis of observational data including over 60,000 IVF cycles and should be considered incontrovertible (2). It has further been observed that this negative effect does not occur in oocyte donor-recipient cycles, suggesting that removing the endometrium from the negative hormonal milieu resolves the problem of premature progesterone elevation. This is further supported by data showing no association of premature progesterone elevation with oocyte or embryo quality. This leads to the logical postulation that freezing all embryos and performing a subsequent frozen embryo transfer (FET) cycle is the appropriate management of premature progesterone elevation. We have published observational data supporting this hypothesis, clearly showing a negative effect of premature progesterone elevation in fresh but not subsequent frozen cycles (3). However, only one small randomized controlled trial (RCT) has examined this hypothesis, and it was underpowered to detect a difference between fresh and frozen transfer strategies (4). A study powered to detect a difference in live birth of 50% in FET cycles versus 25% in fresh cycles in patients with premature progesterone elevation would require only 110 subjects to be randomized. The question is: do we really need a large RCT to explore this hypothesis?
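For readers who want to check the 110-subject figure, the sketch below applies the standard normal-approximation formula for comparing two independent proportions. The two-sided alpha of 0.05, the 80% power, and the equal allocation are assumptions, since the text does not state which values were used; the function name `subjects_per_arm` is purely illustrative.

```python
from math import ceil
from statistics import NormalDist


def subjects_per_arm(p_control, p_treatment, alpha=0.05, power=0.80):
    """Per-arm sample size for a two-sided comparison of two proportions.

    Uses the normal approximation with equal allocation. The defaults for
    alpha and power are assumptions, not values stated in the article.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the test
    z_beta = NormalDist().inv_cdf(power)           # quantile for the desired power
    variance = p_control * (1 - p_control) + p_treatment * (1 - p_treatment)
    effect = p_treatment - p_control
    return ceil((z_alpha + z_beta) ** 2 * variance / effect ** 2)


# Live birth of 25% with fresh transfer versus 50% with FET, as in the text.
n_arm = subjects_per_arm(0.25, 0.50)
total = 2 * n_arm
print(n_arm, total)
```

Under these assumptions the calculation gives 55 subjects per arm, or 110 in total, which matches the number quoted above. A smaller expected difference or higher desired power would raise the requirement.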

The principle of equipoise states that clinical trials should only be undertaken if there is genuine uncertainty about whether the intervention is of benefit. In the context of this debate, there is strong evidence that premature progesterone elevation negatively impacts fresh IVF cycles. There is strong evidence that this effect does not carry over into subsequent FET cycles. The combined evidence and biologic plausibility suggest with great likelihood that the negative effect of premature progesterone elevation can be ameliorated by freezing embryos and subsequent FET. So do we need an RCT to demonstrate this before we implement that strategy into our practices? Is it ethical, based on the principle of equipoise, to even perform such an RCT? Our group has published several studies investigating the effect of premature progesterone elevation, so such an RCT was the logical next step for us. However, we recently decided that it would be unethical to conduct an RCT to investigate fresh versus frozen transfer in the presence of premature progesterone elevation. The evidence against fresh embryo transfer in these patients is overwhelming, and there was great certainty that half of the subjects would be exposed to unnecessary harm.

We know prematurely elevated progesterone is bad in fresh cycles. We know this effect does not carry over to frozen cycles. Do we need an RCT to tell us that a freeze-all strategy is the best way to manage cycles with premature progesterone elevation? I believe the answer is no. As recently stated by Braakhekke et al. in their review of equipoise applied to reproductive medicine: “If it is evident that a treatment ‘works’ based on insight into the underlying pathophysiologic processes and a few clinical observations, it is unnecessary and even unethical to perform an RCT” (5).

  1. Labarta E, Martinez-Conejero JA, Alama P, Horcajadas JA, Pellicer A, Simon C, Bosch E. Endometrial receptivity is affected in women with high circulating progesterone levels at the end of the follicular phase: a functional genomics analysis. Hum Reprod 2011;26:1813-25.
  2. Venetis CA, Kolibianakis EM, Bosdou JK, Tarlatzis BC. Progesterone elevation and probability of pregnancy after IVF: a systematic review and meta-analysis of over 60,000 cycles.
  3. Healy MW, Patounakis G, Connell MT, Devine K, DeCherney AH, Levy MJ, Hill MJ. Does a frozen embryo transfer ameliorate the effect of elevated progesterone seen in fresh transfer cycles? Fertil Steril 2016;105:93-9.
  4. Yang S, Pang T, Li R, Yang R, Zhen X, Chen X, et al. The individualized choice of embryo transfer timing for patients with elevated serum progesterone level on the HCG day in IVF/ICSI cycles: a prospective randomized clinical study. Gynecol Endocrinol 2015;31:355-8.
  5. Braakhekke M, Mol F, Mastenbroek S, Mol BW, van der Veen F. Equipoise and the RCT. Hum Reprod, in press.

Fertility and Sterility

Editorial Office, American Society for Reproductive Medicine

Fertility and Sterility® is an international journal for obstetricians, gynecologists, reproductive endocrinologists, urologists, basic scientists and others who treat and investigate problems of infertility and human reproductive disorders. The journal publishes juried original scientific articles in clinical and laboratory research relevant to reproductive endocrinology, urology, andrology, physiology, immunology, genetics, contraception, and menopause. Fertility and Sterility® encourages and supports meaningful basic and clinical research, and facilitates and promotes excellence in professional education, in the field of reproductive medicine.


Micah J Hill
about 4 years ago
I look forward to any comments or debate about this topic. Please share your views!
Kenan Omurtag
about 4 years ago
This is hard to argue against. mic(ah) drop
Mario Vega Croker
about 4 years ago
This is a very valid point. At this stage, the evidence is solid; what remains somewhat unclear is the level of progesterone (ng/mL) that should be used as a cut-off. I believe I read a paper from your group (NIH) using a cut-off of 2 ng/mL; other papers quote 1.5 ng/mL, and in real-world practice physicians start getting nervous once it approaches 1.3 ng/mL. Based on your review of the literature, do you offer a strict cut-off or do you go by the behavior of the levels (steady increase vs. rapid increase)?
Micah J Hill
about 4 years ago
Thank you for the comments. On the cutoffs question, we have used 2 ng/ml as our threshold from a clinical perspective and in our studies. The latest meta-analysis by Venetis et al. shows significance when P crosses 0.8 ng/ml. As with many screening tests, I believe the question of where to set the P threshold should take into account the clinical ramifications of acting on an abnormal test and the PPV and NPV at those thresholds. The consequences of a false positive P test include the added time delay and cost of an unnecessary freeze all. The consequence of a false negative P test is transferring an embryo into an endometrium with reduced live birth potential. So where a practice sets its thresholds might vary based on how it values the consequences of those false test outcomes, how good its FET rates are, and even how many supernumerary embryos a patient has.
We are halfway through writing a paper that explores these questions. We identified 16 statistical methods for establishing thresholds and applied them to our P data. We are presenting the data at SRI next month. I can demonstrate a statistical reduction in live birth when P crosses 0.7 ng/ml with a large enough data set, but that threshold doesn't make much sense because the majority of patients would be classified as having abnormal test results. On the other end of the spectrum, I can identify thresholds over 3 ng/ml using the 95th percentile of donor stimulations, but clearly a threshold this high leaves many autologous patients at risk. There was a cluster of statistical tests that identified thresholds between 1.5 and 2 ng/ml. At these thresholds, cost analysis also favors freeze all, and the number of patients classified as having an abnormal test would be clinically reasonable.
It's a long answer to your question and doesn't include a discussion of the increased coefficients of variation in P assay results at these lower serum levels. But hopefully our paper will be done soon and you can critically assess the various thresholds we propose for yourself.
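To make the PPV/NPV trade-off concrete, here is a minimal sketch of how those values shift as the threshold moves. The ten cycles in `cycles` are invented purely for illustration and are not data from any study; the helper name `predictive_values` is also hypothetical. As a simplifying assumption, a "positive" test is P at or above the threshold, and the event it is taken to predict is a fresh transfer without a live birth.

```python
def predictive_values(cycles, threshold):
    """Return (PPV, NPV, fraction flagged) for a progesterone threshold.

    cycles: list of (progesterone_ng_ml, live_birth) pairs.
    A test is positive when progesterone >= threshold; the predicted event
    is no live birth (a simplification made for this illustration).
    """
    tp = fp = tn = fn = 0
    for progesterone, live_birth in cycles:
        positive = progesterone >= threshold
        if positive and not live_birth:
            tp += 1  # flagged, and the fresh transfer did fail
        elif positive and live_birth:
            fp += 1  # flagged, but a live birth occurred
        elif not positive and live_birth:
            tn += 1  # not flagged, live birth occurred
        else:
            fn += 1  # not flagged, but no live birth
    ppv = tp / (tp + fp) if (tp + fp) else None
    npv = tn / (tn + fn) if (tn + fn) else None
    flagged = (tp + fp) / len(cycles)
    return ppv, npv, flagged


# Hypothetical cycles, invented for illustration only.
cycles = [
    (0.5, True), (0.6, False), (0.8, True), (0.9, True), (1.1, False),
    (1.3, True), (1.6, False), (1.8, True), (2.1, False), (2.5, False),
]

for threshold in (0.7, 1.5, 2.0):
    print(threshold, predictive_values(cycles, threshold))
```

In this toy data set, lowering the threshold flags most cycles as abnormal while the PPV falls, which is the same pattern described above for a 0.7 ng/ml cutoff. Real thresholds would of course need real cycle data and the assay-specific considerations discussed in this thread.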
Alexander Quaas
about 4 years ago
This is a very nicely written paper on the subject of premature progesterone elevation and its effect on live birth in fresh IVF cycles. More specifically it is an argument against the need for RCTs, when such convincing and extensive observational data are already available.
At the current time, some practices routinely check P levels in fresh IVF cycles and some don’t.
Therefore, other questions that arise on this topic are:
1) Should the standard of care be to check P levels in fresh IVF cycles for all IVF practices? Do practices that routinely check P levels have better outcomes?
2) If providers (in the absence of RCT evidence) start adopting a freeze-all strategy for premature P elevations, should each practice develop its own practice-specific cutoffs based on its data? Or should each practice adopt the cutoff that may eventually be suggested by experts such as Micah based on the statistical considerations in his comment? As we well know, there are variations in the assay, etc., from one practice to another.
3) Putting it more broadly: in general, what class of data is required to trigger a major change in practice for a clinic / provider? Adding an extra test routinely for each IVF monitoring patient, monitoring levels, setting up an internal QI program, etc., requires a lot of extra effort. Is the current evidence enough to justify this?
Ivan Valencia
about 4 years ago
The level of progesterone chosen to cancel the fresh transfer depends mainly on the platform you are using in your clinic to measure progesterone. You cannot extrapolate the results from IVI or another clinic to your own practice, simply because you are not using the same lab. You should look at your own pregnancy rates and progesterone levels measured in house to determine your own cutoff.
Ivan Valencia MD
Micah J Hill
about 4 years ago
Ivan, I agree with you as a general principle. However, for smaller programs it could take years of running progesterone levels to detect a threshold for a reduction in pregnancy. In this case, I think it's reasonable to look at the literature and choose published thresholds utilizing the same assay.
Richard Paulson
about 4 years ago
As Mark Twain noted, "It ain't what you don't know that gets you into trouble. It's what you know for sure that just ain't so." The critical issue is one of measurement of progesterone. The assays are simply not accurate (consistent, replicable) in the levels that are being measured. I suspect that there is likely an effect of premature luteinization on endometrial receptivity, but where the progesterone threshold lies is far from settled. There are lots of major programs that do not measure serum progesterone levels. Perhaps the missing piece is a study comparing the various progesterone measurement platforms against a true "gold standard," but what is that gold standard? Mass spectrometry? I'm not convinced.
Micah J Hill
about 4 years ago
Thank you for the comment, Dr. Paulson. I agree with the critique of progesterone assays and think this is the greatest current challenge in this area of research. But I think an argument can be made that assay limitations do not negate the utility of measuring progesterone. The progesterone immunoassays have coefficients of variation ranging from 6-8% at high levels to 20% at low levels. But limited accuracy is not unique to progesterone immunoassays; it is an issue for all steroid hormones. The coefficients of variation range from 8 to 13% for estradiol across IVF stimulation ranges, not markedly more accurate than progesterone assays. Since progesterone is reported in units 1000-fold larger than those used for estradiol (ng/ml versus pg/ml), a stimulated woman with an estradiol of 2000 pg/ml and a progesterone of 2 ng/ml has similar amounts of each hormone. So if we are comfortable using estradiol assays to manage our IVF cycles, why would the similar accuracy limitations in progesterone assays negate their clinical utility? The testosterone and AMH literature discusses similar issues, yet we utilize those assays with an understanding of their limitations.
From a practical/anecdotal perspective, our lab runs progesterone and estradiol on every blood draw for our IVF patients. Progesterone tends to be within 0.1 ng/ml on repeated daily measures in the same patient. In a small group of patients, it rises rapidly at the end of stimulation. But the assay results from day to day tend to be very similar. We have also had the lab run the same samples multiple times, and those results also tend to be the same or within 0.1 ng/ml. We have not had the same experience with AMH, which tends to vary much more both within and between samples.
In our initial publication on premature progesterone, blastocyst live birth rate dropped from 60% with the lowest levels of progesterone to <10% when progesterone was over 2 ng/ml. Similar results have been demonstrated repeatedly in numerous studies using several assays in many groups. This is summarized in the meta-analysis from Venetis of over 60,000 IVF cycles. Live birth is a highly complex event, with any individual variable having only a small ability to accurately predict that event. But these studies suggest that the predictive ability of progesterone is similar to or higher than that of other variables we routinely use. As an example, published data show progesterone has a higher predictive value for live birth than blastocyst morphology. Morphology also has high inter- and intraobserver variability. Yet we all use it in our decision making.
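The unit comparison earlier in this comment, and what a given coefficient of variation means in absolute terms, can be sketched in a few lines. The helper names `pg_to_ng` and `sd_from_cv` are illustrative, and pairing an 8% coefficient of variation with a level of 2 ng/ml is a hypothetical example, not a published assay specification.

```python
PG_PER_NG = 1000  # 1 ng = 1000 pg


def pg_to_ng(value_pg_per_ml):
    """Convert a concentration from pg/ml to ng/ml."""
    return value_pg_per_ml / PG_PER_NG


def sd_from_cv(mean_level, cv_percent):
    """Standard deviation implied by a coefficient of variation (SD / mean)."""
    return mean_level * cv_percent / 100


# Estradiol of 2000 pg/ml expressed in the units used for progesterone.
estradiol_ng_ml = pg_to_ng(2000)
progesterone_ng_ml = 2.0
print(estradiol_ng_ml, progesterone_ng_ml)

# Hypothetical pairing: absolute spread implied by an 8% CV at 2 ng/ml.
print(sd_from_cv(progesterone_ng_ml, 8))
```

Expressed in the same units, the two example values are both 2 ng/ml by mass concentration, which is the point being made about the reporting scales.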
Micah J Hill
about 4 years ago
All of that to say we have work to do on developing more accurate assays that are clinically deployable and on developing a gold standard threshold. It ain't Mark Twain, but don't "throw the baby out with the bathwater" in the meantime!