There's no difference—are you sure?

Inklings

Volume 108, Issue 2, Pages 231–232

Authors:

David R. Meldrum, M.D., H. Irene Su, M.D., M.S.C.E.

Abstract:

In September 2014, we discussed the importance of an a priori sample-size calculation and other considerations in planning and conducting a study to be sure that it is sufficiently powered to detect the smallest clinically relevant effect size. A reader can then interpret an adequately powered, but negative, study to mean that the difference between treatment groups is not larger than that predetermined effect size and can use the information to support therapeutic decisions. Sample-size calculations are now required by many Institutional Review Boards, funding agencies, and medical journals.

Fertility and Sterility

Editorial Office, American Society for Reproductive Medicine

Fertility and Sterility® is an international journal for obstetricians, gynecologists, reproductive endocrinologists, urologists, basic scientists and others who treat and investigate problems of infertility and human reproductive disorders. The journal publishes juried original scientific articles in clinical and laboratory research relevant to reproductive endocrinology, urology, andrology, physiology, immunology, genetics, contraception, and menopause. Fertility and Sterility® encourages and supports meaningful basic and clinical research, and facilitates and promotes excellence in professional education, in the field of reproductive medicine.

2 Comments

Micah J Hill over 3 years ago

Thank you, Drs. Meldrum and Su, for this discussion of the importance of sample size and power calculations in the interpretation of clinical studies. The emphasis on reporting effect sizes with corresponding 95% CIs for absolute risks and relative risks (RR) or odds ratios (OR) was great to read. Most of the discussion in your commentary focused on real-life examples of RR and OR. We wanted to add to the discussion the importance of reporting primary outcomes with the RR/OR plus the absolute risk difference and the number needed to treat (NNT) or harm (NNH). Both RR and OR give only a relative comparison of risk between two groups, which may or may not be clinically relevant depending on how rare the outcome is, regardless of statistical significance. Absolute risk differences and NNT/NNH put these relative comparisons into a clinically relevant framework.


Take, for example, two studies, both with an RR of 2.0 (95% CI 1.5-2.5). If only the RR and its associated 95% CI are considered, we would assume both studies have found an equally important effect size. In the first study, the risk of a bad outcome is 1/1000 without an exposure and increases to 2/1000 with a known exposure. In this case, the absolute risk difference is only 1/1000, and the NNH with that exposure is 1000. In the second study, subjects without a treatment have a live birth rate of 25/100, compared to 50/100 with an experimental intervention. The RR is still 2.0. However, the absolute risk difference is 25% and the NNT is 4. While the RR tells us the relative effect size is the same in these two studies, the absolute risk difference and NNT/NNH show that the clinically relevant absolute effect size is vastly different.
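The arithmetic behind the two hypothetical studies can be checked with a short Python sketch. The function name `effect_sizes` and its parameters are illustrative only; exact fractions are used to avoid rounding, and the sketch assumes the two risks differ (the number needed to treat or harm is undefined when the risk difference is zero).

```python
from fractions import Fraction


def effect_sizes(events_control, n_control, events_exposed, n_exposed):
    """Return (relative risk, absolute risk difference, NNT or NNH).

    Hypothetical helper for illustration; assumes the two risks differ
    and that the control risk is not zero.
    """
    risk_control = Fraction(events_control, n_control)
    risk_exposed = Fraction(events_exposed, n_exposed)
    relative_risk = risk_exposed / risk_control
    risk_difference = risk_exposed - risk_control
    # NNT/NNH is the reciprocal of the absolute risk difference.
    number_needed = 1 / abs(risk_difference)
    return relative_risk, risk_difference, number_needed


# First study: rare bad outcome, 1/1000 unexposed vs 2/1000 exposed.
rare = effect_sizes(1, 1000, 2, 1000)
# Second study: live birth, 25/100 untreated vs 50/100 treated.
common = effect_sizes(25, 100, 50, 100)

for label, (rr, rd, nn) in (("rare outcome", rare), ("common outcome", common)):
    print(f"{label}: RR={float(rr):.1f}, risk difference={float(rd):.3f}, NNT/NNH={float(nn):.0f}")
```

Both calls give a relative risk of 2, while the risk differences (1/1000 versus 1/4) and the numbers needed (1000 versus 4) differ by a factor of 250.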

While these hypothetical studies use extreme examples of absolute risk differences to illustrate a point, such situations exist in the reproductive literature. IVF is associated with a significant, roughly 5-fold increase in the odds of certain imprinting disorders, but the NNH measures in the thousands. Many of the risks of hormone therapy for postmenopausal women had narrow 95% CIs for the RR estimate, but the NNH was large: numerous patients had to be treated to see a single negative outcome.

Unfortunately, in the example of our two hypothetical studies, many authors will report and emphasize their results in terms of RR 2.0 (95% CI 1.5-2.5). Both studies would likely discuss how such a finding is highly significant. It is often left to the reader to delve into the tables to find the absolute risk difference and then calculate the NNT/NNH. While the experienced reader may do this as a matter of habit and training, leaving out such reporting may misrepresent the effect size to other readers. It should be incumbent on the authors to fully report the primary outcome with not only the RR/OR, but also the absolute risk difference and NNT/NNH, with a 95% CI for each. This should be the primary way results are reported in both the abstract and the results section, and it should frame the context of the discussion. We hope that authors will report their primary outcomes and frame the discussion in such context, that reviewers will hold authors accountable to such reporting, and that readers will interpret the clinical relevance of the results as such.
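As one possible illustration of reporting the absolute risk difference and NNT with 95% CIs, the sketch below uses a simple Wald (normal-approximation) interval for the risk difference and inverts its limits to bound the NNT. This is an assumption chosen for brevity, not a method named in the commentary; the function names are hypothetical, and other interval methods may be preferable for small samples or rare outcomes. The counts are those of the second hypothetical study (25/100 versus 50/100).

```python
import math


def risk_difference_ci(events_control, n_control, events_treated, n_treated, z=1.96):
    """Wald 95% CI for the absolute risk difference (treated minus control)."""
    p_control = events_control / n_control
    p_treated = events_treated / n_treated
    rd = p_treated - p_control
    se = math.sqrt(
        p_control * (1 - p_control) / n_control
        + p_treated * (1 - p_treated) / n_treated
    )
    return rd, rd - z * se, rd + z * se


def nnt_with_ci(rd, lower, upper):
    """Invert the risk difference and its CI limits to get the NNT/NNH.

    Returns None when the CI includes zero, because the NNT interval is
    then not a single finite range and needs separate handling.
    """
    if lower <= 0 <= upper:
        return None
    bounds = sorted((1 / abs(lower), 1 / abs(upper)))
    return 1 / abs(rd), bounds[0], bounds[1]


rd, rd_low, rd_high = risk_difference_ci(25, 100, 50, 100)
nnt = nnt_with_ci(rd, rd_low, rd_high)

print(f"risk difference = {rd:.3f} (95% CI {rd_low:.3f} to {rd_high:.3f})")
if nnt is not None:
    print(f"NNT = {nnt[0]:.1f} (95% CI {nnt[1]:.1f} to {nnt[2]:.1f})")
```

Inverting the limits this way only makes sense when the whole interval lies on one side of zero, which is why the sketch declines to return an NNT interval otherwise.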


Thank you for such a great discussion of effect size and sample size. Your Inklings highlights the importance of power analysis and the risks of type II error. We hope such educational articles as yours continue to be published in the journal!

Micah J Hill, DO
Combined Federal REI Fellowship  

George Patounakis, MD, PhD
Reproductive Medicine Associates of Florida

Erma Zimmerman Drobnis over 3 years ago

Thank you so much for this article! I feel like I lecture on this until I'm blue in the face, even more so as a reviewer of manuscripts. Adequate power and interpretation of effect size are so important to our communication and interpretation of results. Hear, hear!