“Medical Breakthroughs” — Evaluating Media Reports of Medical Progress

Ellen Mahoney, M.D., FACS

Mention the media treatment of medical progress and you are likely to get groans from physicians, the public, and the media. Hype, shock, sound bites, and conflicting reports have overwhelmed physicians and the public. Journalists, whether print or broadcast, are accused of not covering the “real issues,” when the truth is that many medical and scientific journalists are good people trying their best to communicate information that they feel the public needs to know.

The general view is that the news media are primarily a public service provided for the public's welfare. While this is the goal of many journalists, it is not necessarily the goal of their industry. The first step in evaluating media reports of medical progress is to realize that someone is trying to sell you something, be it a magazine, television show, or radio report. Fear, conflict, and shock, rather than complex information, sell in our entertainment-oriented media outlets, and the more controversial or alarming the subject and the more likely it is to apply to you, the more likely you are to watch or to buy. Education is a secondary benefit. What the public also may not recognize is that the medical literature is not immune from hype, as researchers vie for publicity that may further their funding or status.

Resist the short-attention-span version of medical progress reports and strive for the most detailed explanation you can find. Whether you are a member of the public or a journalist tackling a topic, begin with skepticism but learn how to identify the valuable studies. Resist the temptation to consider a reported result valid and useful just because it is in print or worthy of air time, or even because it emanates from a distinguished institution. If you are a scientist, there are simple things you may be able to do that will help put your results in perspective.

Exploratory Studies

The first question to explore is the type of study reported. Initially this will probably be a descriptive, exploratory study. The goal of such a study is not to find causes and explanations, but to test whether effort should be expended to study the question further. The results of these studies are not meant to be used in clinical decision-making, but since they are creative, simple, and numerous, they may make good press. An example would be the observation that women who have their first full-term pregnancy in middle age may have more breast cancer. The public, hearing this, assumed that the observation translated into a clinical recommendation or that they had increased their risk by delaying childbirth. In extreme cases, women raced to have children by 29, feeling that they were then safer than if they had waited until 31, an unwarranted response. We do not know what causes breast cancer, and we do not yet know the explanation for the association, but the observation points us in the direction of considering the timing of complex hormonal interactions, and it tells us that further study on this topic may be fruitful. Results like this can be used as an argument for funding more detailed and expensive follow-up work that explores associations in order to determine causation. One of the difficulties is that if the improved study methods show that the descriptive study was on the wrong track, the published results of the prior study are still available and prone to citation, causing further confusion. Also, since descriptive studies are fairly crude, they may produce contradictory results, leading to further burnout.

Explanatory Studies

The next level of study design aims to confirm, and hopefully to actually explain, the observed phenomenon. Explanatory studies can still be observational in nature but with a slightly tighter design, such as a case-control study, or they can be experimental, such as the gold standard (but still imperfect) prospective randomized clinical trial. Both of these paradigms are improvements, but they are not guarantees that the results are meaningful. Even scientists can have trouble remembering that statistical significance doesn't always equal clinical significance.

In addition to giving more credibility to more sophisticated study designs, it is important to understand the data reported, and to be skeptical of results reported as a “percentage increase or decrease” or as “relative risk.” Such numbers can be valuable when making comparisons within the study itself, but in order to apply the results responsibly to clinical decision-making, it is crucial to know the number of events and the time period over which they occurred. For instance, in 1999 the NSABP reported results of its first prevention trial for tamoxifen. The results were widely touted as showing a 49% reduction in breast cancer when women at high risk for the disease took tamoxifen for 5 years. The actual results were 4.3 cases of breast cancer per 100 women over 5 years with placebo, and 2.2 cases per 100 women over 5 years with tamoxifen. The comparison between 4.3 and 2.2 was the basis of the 49% reduction cited. While these data are indeed accurate, many women and their clinicians who were trying to decide whether to take tamoxifen felt misled. Not as widely reported were the 0.5 cases of endometrial carcinoma per 100 women over 5 years on placebo and 1.3 cases on tamoxifen. These are still very small numbers, but if the data were reported consistently, they would show a 160% increase in endometrial cancer (1.3 is 260% of 0.5), which offsets some of the benefit of the tamoxifen. None of this should diminish the progress made by the NSABP in the past 20 years, but it can be difficult to extract these more helpful numbers from the data presented. For purposes of reporting and evaluating data, though, one must insist on having those numbers.
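The arithmetic behind these figures can be checked directly. The following is a minimal sketch in Python, using only the event rates quoted above; the function names are illustrative assumptions, not part of any trial report. Relative change is taken here as (treated rate minus control rate) divided by control rate.

```python
def relative_change_percent(control_rate, treated_rate):
    """Percentage change of the treated rate relative to the control rate."""
    return (treated_rate - control_rate) / control_rate * 100.0


def absolute_difference(control_rate, treated_rate):
    """Change in events per 100 women over 5 years (treated minus control)."""
    return treated_rate - control_rate


# Events per 100 women over 5 years, as quoted in the text above.
breast_cancer = {"placebo": 4.3, "tamoxifen": 2.2}
endometrial_cancer = {"placebo": 0.5, "tamoxifen": 1.3}

for name, rates in (("breast cancer", breast_cancer),
                    ("endometrial cancer", endometrial_cancer)):
    rel = relative_change_percent(rates["placebo"], rates["tamoxifen"])
    diff = absolute_difference(rates["placebo"], rates["tamoxifen"])
    print(f"{name}: {rel:+.0f}% relative change, "
          f"{diff:+.1f} cases per 100 women over 5 years")
```

Seen side by side, the widely quoted 49% reduction corresponds to about 2 fewer breast cancers per 100 women over 5 years, while the large relative increase in endometrial cancer corresponds to less than 1 additional case per 100 women. Both the relative and the absolute figures are needed to weigh one against the other.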

Relative Risk

Another commonly used method of reporting data is relative risk, which must also be backed up by information about the actual numbers involved. For instance, when two groups are studied with respect to increased risk from a particular attribute, a relative risk (RR) of 1.0 is assigned if the groups are the same. When the groups differ, the number is higher or lower than 1.0. In order to emphasize the imprecision of this commonly used statistic, some scientists state that the groups should differ by a factor of 3 in order for the result to be meaningful. Even this does not tell the whole story, though; to get that, one needs to know what the baseline risk is in absolute terms. For instance, if an event usually happens once in a million times, and a comparison group shows 3 events in a million after a particular intervention, that is an increase of 200%, or an RR of 3.0. It sounds strong until you realize that the event still happens only 3 times in a million. If, however, an event usually happens 100 times in a thousand, and a particular risk factor confers an RR of 3.0, it will happen 300 times in a thousand. The excess 200 cases are much more significant, even though the percentage and relative risk are the same. Relative risk in breast cancer is frequently applied to risk factors for ever getting the disease, and unless you know the baseline risk being used in the comparison, the results are wide open to misinterpretation.
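The two scenarios above can be put on a common scale with a short calculation. This is a sketch only; the function name and the choice to express excess cases per million people are assumptions made for illustration.

```python
def risk_summary(baseline_events, per_people, relative_risk):
    """Absolute risks and excess cases implied by a relative risk.

    baseline_events / per_people is the baseline (unexposed) risk.
    """
    baseline_risk = baseline_events / per_people
    exposed_risk = baseline_risk * relative_risk
    return {
        "baseline_risk": baseline_risk,
        "exposed_risk": exposed_risk,
        # Extra cases among one million exposed people.
        "excess_per_million": (exposed_risk - baseline_risk) * 1_000_000,
    }


# Same relative risk (3.0), very different baseline risks.
rare = risk_summary(baseline_events=1, per_people=1_000_000, relative_risk=3.0)
common = risk_summary(baseline_events=100, per_people=1_000, relative_risk=3.0)

print(f"rare event:   {rare['excess_per_million']:.0f} extra cases per million")
print(f"common event: {common['excess_per_million']:.0f} extra cases per million")
```

With an identical RR of 3.0, the rare event adds about 2 cases per million people, while the common event adds about 200,000 per million (the 200 extra cases per thousand described above). The relative risk alone cannot distinguish the two.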

In summary, there are several questions to ask when deciding on the relevance of a news report of a so-called “medical breakthrough,” but the first principle is to approach the report with skepticism. You are being sold something, you are the final judge, and it is up to the reporting agency to convince you that the results reported are both valid and applicable to you.

Once you have a proper attitude, try to determine the study design from the report. Descriptive studies of possible associations should be viewed with mild interest at most.

Next, be sure the facts are straight and that the result applies to the group you are interested in. In recent years there has been a series of epidemiological studies of a possible association between tricyclic antidepressants and increased breast cancer risk. The better the study design and the larger the number of patients involved, the less support there has been for that association, and it is now considered unlikely. Meanwhile, many breast cancer patients are resisting the use of newer antidepressants such as Celexa, Effexor, or Paxil, not recognizing that any association between antidepressants and breast cancer is weak, and that the drugs prescribed for them are in an entirely different class from the tricyclics.

Insist on some reference to actual numbers of events per group of patients during a specified time and/or a specified baseline risk. Until you have those numbers, you cannot judge the magnitude of the benefit or any offsetting downside. All of medicine is a matter of balancing risks and benefits, and this information is not contained in results presented as percentages or relative risks. Also realize that statistical significance in itself is not necessarily proof that a hypothesis is correct or that it is relevant to the way the human body works. It does become more convincing if similar results are obtained in repeated studies on the same question.

Ask yourself if the result makes sense. For instance, the recent results showing that breast self-examination (BSE) is not sufficient for screening a population of healthy women for breast cancer do not mean that you should stop doing BSE. It makes sense to know the texture of your breasts; it does not make sense to make BSE your sole screening strategy.

References for further reading: