Research Part II: Hawthorne, John Henry, and Pygmalion

OCS Field Guide: A PT Podcast

Pass the OCS exam by studying smarter, not harder. This podcast is for physical therapists looking to become board-certified specialists in orthopedics. Use code FIELDGUIDE for $101 off a MedBridge subscription.

DISCLAIMER: The information in this podcast is shared for educational purposes only and should not be regarded as medical advice. Always consult with an appropriate licensed provider if you have medical questions or concerns.

September 23, 2020 • David Smelser and Austin Kercheville • Season 1 • Episode 3

Dr. David Smelser discusses psychological pitfalls of human research, including the placebo, nocebo, Hawthorne, John Henry, and Pygmalion (also called "Rosenthal") effects and some ways to mitigate them.

Support the show

Support the podcast and get study guides and bonus episodes at Patreon.com/physiofieldguide.

Find more resources and subscribe to practice questions at PhysioFieldGuide.com.

Welcome back for part two in our series on research-related information that you need to know for the OCS exam. In the last episode we introduced a somewhat mind-numbing list of statistics and figures you need to memorize. This episode should be much more exciting.

Today we’re going to talk about some features and pitfalls of human research that could easily show up on the exam. Humans are funny: our determination to make sense of the world around us and respond accordingly makes us very tricky research subjects. The beliefs of human subjects can dramatically affect the outcome of a study, which makes research on human subjects very difficult to interpret. Because a board-certified specialist needs to be able to interpret human research accurately, you will likely find some questions on your exam that test your knowledge of these pitfalls.

So here are the topics we’re going to cover in this podcast: the placebo effect, the nocebo effect, the Hawthorne effect, the John Henry effect, and the Pygmalion effect.

First, the placebo effect. The placebo effect is probably the best-known effect in human research. We know that the term “placebo” refers to a treatment that looks like a medical treatment but is known to have no medical or therapeutic value. A pharmaceutical example would be a fake pill that looks like a real pill. The term “placebo effect” applies to a situation where an individual believes this inert treatment will work, and that belief has therapeutic value, so the individual improves.

Because of the placebo effect, high-quality research should use a placebo group whenever possible to check whether the treatment group improves more than the placebo group. If the treatment group improves more than the placebo group, we can conclude that the treatment may have actual therapeutic value. If the treatment group improves only as much as the placebo group, we might be concerned that the subjects’ beliefs were the real source of improvement and not any therapeutic feature of the treatment.
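To make that comparison concrete, here is a minimal Python sketch. The pain-score improvements are made-up numbers for illustration only, not data from any study, and the function name `placebo_adjusted_effect` is hypothetical. It simply subtracts the mean improvement in the placebo group from the mean improvement in the treatment group.

```python
from statistics import mean

def placebo_adjusted_effect(treatment_changes, placebo_changes):
    """Mean improvement in the treatment group beyond the placebo group."""
    return mean(treatment_changes) - mean(placebo_changes)

# Hypothetical improvements in pain (points on a 0-10 scale); not real data.
treatment = [4, 3, 5, 4]
placebo = [3, 2, 4, 3]

effect = placebo_adjusted_effect(treatment, placebo)
print(effect)
```

A difference near zero would suggest that belief, rather than the treatment itself, could account for the improvement. A real trial would also need a statistical test to decide whether the difference is larger than chance alone would produce.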

There are two very important things to note about placebos. First, effective placebo controls need to look like the real treatment. In pharmaceuticals, this is easy: make the placebo pill look like the actual pill. In physical medicine, this is very hard. Consider studies that attempt to determine the effectiveness of arthroscopic knee surgeries. If the study designers have a surgical group and a placebo group, but the placebo they use is sham ultrasound, then the subjects in the placebo group are probably going to have a pretty good idea that their treatment has less therapeutic value than the surgery. To have a true placebo control, the placebo group needs to think they are getting an intervention equivalent to the one the intervention group receives. So the best placebo-controlled study for examining the effectiveness of arthroscopic knee surgery would use a placebo group that goes through all the motions (the prep, the anesthesia, even the incisions and maybe the scoping) so that the patients’ beliefs about the effectiveness of the treatment will be the same as those of the true surgery group.

Obviously, this is not done often. But when it is done, the results are fascinating. According to a 2019 systematic review by Abram et al. in the British Journal of Sports Medicine, two studies have compared arthroscopic partial meniscectomy to placebo or sham surgery. In both studies, there was no significant difference between the true surgery groups and the placebo or sham groups at 6 to 12 months. We will discuss this more when we talk about the knee, but here’s an exam pearl for you to mull over now: if the OCS exam gives you a case scenario where a patient has knee pain, and one of the options is to refer the patient for an arthroscopic partial meniscectomy, you probably don’t want to pick that option.

So the first important thing to note about placebos is that the placebo needs to look like the real treatment. The second important thing to note is that placebos have the greatest effect on outcomes that are mediated by the brain. For example, if you are studying individuals with ACL ruptures, and you give the subjects a placebo surgery or placebo pill or placebo physical intervention, that placebo will not repair the ACL or decrease laxity on a Lachman test. However, outcomes mediated by the brain (pain, fear, perceived functional ability, and sometimes even strength and motor unit recruitment) may all improve in response to a placebo. So when designing a study, it is most important to use a placebo control when your primary outcomes are mediated by the brain. (And that covers most of what we’re interested in as physical therapists.)

The nocebo effect is the negative counterpart to the placebo effect. Just as positive beliefs about an inert treatment may result in therapeutic benefit to the subjects, negative beliefs may result in worse outcomes and negative reactions from subjects. In pharmaceutical trials that use sugar pills as placebos, about 20% of subjects in the placebo group report negative side effects like dizziness, drowsiness, headaches, and nausea. When subjects are specifically asked about negative side effects, more than 20% report them.

So the nocebo effect is going to be exaggerated if it is somehow suggested or implied to subjects that they should expect negative side effects. It is also going to be exaggerated if the subjects have had negative experiences with a given treatment in the past. If a blue pill made an individual vomit in the past, and the placebo pill is blue, the subject is more likely to report nausea or vomiting.

For this reason, it is often useful in our research to use exclusion criteria to keep out individuals who may have had either negative or positive experiences with a treatment in the past. For example, if researchers want to examine the effect of spinal manipulation, it would be best to exclude individuals who frequently seek out manipulation to minimize the placebo effect and to exclude individuals who have had negative experiences with manipulation to minimize the nocebo effect. If no such exclusion criteria are used, randomization will still help minimize these effects, because individuals with strong prior beliefs should end up spread roughly evenly across the groups.
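As a rough illustration of randomization, here is a minimal Python sketch that shuffles a list of subject IDs and deals them into two arms of equal size. The function name `randomize`, the arm labels, and the seed are all hypothetical choices for this example. Real trials typically use concealed allocation and often block or stratified schemes, so treat this as a sketch of the idea and not as a trial-ready procedure.

```python
import random

def randomize(subject_ids, arms=("manipulation", "control"), seed=None):
    """Shuffle subjects, then deal them into arms in turn so group sizes stay balanced."""
    rng = random.Random(seed)  # a seeded generator makes the allocation reproducible
    shuffled = list(subject_ids)
    rng.shuffle(shuffled)
    return {sid: arms[i % len(arms)] for i, sid in enumerate(shuffled)}

# Twenty hypothetical subjects, numbered 1 through 20.
assignments = randomize(range(1, 21), seed=2020)
for subject_id in sorted(assignments):
    print(subject_id, assignments[subject_id])
```

Because the order is random, a subject who loves or dreads manipulation is as likely to land in one arm as the other, which is what keeps prior beliefs from piling up on one side.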

But you’re not here to design research studies. You’re here to pass the OCS exam. Here’s the thing: the OCS could very well present a research design and ask you either to identify one of these effects that was not controlled for, or provide a modification that would have controlled for one of these effects. So you need to know what these effects are and how to control for them.

Let’s move on to the Hawthorne effect. The Hawthorne effect is named after a series of quasi-experiments performed in the 1920s and 30s at Western Electric’s Hawthorne Works, a telephone equipment plant in Cicero, Illinois, just outside Chicago. The experiments were performed to see what effect environmental changes might have on worker productivity. The most famous Hawthorne experiments examined whether changing lighting conditions would increase or decrease worker productivity. Researchers found that whether lighting was increased or decreased, worker productivity improved by about the same amount. After the research ended, worker productivity went back to baseline levels. The conclusion that is often drawn is that subjects who know they are being observed as part of a research study tend to work harder than they would otherwise. This is called the Hawthorne effect.

More recent studies have challenged the Hawthorne effect, but it is still generally accepted as a legitimate threat to research validity. In our research, we are most likely to find the Hawthorne effect in areas like adherence to home exercise programs. If you, as a clinician treating patients, are trying to apply some great new research to your own clinical practice, and you aren’t getting results quite as good as the researchers did, you might suspect that the research subjects worked harder than your patients because of the Hawthorne effect, that is, because they knew they were being observed in a research study.

A related concept is the observer effect. While the Hawthorne effect says that participating in a research study changes how hard subjects work, the observer effect more broadly says that people work harder when they are being watched, or, in healthcare, that they report more improvement in response to more attention. In our research, a well-designed study would attempt to control for this by making sure all treatment groups get about the same amount of attention from the clinician. Imagine, for example, a research design focused on knee OA that compares three groups: exercise alone, manual therapy alone, and both exercise and manual therapy. If the exercise group receives 30 minutes of treatment, the manual therapy group receives 30 minutes of treatment, and the “both therapies” group receives 30 minutes of each for 60 minutes total, you can see how this research design is vulnerable to the observer effect. The group that gets more attention is more likely to improve than the groups that get less attention.
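The attention imbalance in that knee OA example can be checked with simple arithmetic. Here is a minimal Python sketch that totals the clinician contact time for each arm and flags a mismatch. The minutes come from the example above; the function names and the `tolerance` parameter are hypothetical.

```python
# Minutes of clinician contact per session for each arm of the knee OA example.
arms = {
    "exercise": {"exercise": 30},
    "manual therapy": {"manual therapy": 30},
    "both": {"exercise": 30, "manual therapy": 30},
}

def contact_minutes(arms):
    """Total clinician contact time for each arm."""
    return {name: sum(parts.values()) for name, parts in arms.items()}

def attention_is_matched(arms, tolerance=0):
    """True if no arm gets more than `tolerance` extra minutes of attention."""
    totals = contact_minutes(arms).values()
    return max(totals) - min(totals) <= tolerance

totals = contact_minutes(arms)
matched = attention_is_matched(arms)
print(totals, matched)
```

One possible fix, as an assumption about how a study might be redesigned, would be to give the combined group 15 minutes of each therapy so that every arm totals 30 minutes.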

Next, we will talk about the John Henry effect. This effect is named after the 19th-century folk legend John Henry, a railroad worker who, when he heard about a steam drill that was supposed to be able to do his job faster than he could, worked so hard to beat the steam drill that he died in the process. In research, the John Henry effect describes a scenario where a control group perceives that they are disadvantaged compared to the experimental group, and so they work harder than they would otherwise to overcome that disadvantage. In our literature, this might mean that subjects in the control group seek out other treatments in addition to the control therapy, or perform more self-treatment. The perception among some subjects that they are not receiving the best possible treatment drives them to improve their situation in other ways, which threatens the validity of the research.

The best way to prevent the John Henry effect is to blind the subjects so that they do not know if they are in the control group or the experimental group. If subjects don’t know they are in a control group, they are less likely to feel disadvantaged and modify their situation, thus invalidating the study.

Finally, let’s talk about the Pygmalion effect. Pygmalion was a sculptor from Greek mythology whose story is told in Ovid’s Metamorphoses. According to the legend, Pygmalion sculpted a female statue so beautiful that he fell in love with it. He begged the gods to give him a wife just like his statue, and his wish was granted: his statue came alive and was turned into a real woman. The Pygmalion effect describes how the expectations of those in authority shape the outcomes of their subjects. It is sometimes also called the Rosenthal effect, named after a researcher who performed an experiment with elementary school children and teachers. Rosenthal and his partner Jacobson gave students an IQ test at the beginning of the school year. They then randomly selected 20% of the students and identified these random students as “intellectual bloomers” to their teachers. These students were not actually the highest scoring, but their teachers believed they were. After eight months, another IQ test was performed, and the randomly selected students who were believed to be “intellectual bloomers” by their teachers improved to a greater extent than the other 80% of students. This suggests that the teachers’ expectations made them treat the supposed “blooming” students differently than the rest of the students, which resulted in improved outcomes.

Like many quasi-experiments from the 20th century, these results have been called into question recently. Still, it is plausible that researchers or authority figures in a study have expectations that affect how they treat subjects or how subjects perform. For example, surgeons who organize a study comparing surgery to physical therapy might be biased into believing the surgical group should improve more than the physical therapy group. This might be subtly communicated to the patients either in explanations about what to expect from physical therapy, or in the tone of voice or body language used in the assessments. Likewise, physical therapists who perform a study comparing two treatments (let’s say concentric vs. eccentric exercise) and who believe in one treatment, like eccentric exercise, more than the other might push their preferred treatment group harder than the concentric exercise group.

So again, how do we try to prevent the Pygmalion or Rosenthal effect? We can try to mitigate the Pygmalion effect by blinding the clinicians administering the treatment and the assessments. If the subjects are treated and assessed by clinicians who don’t know which treatment group the subjects are in, those clinicians are less likely to influence the outcome of the study. Note that this is basically the reverse of how we control for the John Henry effect. We mitigate the John Henry effect by blinding participants; we mitigate the Pygmalion or Rosenthal effect by blinding the clinicians. This is why so-called “double blind” studies are so important.

So that covers the placebo, nocebo, Hawthorne, John Henry, and Pygmalion effects. Again, I think it is likely that you will see these terms pop up somewhere among the research questions, or at least show up as distractors. Know what they are and how to minimize their effects. And I think the best way to remember the definition of each effect is to remember the story that goes along with each name.

Next time we plan to leave the research behind and get into some more clinically relevant information. We’re also going to start sending practice questions soon, so if you haven’t signed up for our practice questions, click the link in the podcast description and sign up.