Introduction
“Money can’t buy happiness.” “You can’t judge a book by its cover.” “An apple a day keeps the doctor away.”
You’ve probably heard these sayings many times, but do they actually hold up when we look at the data? In this article series, I want to take popular myths/sayings and put them to the test using real-world data.
We might confirm some unexpected truths, or debunk some popular beliefs. Hopefully, in either case we will gain new insights into the world around us.
The hypothesis
“An apple a day keeps the doctor away”: is there any real evidence to support this?
If the myth is true, we should expect a negative correlation between apple consumption per capita and doctor visits per capita. So, the more apples a country consumes, the fewer doctor visits people should need.
Let’s look into the data and see what the numbers really say.
Testing the relationship between apple consumption and doctor visits
Let’s start with a simple correlation check between apple consumption per capita and doctor visits per capita.
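As a rough illustration of what such a correlation check computes, the sketch below calculates a Pearson correlation coefficient with the Python standard library only. The `pearson_r` helper and the five data points are made up for illustration; they are not the article's data, and the notebook may well use a library function instead.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of cross-deviations and sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical values, NOT the article's data:
# apple consumption (kg per capita) and doctor visits per capita.
apples_kg = [5.0, 12.0, 20.0, 8.0, 15.0]
visits = [6.0, 4.0, 7.0, 5.0, 6.5]

r = pearson_r(apples_kg, visits)
print(f"Pearson r = {r:.3f}")
```

A value close to -1 would support the myth, while a value close to 0 would indicate no linear relationship between the two variables.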
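Data sources
The data comes from two sources, both cited in full under Data Sources at the end of the article: apple consumption per capita from the FAO (processed by Our World in Data) and doctor consultations from the OECD.
Since data availability varies by year, 2017 was chosen because it offered the most complete coverage in terms of number of countries. However, the results are consistent across other years.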
![](https://towardsdatascience.com/wp-content/uploads/2025/02/1_5a7AIjhZ4MU46cT80075Hg-1.png)
![](https://towardsdatascience.com/wp-content/uploads/2025/02/1_FbJKhtRcdfbiHXBqrxpjnA-1.png)
Visualizing the relationship
To visualize whether higher apple consumption is associated with fewer doctor visits, we start by looking at a scatter plot with a regression line.
![](https://towardsdatascience.com/wp-content/uploads/2025/02/1_NYaEvgQlcCq66aCQp0vAoQ-1-1.png)
The regression plot shows a very slight negative correlation, meaning that in countries where people eat more apples, there is a barely noticeable tendency toward fewer doctor visits.
Unfortunately, the trend is so weak that it cannot be considered meaningful.
OLS regression
To test this relationship statistically, we run a linear regression (OLS), where doctor visits per capita is the dependent variable and apple consumption per capita is the independent variable.
![](https://towardsdatascience.com/wp-content/uploads/2025/02/1_Ul4KJI8sFIBoy-_bKFCT5g-1-1024x430.png)
The results confirm what the scatterplot suggested:
- The coefficient for apple consumption is -0.0107, meaning that even if there is an effect, it’s very small.
- The p-value is 0.860 (86%), far above the standard significance threshold of 5%.
- The R² value is almost zero, meaning apple consumption explains virtually none of the variation in doctor visits.
This doesn’t strictly mean that there is no relationship, but rather that we cannot demonstrate one with the available data. It’s possible that any real effect is too small to detect, that other factors we didn’t include play a larger role, or that the data simply doesn’t reflect the relationship well.
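The quantities discussed above (coefficient and R²) can be sketched for a one-predictor regression as follows. This is a minimal standard-library sketch under stated assumptions: `simple_ols` is a hypothetical helper, and the data points are made up rather than taken from the article.

```python
def simple_ols(xs, ys):
    """Fit y = intercept + slope * x by ordinary least squares.

    Returns (intercept, slope, r_squared).
    """
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R^2 = 1 - (residual sum of squares / total sum of squares).
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    r_squared = 1.0 - ss_res / ss_tot
    return intercept, slope, r_squared


# Hypothetical values, NOT the article's data.
apples_kg = [5.0, 12.0, 20.0, 8.0, 15.0]
visits = [6.0, 4.0, 7.0, 5.0, 6.5]

intercept, slope, r_squared = simple_ols(apples_kg, visits)
print(f"slope = {slope:.4f}, intercept = {intercept:.4f}, R^2 = {r_squared:.4f}")
```

The p-value additionally requires the coefficient's standard error and a t-distribution, which statistical libraries such as statsmodels or SciPy report directly.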
Controlling for confounders
Are we done? Not quite. So far, we’ve only checked for a direct relationship between apple consumption and doctor visits.
As already mentioned, many other factors could be influencing both variables, potentially hiding a true relationship or creating an artificial one.
Consider this causal graph:
![](https://towardsdatascience.com/wp-content/uploads/2025/02/1_ygZSTyix3dlyaPLQvhTV2g-1-1024x627.png)
We’re assuming that apple consumption directly affects doctor visits. However, other hidden factors might be at play. If we don’t account for them, we risk failing to detect a real relationship if one exists.
A well-known example of confounding variables at work comes from a study by Messerli (2012), which found an interesting correlation between chocolate consumption per capita and the number of Nobel laureates.
So, would eating a lot of chocolate help us win a Nobel Prize? Probably not. The likely explanation is that GDP per capita was a confounder: richer countries tend to have both higher chocolate consumption and more Nobel Prize winners. The observed relationship wasn’t causal but rather produced by a hidden (confounding) factor.
The same thing could be happening in our case. There might be confounding variables that influence both apple consumption and doctor visits, making it difficult to see a real relationship if one exists.
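The chocolate example can be mimicked with a small simulation. In the sketch below, all numbers are synthetic and the variable names (`wealth`, `chocolate`, `laureates`) are purely illustrative: a shared "wealth" variable drives both outcomes, which have no direct link to each other. The raw correlation is strong, but it largely disappears once the confounder's linear effect is removed from both variables.

```python
import random
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def residuals(ys, zs):
    """Residuals of ys after a least-squares fit on zs (with intercept)."""
    mean_y = sum(ys) / len(ys)
    mean_z = sum(zs) / len(zs)
    slope = sum((z - mean_z) * (y - mean_y) for z, y in zip(zs, ys)) / sum(
        (z - mean_z) ** 2 for z in zs
    )
    return [(y - mean_y) - slope * (z - mean_z) for y, z in zip(ys, zs)]


rng = random.Random(42)  # fixed seed so the run is reproducible
n = 5000

# Synthetic confounder and two variables that depend ONLY on it plus noise.
wealth = [rng.gauss(0, 1) for _ in range(n)]
chocolate = [w + rng.gauss(0, 0.5) for w in wealth]
laureates = [w + rng.gauss(0, 0.5) for w in wealth]

raw_r = pearson_r(chocolate, laureates)
# Partial correlation: correlate what is left after regressing out wealth.
adjusted_r = pearson_r(residuals(chocolate, wealth), residuals(laureates, wealth))

print(f"raw correlation:      {raw_r:.3f}")
print(f"after adjusting for wealth: {adjusted_r:.3f}")
```

Confounding can also work in the opposite direction and mask a real relationship, which is the risk described above for apples and doctor visits.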
Two key confounders to consider are GDP per capita and median age. Wealthier countries tend to have better healthcare systems and different dietary patterns, and older populations tend to visit doctors more often and may have different eating habits.
To control for this, we adjust our model by introducing these confounders:
![](https://towardsdatascience.com/wp-content/uploads/2025/02/1_sbgE9FXGeZJfaNQ5UkjgQg-1-1024x627.png)
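Data sources
The data comes from two additional sources, both cited in full under Data Sources at the end of the article: GDP per capita from the World Bank and median age from the UN World Population Prospects, both processed by Our World in Data.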
![](https://towardsdatascience.com/wp-content/uploads/2025/02/1_8_TzM7i6R2aNprgjhJGxLw-1.png)
![](https://towardsdatascience.com/wp-content/uploads/2025/02/1_8_TzM7i6R2aNprgjhJGxLw-1-1.png)
OLS regression (with confounders)
After controlling for GDP per capita and median age, we run a multiple regression to test whether apple consumption has any meaningful effect on doctor visits.
![](https://towardsdatascience.com/wp-content/uploads/2025/02/1_JacElFNEjj7kNTtgJzlZvQ-1-1024x500.png)
The results confirm what we observed earlier:
- The coefficient for apple consumption remains very small (-0.0100), meaning any potential effect is negligible.
- The p-value (85.5%) is still extremely high, far from statistical significance.
- We still cannot reject the null hypothesis, meaning we have no strong evidence to support the idea that eating more apples leads to fewer doctor visits.
As before, this doesn’t necessarily mean that no relationship exists, but rather that we cannot demonstrate one using the available data. It is still possible that the real effect is too small to detect or that there are other factors we didn’t include.
One interesting observation, however, is that GDP per capita also shows no significant relationship with doctor visits: its p-value is 0.668 (66.8%), so the data gives us no evidence that wealth explains variations in healthcare utilization.
On the other hand, median age appears to be strongly associated with doctor visits, with a p-value of 0.001 (0.1%) and a positive coefficient (0.4952). This suggests that older populations tend to visit doctors more frequently, which is not really surprising if we think about it!
So while we find no support for the apple myth, the data does reveal an interesting relationship between aging and healthcare utilization.
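To make "controlling for confounders" concrete, here is a minimal standard-library sketch of a multiple regression solved through the normal equations. The helpers `solve_linear_system` and `multiple_ols` are hypothetical names, and the six data rows are synthetic: the outcome is constructed so that only median age matters, which lets us check that the fit recovers the assumed coefficients. It is not the article's data or code.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b_i] for row, b_i in zip(a, b)]  # augmented matrix copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular design matrix (collinear predictors?)")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back-substitution
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def multiple_ols(rows, ys):
    """OLS with an intercept. rows: one list of predictors per observation.

    Returns [intercept, b1, b2, ...] by solving (X'X) b = X'y.
    """
    design = [[1.0] + list(r) for r in rows]
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * y for d, y in zip(design, ys)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Synthetic rows: (apples kg per capita, GDP per capita in $1,000s, median age).
rows = [
    (5.0, 20.0, 30.0),
    (12.0, 45.0, 42.0),
    (20.0, 30.0, 38.0),
    (8.0, 55.0, 45.0),
    (15.0, 10.0, 28.0),
    (10.0, 40.0, 35.0),
]
# Assumed "true" coefficients for the synthetic outcome: only age has an effect.
true_coefs = [2.0, 0.0, 0.0, 0.5]  # intercept, apples, GDP, median age
visits = [
    true_coefs[0] + sum(c * v for c, v in zip(true_coefs[1:], row)) for row in rows
]

coefs = multiple_ols(rows, visits)
for name, value in zip(["intercept", "apples", "gdp", "median_age"], coefs):
    print(f"{name:>10}: {value:.4f}")
```

Each coefficient is the estimated change in the outcome for a one-unit change in that predictor while the others are held fixed. A library such as statsmodels would normally be used for this and would also report the p-values quoted above.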
Median age → Doctor visits
The results from the OLS regression showed a strong relationship between median age and doctor visits, and the visualization below confirms this trend.
![](https://towardsdatascience.com/wp-content/uploads/2025/02/1_tSJRxjNGMDJIC2J0eiO1gQ-1.png)
There’s a clear upward trend, indicating that countries with older populations tend to have more doctor visits per capita.
Since we’re only looking at median age and doctor visits here, one could argue that GDP per capita might be a confounder, influencing both. However, the previous OLS regression showed that even when GDP was included in the model, this relationship remained strong and statistically significant.
This suggests that median age is a key factor in explaining variations in doctor visits across countries, independent of GDP.
GDP → Apple consumption
While not directly related to doctor visits, an interesting secondary finding emerges when looking at the relationship between GDP per capita and apple consumption.
One possible explanation is that wealthier countries have better access to fresh produce. Another possibility is that climate and geography play a role: many high-GDP countries may be located in regions with strong apple production, making apples more accessible and affordable.
Of course, other factors could be influencing this relationship, but we won’t dig deeper here.
![](https://towardsdatascience.com/wp-content/uploads/2025/02/1_bQb7OA73r8kp2N9S2VKZkw-1.png)
The scatterplot shows a positive correlation: as GDP per capita increases, apple consumption also tends to rise. However, compared to median age and doctor visits, this trend is weaker, with more variation in the data.
![](https://towardsdatascience.com/wp-content/uploads/2025/02/1_4h-xkAl6mjh_DXKDM6L1Eg-1.png)
The OLS confirms the relationship: with a coefficient of 0.2257 for GDP per capita, we can estimate an increase of around 0.23 kg in apple consumption per capita for each increase of $1,000 in GDP per capita.
The 3.8% p-value allows us to reject the null hypothesis at the 5% level, so the relationship is statistically significant. However, the R² value (0.145) is relatively low, so while GDP explains some variation in apple consumption, many other factors likely contribute.
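As a small worked example of how to read these numbers, the sketch below turns the reported coefficient into a predicted difference in apple consumption and converts R² into an implied correlation. The second step assumes GDP per capita is the only predictor in this regression (in which case R² equals the squared Pearson correlation); the article does not state this explicitly, so treat it as an assumption.

```python
from math import sqrt

# Values reported in the article.
GDP_COEFFICIENT = 0.2257  # kg of apples per capita per $1,000 of GDP per capita
R_SQUARED = 0.145


def predicted_apple_change_kg(gdp_difference_usd):
    """Predicted difference in apple consumption for a given GDP difference."""
    return GDP_COEFFICIENT * gdp_difference_usd / 1_000


# Two countries whose GDP per capita differs by $10,000 (hypothetical gap).
change = predicted_apple_change_kg(10_000)
print(f"predicted difference: {change:.2f} kg per capita")

# ASSUMPTION: single-predictor regression, so |r| = sqrt(R^2).
implied_abs_r = sqrt(R_SQUARED)
print(f"implied |r|: {implied_abs_r:.2f}")
```

Under that assumption, the implied correlation is moderate at best, which matches the visible scatter in the plot.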
Conclusion
The saying goes:
“An apple a day keeps the doctor away,”
But after putting this myth to the test with real-world data, the results don’t seem to support it. Across multiple years, the results were consistent: no meaningful relationship between apple consumption and doctor visits emerged, even after controlling for confounders. It seems that apples alone aren’t enough to keep the doctor away.
However, this doesn’t completely disprove the idea that eating more apples could reduce doctor visits. Observational data, no matter how well we control for confounders, can never fully prove or disprove causality.
To get a more statistically reliable answer, and to rule out all possible confounders at a level of granularity that could be actionable for an individual, we would need to conduct an A/B test.
In such an experiment, participants would be randomly assigned to two groups, for example one eating a fixed amount of apples daily and the other avoiding apples. By comparing doctor visits over time between these two groups, we could determine whether any difference arises, providing stronger evidence of a causal effect.
For obvious reasons, I chose not to go that route. Hiring a bunch of participants would be expensive, and forcing people to avoid apples for science is ethically questionable.
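For readers curious what the randomized design described above might look like in code, here is a minimal sketch using only the standard library. Everything in it is hypothetical: the helper names, the 200 participants, and the simulated visit counts (drawn from the same distribution for both groups, so no apple effect is built in). The permutation test is one simple way to compare two groups; it is not a method used in the article.

```python
import random


def assign_groups(participant_ids, rng):
    """Randomly split participants into two equally sized groups."""
    shuffled = list(participant_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


def mean(values):
    return sum(values) / len(values)


def permutation_p_value(group_a, group_b, rng, n_permutations=2000):
    """Two-sided permutation test on the difference in group means."""
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # reassign group labels at random
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            extreme += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (extreme + 1) / (n_permutations + 1)


rng = random.Random(7)  # fixed seed so the run is reproducible
apple_ids, control_ids = assign_groups(range(200), rng)

# Simulated yearly doctor visits (made-up distribution, same for both groups).
apple_visits = [rng.gauss(6.0, 2.0) for _ in apple_ids]
control_visits = [rng.gauss(6.0, 2.0) for _ in control_ids]

p_value = permutation_p_value(apple_visits, control_visits, rng)
print(f"difference in means: {mean(apple_visits) - mean(control_visits):.3f}")
print(f"permutation p-value: {p_value:.3f}")
```

Because assignment is random, confounders such as age or wealth are balanced across the two groups on average, which is what allows a difference between them to be read causally.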
However, we did find some interesting patterns. The strongest predictor of doctor visits wasn’t apple consumption, but median age: the older a country’s population, the more often people see a doctor.
Meanwhile, GDP showed a mild connection to apple consumption, possibly because wealthier countries have better access to fresh produce, or because apple-growing regions tend to be more developed.
So, while we can’t confirm the original myth, we can offer a less poetic, but data-backed version:
“A young age keeps the doctor away.”
If you enjoyed this analysis and want to connect, you can find me on LinkedIn.
The full analysis is available in this notebook on GitHub.
Data Sources
Fruit Consumption: Food and Agriculture Organization of the United Nations (2023), with major processing by Our World in Data. “Per capita consumption of apples - FAO” [dataset]. Food and Agriculture Organization of the United Nations, “Food Balances: Food Balances (-2013, old methodology and population)”; Food and Agriculture Organization of the United Nations, “Food Balances: Food Balances (2010-)” [original data]. Licensed under CC BY 4.0.
Doctor Visits: OECD (2024), Consultations, URL (accessed on January 22, 2025). Licensed under CC BY 4.0.
GDP per Capita: World Bank (2025), with minor processing by Our World in Data. “GDP per capita - World Bank - In constant 2021 international $” [dataset]. World Bank, “World Bank World Development Indicators” [original data]. Retrieved January 31, 2025 from https://ourworldindata.org/grapher/gdp-per-capita-worldbank. Licensed under CC BY 4.0.
Median Age: UN, World Population Prospects (2024), processed by Our World in Data. “Median age, medium projection - UN WPP” [dataset]. United Nations, “World Population Prospects” [original data]. Licensed under CC BY 4.0.
All images, unless otherwise noted, are by the author.