The Missing Heritability Problem: Are We About to Overturn 100 Years of Research?

Nearly a century of scientific research, based on hundreds of studies of twins (identical and fraternal), has been thrown into question. The foundations of what we know — or think we know — about heritability are at stake. Are we about to overturn 100 years of research? This is the controversy that has followed from the "missing heritability problem," which is a puzzle that has pitted researchers against one another in one of the more intense debates of modern science. We’re going to explore it in this article.

What is Heritability?

First, let's break down what ‘heritability’ really means. Many people assume that heritability measures how "genetic" rather than "environmental" a trait is, and that's not completely wrong, but the truth is more subtle than that. Heritability measures how much of the variation in a trait (such as height) within a specific group of people can be explained by genetic differences. For example, if scientists say that the heritability of height is 70% in a certain population, they’re saying that 70% of the variation in height among people in that group is due to genetic differences between them.

However, even if that is true for a specific group, and you're a member of that group, that doesn’t mean that 70% of your height is determined by your genes, and the other 30% by something else. Heritability is about groups, not individuals. So, in a population where height has a heritability of 70%, most of the variation in height among different people is due to genetic differences, but this doesn’t tell us exactly how much of one person’s height is due to their genes.

It’s also crucial to understand that high heritability doesn’t mean that a trait is fixed or unchangeable. For instance, height is usually considered highly heritable, yet it's well known that it is also affected by environmental factors, such as nutritional deficiencies and childhood infections. Even a trait that is entirely heritable (for instance, a rare genetic disorder) could come to be changeable through medical technology.

Also, a trait having low heritability doesn’t mean that it has nothing to do with genetics. Suppose that we lived in a world where all humans were born with a specific gene that gave us three hands. In that case, having three hands would be caused by genetics (it's a trait due to a gene) and the only way to lack three hands in such a world would be due to environmental effects (e.g., losing one of the three hands in an accident). In such a world, since heritability measures the proportion of variation in a trait that is due to genetic differences, and there are no genetic differences causing variation in hand numbers (everyone has the same gene causing three hands), the heritability of having three hands would be zero. This shows us that there's a sense in which all of our traits are "genetic" - our genes are what make us human rather than mice, insects or plants - but they are not all highly heritable.
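To make the "share of variation" idea concrete, here is a minimal Python sketch. It treats heritability as the variance of people's genetic contributions divided by the variance of the trait itself, which is a simplification that assumes genes and environment act independently. The function name and all of the numbers are made up for illustration; real studies cannot observe the genetic contribution directly like this.

```python
import random
import statistics


def heritability(genetic_values, trait_values):
    """Share of trait variance accounted for by variance in genetic values."""
    trait_variance = statistics.pvariance(trait_values)
    if trait_variance == 0:
        raise ValueError("The trait does not vary in this population.")
    return statistics.pvariance(genetic_values) / trait_variance


# Population 1 (hypothetical): heights in cm, constructed so that genetic
# differences account for roughly 70% of the variance.
rng = random.Random(0)
n_people = 50_000
total_variance = 49.0  # assumed: a standard deviation of 7 cm
genetic_sd = (0.7 * total_variance) ** 0.5
environment_sd = (0.3 * total_variance) ** 0.5
genetic_part = [rng.gauss(0, genetic_sd) for _ in range(n_people)]
heights = [170 + g + rng.gauss(0, environment_sd) for g in genetic_part]
height_h2 = heritability(genetic_part, heights)

# Population 2: the three-hands world. Everyone's genes specify three hands,
# so the genetic values do not vary at all; 1 person in 100 loses a hand.
hands_from_genes = [3] * 1000
hands_observed = [2 if i % 100 == 0 else 3 for i in range(1000)]
hands_h2 = heritability(hands_from_genes, hands_observed)

print(f"height: {height_h2:.2f}")
print(f"three hands: {hands_h2:.2f}")
```

The height figure should land close to 0.70 (it varies slightly with the random draw), while the three-hands figure is exactly zero: the trait is caused by a gene, but no genetic differences exist to explain the variation.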

So, as we can see, heritability measures how much of the variability of a trait is explained by genetics in a particular population at a particular time, not whether a trait is changeable, or even whether genes determine a trait.

What is the Missing Heritability Problem?

A classical and very commonly used way to estimate the heritability of a trait is to compare how similar that trait is among identical twins relative to how similar it is among fraternal (non-identical) twins, with studies usually restricted to same-sex fraternal pairs.

The logic is that since identical twins have ~100% of their genes in common, whereas fraternal twins have only ~50% (considering just those genes that are not typically shared between all humans), if identical twins are more similar to each other for a trait than fraternal twins, it’s probably due to genetics.

For those who like to know a little of the math involved: this can be formalized with Falconer’s formula, which says that if the correlation of a trait (say, height) is ri for identical twins and rf for fraternal twins, then, under certain assumptions:

heritability = 2 * (ri - rf)

This means that if the correlation between pairs of identical twins for a trait is the same as the correlation between pairs of fraternal twins for that trait, then ri = rf, so heritability is estimated to be 0.
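As a worked example, Falconer's formula can be written as a few lines of Python. The two extra quantities returned (shared and unique environment) are the usual companions to the formula under the classical twin model; they are not discussed in this article and are included only as an assumption-laden illustration. The correlations used are hypothetical.

```python
def falconer_estimates(r_identical, r_fraternal):
    """Apply Falconer's formula plus the companion environment estimates.

    Assumes the classical twin model, in which identical twins share all
    of their segregating genes, fraternal twins share half, and both kinds
    of twins share their environment to the same degree.
    """
    heritability = 2 * (r_identical - r_fraternal)
    shared_environment = 2 * r_fraternal - r_identical
    unique_environment = 1 - r_identical
    return heritability, shared_environment, unique_environment


# Hypothetical twin correlations, for illustration only.
h2, c2, e2 = falconer_estimates(r_identical=0.80, r_fraternal=0.50)
print(f"heritability={h2:.2f}, shared env={c2:.2f}, unique env={e2:.2f}")

# Equal correlations give an estimate of zero.
h2_equal, _, _ = falconer_estimates(r_identical=0.45, r_fraternal=0.45)
print(f"heritability with equal correlations={h2_equal:.2f}")
```

With these made-up inputs, the estimate is 0.60 for heritability and 0.20 each for shared and unique environment, and the three always sum to 1.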

If we use the method of comparing identical and fraternal twins, we get heritability results like those shown below, which Spencer Greenberg compiled from academic papers:

Notice how, according to twin studies, not only are heritabilities substantial for things people usually think of as being "genetic" (like eye color and height), but they are also substantial for personality traits, mental health disorders, body mass index, and even political views.

However, there is another, very different way to estimate heritability, which the revolution in DNA sequencing technology has made possible. Rather than comparing the traits of twins, people’s genetics can be measured directly, and their traits can then be predicted from their DNA.

The more accurate the DNA-based predictions are, the more heritable a trait is (all else equal).
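The link between prediction accuracy and heritability can be illustrated with a toy simulation. This is a sketch under strong assumptions (a purely additive trait, 50 variants, made-up effect sizes), not a description of how real DNA-based methods work, and every name in it is hypothetical. It builds a trait whose true heritability is 0.5, then checks how well a DNA-based score predicts it when the genetic effects are known perfectly versus estimated imperfectly.

```python
import random


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def variance(values):
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


rng = random.Random(1)
n_people, n_variants, true_h2 = 2000, 50, 0.5

# Each person carries 0, 1, or 2 copies of each variant (frequency 0.5).
genotypes = [
    [(rng.random() < 0.5) + (rng.random() < 0.5) for _ in range(n_variants)]
    for _ in range(n_people)
]
true_effects = [rng.gauss(0, 1) for _ in range(n_variants)]


def score(weights):
    """Weighted sum of variant counts for every person."""
    return [sum(w * g for w, g in zip(weights, person)) for person in genotypes]


genetic_values = score(true_effects)
# Choose the environmental noise so genes explain true_h2 of the variance.
noise_sd = (variance(genetic_values) * (1 - true_h2) / true_h2) ** 0.5
traits = [g + rng.gauss(0, noise_sd) for g in genetic_values]

# Imperfect knowledge of the effects, as with limited data or methods.
noisy_effects = [b + rng.gauss(0, 1) for b in true_effects]

r2_perfect = pearson_r(genetic_values, traits) ** 2
r2_noisy = pearson_r(score(noisy_effects), traits) ** 2
print(f"variance explained, perfect effects: {r2_perfect:.2f}")
print(f"variance explained, noisy effects:   {r2_noisy:.2f}")
```

In this toy setup, the score built from perfectly known effects explains roughly the true heritability, while the score built from noisy effects explains noticeably less. That is one reason a DNA-based prediction can understate how heritable a trait is.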

If both approaches are trying to measure what amounts to the same thing, we'd expect their estimates to be similar. But, in the early days of these new methods, the heritability estimates were extremely low compared to the twin studies, which was puzzling.

As DNA technology has improved, the statistical approaches for making these predictions have advanced, and the data sets used to train these algorithms have grown, the DNA-based methods have produced higher heritability estimates than before.

But here’s the problem: even with all these advancements, the DNA-based methods still usually predict much lower heritabilities than the twin studies - often less than half of what the twin studies say! This is called the “Missing Heritability” problem.

See, for instance, the charts below, which compare the heritability estimates from various DNA-based methods with those from twin studies. The DNA-based estimates are often 50% (or even less) of what is found via twin-based methods!

Source: http://gusevlab.org/projects/hsq/

And:

Source: https://doi.org/10.1371/journal.pgen.1008222

Why would the different approaches lead to such different results? Well, both approaches rely on assumptions. For instance, the twin-based method assumes that:

The shared “environment” for identical twins is not more similar than the environment for fraternal twins
There is not a substantial amount of “assortative mating”, where parents end up more similar than expected by chance because they seek out traits that they themselves have (such as people with college degrees seeking out others with college degrees)
Genes do not substantially change the probability of being exposed to different environments
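To see why these assumptions matter, here is a small Python sketch of how the twin-based estimate shifts when the first two are violated. It assumes a simple additive model in which twin correlations are the sum of a genetic share and a shared-environment share. The function names, parameters, and numbers are all hypothetical, and the third assumption (genes influencing environmental exposure) is not modeled.

```python
def twin_correlations(additive, shared_env,
                      extra_identical_env=0.0, fraternal_gene_sharing=0.5):
    """Expected twin correlations under a simple additive model.

    extra_identical_env: additional environmental similarity that only
        identical twins experience (zero if the first assumption holds).
    fraternal_gene_sharing: fraction of trait-relevant genetic variation
        shared by fraternal twins (0.5 without assortative mating).
    """
    r_identical = additive + shared_env + extra_identical_env
    r_fraternal = fraternal_gene_sharing * additive + shared_env
    return r_identical, r_fraternal


def falconer(r_identical, r_fraternal):
    return 2 * (r_identical - r_fraternal)


true_h2 = 0.40  # hypothetical true heritability
scenarios = {
    "assumptions hold": {},
    "identical twins share more environment": {"extra_identical_env": 0.10},
    "assortative mating": {"fraternal_gene_sharing": 0.60},
}

estimates = {}
for label, overrides in scenarios.items():
    ri, rf = twin_correlations(additive=true_h2, shared_env=0.20, **overrides)
    estimates[label] = falconer(ri, rf)
    print(f"{label}: estimated {estimates[label]:.2f} vs true {true_h2:.2f}")
```

In this toy model, extra environmental similarity between identical twins pushes the estimate above the true value (0.60 instead of 0.40), while assortative mating pushes it below (0.32), so the violations do not all bias the result in the same direction.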

On the other hand, the DNA methods make their own assumptions. These assumptions depend on the exact method applied, but a common one is that there aren’t “rare genetic variants” that contribute substantially to heritability.

It is plausible that the earliest DNA methods were producing underestimates of heritability. The techniques have improved a lot since then, but, like twin methods, they still have to make some assumptions.

So what should we say, in the face of this problem?

The stakes are high. If the DNA-based methods are accurate, then almost a century of research based on twin studies might need to be reconsidered. But it's not that simple. The controversy has given rise to multiple schools of thought, each fiercely defending its position.

Three Sides of the Debate

What has emerged from this controversy is a debate with three main sides, regarding the heritability estimates of traits:

The Twin Study Advocates

One group argues that twin studies remain the best model we have. They believe that the limitations of DNA methods — such as the constraints of current technology and statistical models — are the reasons for the discrepancy. They point out that as these methods have improved, the heritability estimates have risen, suggesting that perhaps with further advancements, the DNA-based estimates might eventually align with those from twin studies.

The DNA Proponents

Another group argues that twin studies have inherent flaws that cause them to overestimate heritability. They suggest that the lower estimates from DNA-based methods might actually be more accurate. This camp often cites concerns that twin studies make assumptions that are probably not true, such as that identical twins don't have a more similar environment than fraternal twins.

The Middle Ground

A third perspective suggests that both approaches have their own flaws — twin studies may overestimate heritability, while DNA-based methods may underestimate it. According to this view, the truth likely lies somewhere in between, and it’s possible that the missing heritability problem will remain unsolved until we develop even more sophisticated methods of analysis.

So, as we stand on the precipice of potentially overturning decades of research, there is a divide in the scientific community. Hopefully, high-quality science will do what it is made to do, and a new consensus will emerge. But if DNA-based methods turn out to be more accurate, nearly 100 years of heritability estimates may need to be thrown out!

If you want to challenge your own understanding of the world and find your own misconceptions, why not try our Common Misconceptions Quiz? You’ll be asked to separate fact from B.S. in a fun quiz that asks you to identify the misconceptions among 30 common beliefs.
