A rocky road from SNPs to heritability

Wanted H2 poster from http://genes2brains2mind2me.com
Jan 5 2016

Posted In:

Research, Faculty, Students

Biology graduate student Siddarth Krishna Kumar and his advisor Professor Shripad Tuljapurkar, with Professors Marcus Feldman (Biology) and David Rehkopf (Medicine), have shown that Genome-wide Complex Trait Analysis (GCTA) does not reliably estimate the heritability of complex traits. This work was recently published in the Proceedings of the National Academy of Sciences.

Human traits – such as height, blood pressure or disease susceptibility – are the product of genes and environment. The effect of genes is measured by heritability: in a population, the fraction of trait variation among individuals that is due to genes. Genome Wide Association Studies (GWAS) can now document millions of Single Nucleotide Polymorphisms (SNPs) in thousands of people. For many traits, GWAS identify hundreds of associated SNPs, yet such associations often explain little of the trait variation among people.
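
The definition of heritability above can be written as a ratio of variances. The notation here is an assumption for illustration, not taken from the paper: the symbols are ours, and the decomposition of trait variance into a genetic part plus an environmental part assumes no gene-environment covariance or interaction.

```latex
% sigma_G^2: trait variance due to genes
% sigma_E^2: trait variance due to environment
% sigma_P^2: total trait variance in the population
H^2 = \frac{\sigma_G^2}{\sigma_P^2},
\qquad
\sigma_P^2 = \sigma_G^2 + \sigma_E^2

% Hypothetical numbers: sigma_G^2 = 30, sigma_E^2 = 20
H^2 = \frac{30}{30 + 20} = 0.6
```

In this made-up example, 60% of the variation in the trait among individuals would be attributed to genes.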

In contrast, a new and popular method, GCTA (Genome-wide Complex Trait Analysis), typically finds high heritability for many complex traits. We show that GCTA yields unreliable estimates of heritability.

In the best case, GCTA’s many assumptions hold (unlikely, as discussed below). Even so, we show (mathematically) that GCTA’s estimates of heritability are likely to be unstable. We also use simulated data to show that the parameter estimates obtained from GCTA are sensitive to the SNPs used in the study. So even in this best-possible scenario, a GCTA heritability estimate is unreliable.
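
The kind of sensitivity check described above can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not the authors' analysis and not the GCTA software: the genotypes are simulated and independent, all function names are hypothetical, and heritability is estimated with a simple Haseman-Elston regression on a genetic relatedness matrix, swapped in for the likelihood-based fit that GCTA uses. Because the simulated SNPs are independent, each subset captures only part of the genetic signal; the point is only to show how the estimate moves when the SNP set changes.

```python
import numpy as np

def simulate_genotypes(n_people, n_snps, rng):
    """Draw 0/1/2 allele counts for unrelated individuals (made-up data)."""
    freqs = rng.uniform(0.2, 0.8, size=n_snps)
    return rng.binomial(2, freqs, size=(n_people, n_snps)).astype(float)

def standardize(genotypes):
    """Center and scale each SNP column to mean 0 and variance 1."""
    centered = genotypes - genotypes.mean(axis=0)
    return centered / genotypes.std(axis=0)

def relatedness_matrix(z):
    """Genetic relatedness matrix A = Z Z^T / (number of SNPs)."""
    return z @ z.T / z.shape[1]

def he_regression(grm, trait):
    """Haseman-Elston slope: regress y_i * y_j on A_ij over pairs i < j."""
    y = (trait - trait.mean()) / trait.std()
    i, j = np.triu_indices(len(y), k=1)
    a = grm[i, j]
    return float(np.sum(a * y[i] * y[j]) / np.sum(a * a))

def run_demo(seed=0, n_people=300, n_snps=2000, subset_size=500,
             n_subsets=5, true_h2=0.5):
    """Estimate heritability from several random SNP subsets of one sample."""
    rng = np.random.default_rng(seed)
    z = standardize(simulate_genotypes(n_people, n_snps, rng))

    # Build a trait whose genetic component has variance true_h2 (assumption).
    genetic = z @ rng.normal(0.0, 1.0, n_snps)
    genetic = genetic / genetic.std() * np.sqrt(true_h2)
    noise = rng.normal(0.0, np.sqrt(1.0 - true_h2), n_people)
    trait = genetic + noise

    estimates = []
    for _ in range(n_subsets):
        subset = rng.choice(n_snps, size=subset_size, replace=False)
        grm = relatedness_matrix(z[:, subset])
        estimates.append(he_regression(grm, trait))
    return estimates

if __name__ == "__main__":
    for k, est in enumerate(run_demo()):
        print(f"SNP subset {k}: estimate = {est:.3f}")
```

Running the script prints one estimate per SNP subset for the same people and the same trait values; the spread among them gives a rough feel for how much an estimate can depend on which SNPs enter the study.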

More generally, a central assumption in GCTA – that all individuals come from the same population – is almost certain to be violated, given the history of human migration. If the studied individuals come from many heterogeneous populations, the failure of GCTA is catastrophic, a fact that we demonstrate mathematically and illustrate numerically using real data from the Framingham Heart Study. We show that changes in the sample used for the study (people or SNPs), or errors in the trait values, produce large changes in the heritability estimated using GCTA.

Put simply, GCTA fails because it uses doubtful assumptions to transform an enormous amount of data into a few final parameters. We conclude that there is an urgent need to re-examine GCTA’s estimates of heritability, e.g., in medicine and social science, where it has been applied to issues like autism and childhood intelligence.