In 1976, psychologists Sandra Scarr and Richard Weinberg published the results of a study they had been conducting in Minnesota for several years, one of the most consequential and contested investigations in the history of behavioral science. The Minnesota Transracial Adoption Study examined Black and mixed-race children who had been adopted into white, middle-class families at an average age of about two years, and measured their IQ scores in childhood and again in adolescence. The study had a clear theoretical question: if the measured IQ gap between Black and white Americans was largely environmental in origin (a consequence of unequal educational resources, nutritional deficits, stress from discrimination, and material deprivation), then Black children raised in advantaged white households should show substantial IQ gains relative to children raised in less advantaged conditions.
The results supported that hypothesis, at least initially. In the first wave of measurement, transracially adopted children showed IQ scores substantially above what would be predicted from population norms for Black children, and in some analyses, well above the white population mean. Scarr and Weinberg concluded that the data supported a primarily environmental explanation for group differences in test scores. But the study did not end there. A follow-up by Weinberg, Scarr, and Waldman in 1992 found that by late adolescence, the adopted children's IQ scores had declined toward baseline levels, narrowing many of the initial gains. Critics of the environmental interpretation cited this as evidence for genetic influence reasserting itself as children matured. Defenders pointed to methodological complications, selection effects in who was adopted, and the ongoing environmental disadvantages the children faced even within white households. The study generated decades of argument and is still cited on both sides of debates about genes, environment, and group differences in cognition. What it unambiguously established was that the question of how genes and environments interact to shape cognitive development is empirically complex, politically explosive, and not easily resolved by any single study, however carefully designed.
That complexity is behavioral genetics in miniature: a field with sophisticated methodology, important findings, and a history so entangled with racism and eugenics that even careful scientists have to navigate contested terrain just to explain what their methods actually measure.
"Heritability is a statistic about populations, not a verdict about individuals. The same trait can be 80% heritable in one environment and 20% heritable in another." — Eric Turkheimer, Current Directions in Psychological Science (2000)
Key Definitions
Heritability is the proportion of variance in a trait within a specific population at a specific time that is statistically associated with genetic differences among individuals in that population. It is a dimensionless number between 0 and 1 (often expressed as a percentage), and it is a population statistic, not a statement about individuals or about how much any person's trait is "caused" by their genes.
Monozygotic (MZ) twins are identical twins who arise from a single fertilized egg that splits into two embryos. They share essentially 100% of their segregating genetic variants, making them the closest thing to a natural genetic clone.
Dizygotic (DZ) twins are fraternal twins who arise from two separately fertilized eggs. They share, on average, 50% of their segregating genetic variants — the same genetic similarity as any pair of full siblings.
Shared environment refers to aspects of the environment that make siblings raised together more similar to each other — the home, parenting style, neighborhood, and school quality experienced in common.
Non-shared environment refers to aspects of the environment that differ between siblings raised in the same family, including idiosyncratic peer groups, unique life events, birth order effects, and measurement error.
Gene-environment correlation (rGE) describes the tendency for genetic predispositions and environmental exposures to be statistically correlated rather than independent — for example, genetically musical people being more likely to encounter music-rich environments.
Polygenic score is a summary measure of an individual's genetic predisposition to a trait, calculated by weighting and summing thousands or millions of genetic variants identified in genome-wide association studies.
GWAS (genome-wide association study) is a research design that tests millions of genetic variants simultaneously across large samples for statistical associations with a phenotype of interest.
The Logic of Twin and Adoption Studies
Behavioral genetics rests on a set of elegant natural experiments that nature provides, most importantly the existence of twins and the existence of adoptive families. The twin study design compares the phenotypic similarity of identical twins (who share essentially all their DNA) with the similarity of fraternal twins (who share half). If identical twins are significantly more similar on a trait than fraternal twins, this implicates genetic factors in producing that similarity. If identical twins raised apart are still more similar to each other than fraternal twins raised together, the evidence for genetic influence is stronger still.
The adoption study design compares the similarity between adopted children and their biological parents (with whom they share genes but not environment) versus their adoptive parents (with whom they share environment but not genes). If adopted children resemble their biological parents more than their adoptive parents on a trait, genetic factors are indicated. These designs make specific quantitative assumptions that allow variance in a trait to be decomposed into three sources: additive genetic variance (A), shared environmental variance (C), and non-shared environmental variance (E). This ACE model is the foundation of quantitative behavioral genetics.
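One way to see how the twin comparison yields the three components is Falconer's classic approximation, a simplification of the model fitting used in practice. The sketch below assumes purely additive genetic effects, equal environments for both twin types, and no assortative mating. The function name and the example correlations are illustrative and are not taken from any study cited here.

```python
def falconer_ace(r_mz: float, r_dz: float) -> dict[str, float]:
    """Approximate ACE variance shares from twin correlations.

    Assumes MZ twins share all additive genetic variance, DZ twins share
    half of it, and both twin types share family environment equally.
    """
    a2 = 2 * (r_mz - r_dz)  # additive genetic share (A)
    c2 = 2 * r_dz - r_mz    # shared-environment share (C)
    e2 = 1 - r_mz           # non-shared environment plus measurement error (E)
    return {"A": a2, "C": c2, "E": e2}


# Hypothetical twin correlations, chosen only for illustration.
estimates = falconer_ace(r_mz=0.80, r_dz=0.50)
print({name: round(share, 2) for name, share in estimates.items()})
```

The three shares sum to one by construction, so the formulas divide the observed variance among the components without testing whether the model's assumptions hold.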
The Equal Environments Assumption
Twin studies rest on a critical assumption: that the environments of identical and fraternal twins are equally similar within each pair. If identical twins actually experience more similar environments than fraternal twins — because they are treated more alike, more often mistaken for each other, or seek out more similar companions — then the greater similarity of identical twins might partly reflect environmental rather than purely genetic causes. This is the equal environments assumption (EEA), and it has been extensively tested.
The general finding is that the EEA holds well enough not to seriously bias heritability estimates for most traits. Studies that directly measure environmental similarity find that more similar treatment of identical twins does not produce greater behavioral similarity once genetic factors are controlled. However, the EEA remains a theoretical vulnerability in the design, and critics periodically argue that violations of it lead twin studies to overestimate heritability, and so underestimate environmental influence, for specific traits.
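To see why the assumption matters in principle, the sketch below works through what would happen to a simple twin-based estimate if identical twins did experience extra environmental similarity. It uses Falconer's approximation rather than full model fitting. The parameter `extra_mz_env` and all numeric values are hypothetical.

```python
def expected_twin_correlations(a2: float, c2: float, extra_mz_env: float = 0.0):
    """Expected MZ and DZ correlations under a simple ACE model.

    extra_mz_env is a hypothetical amount of environmental similarity
    experienced only by MZ pairs, i.e. a violation of the EEA.
    """
    r_mz = a2 + c2 + extra_mz_env
    r_dz = 0.5 * a2 + c2
    return r_mz, r_dz


def falconer_h2(r_mz: float, r_dz: float) -> float:
    """Falconer's heritability estimate from twin correlations."""
    return 2 * (r_mz - r_dz)


true_a2, true_c2 = 0.40, 0.20  # hypothetical "true" variance shares
for extra in (0.0, 0.05, 0.10):
    r_mz, r_dz = expected_twin_correlations(true_a2, true_c2, extra)
    print(f"extra MZ similarity {extra:.2f}: estimated h2 = {falconer_h2(r_mz, r_dz):.2f}")
```

Under these assumptions the estimate is inflated by twice the extra MZ similarity, which is why the empirical tests of the EEA described above carry so much weight.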
What Behavioral Genetics Has Found
Intelligence
General cognitive ability (commonly called g) is the most extensively studied trait in behavioral genetics. The major findings are remarkably consistent across many countries and samples.
In adult samples from high-income Western countries, heritability estimates typically range from 60 to 80 percent. In children, estimates are lower, typically 40 to 50 percent, with shared environmental factors accounting for a meaningful portion of variance in early childhood. The finding that heritability increases and shared environmental influence decreases across development is one of the most robust in the field. It is counterintuitive: as children grow up, move through school, and eventually leave home, the environmental influence of the family appears to diminish, while genetic influences, expressed partly through the environments people select and shape for themselves, become more pronounced.
Eric Turkheimer and colleagues published a crucial qualification in 2003 in Psychological Science. Among twins from low-socioeconomic status families, the heritability of IQ was approximately 10 percent; among twins from high-SES families, it was approximately 72 percent. Shared environment accounted for 60 percent of variance in low-SES twins and near zero in high-SES twins. This finding — now called the Turkheimer interaction — is theoretically important. It shows that the magnitude of genetic effects is not fixed but depends on the range of environments sampled. When environmental deprivation restricts development, genetic potential for cognitive ability cannot fully express itself. When environments are adequately nourishing, genetic differences between individuals exert greater influence on outcomes.
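The idea that variance shares can shift with socioeconomic status can be sketched with a toy moderation model in which the genetic and shared-environment paths change linearly with SES. This is only an illustration of the logic. The coefficients below are hypothetical, are not estimates from the 2003 paper, and the function name is invented for this example.

```python
def variance_shares(ses: float) -> dict[str, float]:
    """Variance shares at a given SES level (0 = lowest, 1 = highest).

    Path coefficients change linearly with SES. All numbers are
    hypothetical and chosen only to illustrate moderation.
    """
    a = 0.3 + 0.6 * ses  # genetic path grows with SES
    c = 0.8 - 0.7 * ses  # shared-environment path shrinks with SES
    e = 0.5              # non-shared path held constant
    total = a**2 + c**2 + e**2
    return {"A": a**2 / total, "C": c**2 / total, "E": e**2 / total}


for ses in (0.0, 0.5, 1.0):
    shares = variance_shares(ses)
    print(f"SES {ses:.1f}: h2 = {shares['A']:.2f}, c2 = {shares['C']:.2f}")
```

In this toy setup the same trait shows low heritability at one end of the SES range and high heritability at the other, without any change in the underlying genes.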
Personality
For the Big Five personality dimensions (openness, conscientiousness, extraversion, agreeableness, neuroticism), heritability estimates from twin studies typically range from 40 to 60 percent. A striking finding is that the shared family environment contributes little to personality similarity among adult twins — siblings raised in the same family end up no more similar in personality than would be expected from their genetic overlap alone. This does not mean family environment does not matter; it means that the aspects of family environment that make siblings different from each other matter more than the aspects that make them similar. The non-shared environment — idiosyncratic peer groups, random life events, unique developmental experiences — accounts for a large portion of personality variance in adults.
Psychiatric Disorders
Major psychiatric disorders show substantial heritability, with estimates ranging from approximately 40 percent for major depressive disorder to approximately 80 percent for schizophrenia and bipolar disorder. These are among the most replicated findings in behavioral genetics. Schizophrenia is approximately 80 percent heritable (with some estimates higher still), yet for decades it was attributed to parenting through the "schizophrenogenic mother" and "schizophrenogenic family" theories, which paralleled the "refrigerator mother" account of autism. Overturning that attribution is one of behavioral genetics' most important corrections to clinical mythology.
GWAS for psychiatric disorders have been among the largest in behavioral genetics, involving hundreds of thousands of participants through the Psychiatric Genomics Consortium. These studies have identified hundreds of genetic variants associated with schizophrenia, bipolar disorder, ADHD, and autism, most with individually tiny effects. The polygenicity of psychiatric disorders (many variants, each contributing a small amount) has become one of the central findings of molecular behavioral genetics.
The Molecular Genetics Turn: GWAS and Polygenic Scores
Beginning in the early 2000s, the development of inexpensive genome-wide genotyping transformed behavioral genetics. Where earlier research could examine only a handful of candidate genes, GWAS now permits simultaneous testing of millions of genetic variants. The results have been surprising to some early observers: for behavioral traits, no common variant with large effect has been found for any complex psychological outcome. Everywhere researchers look, they find polygenicity — many variants, each explaining a fraction of a percent of variance.
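Testing millions of variants at once creates a multiple-comparisons problem, which is why GWAS use a much stricter significance threshold than ordinary studies. The conventional genome-wide threshold of p < 5 × 10⁻⁸ corresponds to a Bonferroni correction for roughly one million independent tests. The sketch below shows that arithmetic; the figure of one million is the usual working assumption, not a count from any particular study.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test p-value threshold that keeps the family-wise error rate at alpha."""
    return alpha / n_tests


n_tests = 1_000_000  # assumed number of effectively independent common variants
print(f"expected false positives at p < 0.05: {0.05 * n_tests:,.0f}")
print(f"Bonferroni threshold: {bonferroni_threshold(0.05, n_tests):.0e}")
```

A threshold this strict is one reason that variants with very small effects are detectable only in very large samples.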
This pattern has practical implications. Polygenic scores for educational attainment and cognitive ability, calculated from GWAS results, now explain 10 to 15 percent of variance in those traits in held-out samples, a figure that continues to increase as sample sizes grow. These scores are scientifically useful — they can identify genetically heterogeneous groups for more targeted research, and they can be used to control for genetic confounding in studies of environmental interventions. But they are not clinically useful for predicting individual outcomes and they perform poorly when applied to people of non-European ancestry, because GWAS discovery samples have been overwhelmingly European.
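A rough way to see why a score explaining 10 to 15 percent of variance is useful for research but weak for individual prediction is to convert variance explained into a correlation and a residual spread. The sketch assumes a simple linear predictor and a standardized trait; the function name is illustrative.

```python
import math


def prediction_summary(r2: float) -> tuple[float, float]:
    """Return (correlation, residual SD in trait SD units) for a linear predictor."""
    return math.sqrt(r2), math.sqrt(1.0 - r2)


for r2 in (0.10, 0.15):
    r, resid_sd = prediction_summary(r2)
    print(f"R^2 = {r2:.2f}: r = {r:.2f}, residual SD = {resid_sd:.2f}")
```

Even at the upper figure, the spread of outcomes around any one person's predicted value is still about 92 percent as wide as the spread in the whole population.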
The Missing Heritability Problem
A puzzle that emerged from early GWAS work is the "missing heritability" problem: the genetic variants identified in GWAS explain far less variance than twin studies estimate to be genetically influenced. For height, twin studies suggest 80 to 90 percent heritability, but the variants identified by early GWAS explained only about 5 to 10 percent of variance. Subsequent research has suggested that the gap is largely accounted for by three things: rare variants that genotyping arrays capture poorly; common variants whose effects are too small to reach genome-wide significance individually but that collectively add up to substantial prediction; and possibly gene-environment and gene-gene interactions. The problem has become less acute as sample sizes have grown and methods have improved, but it remains an active research area.
Debates: Robert Plomin, Kathryn Paige Harden, and the Public Discussion
Behavioral genetics entered a new phase of public debate with two contrasting books published in 2018 and 2021. Developmental psychologist Robert Plomin, who spent his career building behavioral genetics into a mature science, published Blueprint in 2018, arguing that behavioral genetics had effectively proven that families matter very little for most psychological outcomes. Plomin's reading of the literature held that shared environment accounts for essentially zero variance in most psychological traits by adulthood, and that parents therefore have little influence on who their children become (in terms of personality and cognitive ability) beyond the genes they pass on. This conclusion was greeted with enthusiasm in some quarters and sharp criticism in others.
Behavioral geneticist Kathryn Paige Harden published The Genetic Lottery in 2021, making a different argument: that the findings of behavioral genetics should be taken seriously by progressives precisely because ignoring genetic influences leaves the policy space to those who would exploit it. Harden argues that people are born with different genetic endowments, that these endowments have real consequences for life outcomes, and that a just society should compensate for these arbitrary differences in the same way it might compensate for arbitrary differences in family wealth. Eric Turkheimer, one of behavioral genetics' most distinguished methodologists, engaged extensively with Harden's argument in published reviews, agreeing that the findings of behavioral genetics are real while questioning some of her specific interpretations and policy conclusions.
The History of Misuse and Why It Still Matters
The relationship between behavioral genetics and eugenics is not merely historical. The eugenics movement that dominated early twentieth-century biology and social science drew directly on the methods and findings of early behavioral genetics. Heritability studies were used to argue that intelligence was fixed and hereditary, that poverty was largely genetic in origin, and that social policy should manage reproduction rather than improve environments. Forced sterilization laws led to the coerced sterilization of tens of thousands of Americans; Nazi racial hygiene programs sterilized hundreds of thousands of people and escalated to mass murder.
Arthur Jensen's 1969 paper in the Harvard Educational Review proposed that a 15-point IQ gap between Black and white Americans was partly genetic in origin. The subsequent controversy reshaped funding priorities, editorial policies, and the careers of many researchers. Richard Herrnstein and Charles Murray's The Bell Curve (1994) made similar arguments for a popular audience and renewed the controversy. An APA task force convened in 1995 to review the evidence on intelligence concluded that the causes of group differences in IQ scores remained unknown and unresolved — a conclusion that still accurately describes the scientific state of affairs.
Contemporary behavioral geneticists are acutely aware of this history and generally careful to distinguish findings within groups from claims about differences between groups. The distinction is not evasion; it is genuine methodological necessity. Heritability estimates based on within-group twin comparisons provide no information about what causes differences between groups. This is one of the most important things that behavioral genetics' methodological critics have gotten right.
References
- Scarr, S., & Weinberg, R. A. (1976). IQ test performance of Black children adopted by White families. American Psychologist, 31(10), 726–739. https://doi.org/10.1037/0003-066X.31.10.726
- Turkheimer, E., Haley, A., Waldron, M., D'Onofrio, B., & Gottesman, I. I. (2003). Socioeconomic status modifies heritability of IQ in young children. Psychological Science, 14(6), 623–628. https://doi.org/10.1046/j.0956-7976.2003.psci_1475.x
- Scarr, S., & McCartney, K. (1983). How people make their own environments. Child Development, 54(2), 424–435. https://doi.org/10.2307/1129703
- Plomin, R. (2018). Blueprint: How DNA Makes Us Who We Are. MIT Press.
- Harden, K. P. (2021). The Genetic Lottery: Why DNA Matters for Social Equality. Princeton University Press.
- Weinberg, R. A., Scarr, S., & Waldman, I. D. (1992). The Minnesota Transracial Adoption Study. Intelligence, 16(1), 117–135. https://doi.org/10.1016/0160-2896(92)90028-P
- Neale, M. C., & Cardon, L. R. (1992). Methodology for Genetic Studies of Twins and Families. Kluwer Academic.
- Jensen, A. R. (1969). How much can we boost IQ and scholastic achievement? Harvard Educational Review, 39(1), 1–123. https://doi.org/10.17763/haer.39.1.l3u15956627424k7
- Herrnstein, R. J., & Murray, C. (1994). The Bell Curve: Intelligence and Class Structure in American Life. Free Press.
Frequently Asked Questions
What is behavioral genetics?
Behavioral genetics is the scientific discipline that uses genetic research designs — primarily twin studies, adoption studies, and increasingly molecular genetic methods — to understand how genetic and environmental factors contribute to variation in psychological traits. The field does not ask whether genes or environment 'cause' a trait, since virtually every trait is caused by both. Instead, it asks what proportion of the variation in a trait across a population can be attributed to genetic differences between individuals, and what proportion can be attributed to environmental differences. This statistical decomposition is formalized in the concept of heritability, which is the field's central methodological tool. Behavioral genetics covers the full range of psychological outcomes: cognitive abilities like intelligence, memory, and executive function; personality dimensions like extraversion, conscientiousness, and neuroticism; psychiatric disorders like schizophrenia, bipolar disorder, major depression, and ADHD; behavioral tendencies like risk-taking, religiosity, political orientation, and even television watching habits. The field emerged in the mid-twentieth century from the intersection of quantitative genetics (which had developed mathematical tools for livestock breeding) and psychology, and has been transformed since the early 2000s by molecular genetic methods that allow researchers to directly measure millions of genetic variants in large samples. It is one of the most contested areas of science, partly because of genuine methodological disputes and partly because of its history of misuse in the service of eugenics, forced sterilization, and scientific racism.
What does heritability mean and what does it not mean?
Heritability is one of the most systematically misunderstood concepts in behavioral science. It is defined as the proportion of variance in a trait, within a specific population at a specific time, that is statistically associated with genetic differences between individuals in that population. It is a population statistic, not a statement about individuals or about the causal contribution of genes to any particular person's trait value. Several common misunderstandings follow from this definition.
First, heritability does not mean 'genetically determined.' A trait with 80% heritability is not 80% caused by genes in any individual; it means that 80% of the variation between people in that population, at that time, is associated with genetic differences.
Second, heritability can change over time and across populations. The heritability of a trait in a wealthy Swedish sample may be very different from its heritability in a malnourished Ugandan sample, and neither figure applies universally.
Third, high heritability is compatible with large environmental effects. The classic example is height: among well-nourished adult populations in wealthy countries, heritability of height approaches 90%, yet average heights in many countries increased by 10 centimeters or more over the twentieth century, largely because of improved nutrition and public health. That is an enormous environmental effect operating on a highly heritable trait.
Fourth, heritability within groups says nothing about differences between groups. The proportion of variance within a group that is associated with genetic differences does not, by itself, reveal what causes differences between groups. This is the distinction that researchers have repeatedly emphasized in debates about race and IQ.
Fifth, heritability estimates assume a given range of environments. If environments are restricted (everyone in a sample lives in very similar conditions), genetic differences will account for more of the remaining variance, not because genes matter more, but because environmental variance has been removed from the denominator.
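The height example can be made concrete with a small simulation in which two cohorts have identical genetic values but different average environments. Everything in the sketch is hypothetical: the standard deviations, the 10 cm environmental shift, and the baseline of 165 cm are chosen only to show that high within-cohort heritability and a large environmental gain can coexist.

```python
import random
import statistics


def simulate_cohort(env_mean_cm: float, n: int = 10_000, seed: int = 1):
    """Return (mean height, within-cohort heritability) for one cohort.

    Every number here is hypothetical. Reusing the seed gives both
    cohorts the same genetic values, so only the environment differs.
    """
    rng = random.Random(seed)
    genetic = [rng.gauss(0.0, 6.0) for _ in range(n)]              # SD 6 cm
    environment = [rng.gauss(env_mean_cm, 2.0) for _ in range(n)]  # SD 2 cm
    height = [165.0 + g + e for g, e in zip(genetic, environment)]
    h2 = statistics.pvariance(genetic) / statistics.pvariance(height)
    return statistics.mean(height), h2


early_mean, early_h2 = simulate_cohort(env_mean_cm=0.0)
later_mean, later_h2 = simulate_cohort(env_mean_cm=10.0)
print(f"heritability: {early_h2:.2f} (early), {later_h2:.2f} (later)")
print(f"gain in mean height: {later_mean - early_mean:.1f} cm")
```

Within each simulated cohort about 90 percent of the variance is genetic, yet the whole difference between the cohorts is environmental, which is also the logic behind the fourth point about within-group and between-group differences.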
What have twin studies found about intelligence?
Twin studies have been the primary tool for estimating the heritability of intelligence since the 1920s, and they have produced several findings that are robust across many independent samples, though contested in their interpretation. The basic design compares the similarity of identical (monozygotic, MZ) twins, who share essentially all of their DNA, with the similarity of fraternal (dizygotic, DZ) twins, who share on average half their segregating genetic variants. If identical twins are more similar in a trait than fraternal twins, genetic factors are implicated. For general cognitive ability (commonly called g and indexed by IQ scores), the major findings are as follows. Heritability estimates in adult samples in high-income Western countries typically range from 60 to 80 percent, with some studies finding heritability as high as 86 percent for older adults. In children, heritability estimates are lower, typically 40 to 50 percent, with the shared environment (aspects of the family environment that make siblings similar to each other) accounting for a larger share. This pattern of heritability increasing and shared environmental influence decreasing across development is one of the most replicated findings in behavioral genetics, and it is counterintuitive. The usual explanation is that as people move through life and gain more control over their environments, they increasingly select and shape environments that match their genetic predispositions, amplifying genetic effects. Eric Turkheimer and colleagues published a crucial qualifying study in 2003 (Psychological Science) showing that the balance between genetic and shared environmental influence depends on socioeconomic status. In their sample of young twins, heritability of IQ among families living in poverty was approximately 10 percent and shared environment accounted for 60 percent, whereas among affluent families heritability was approximately 72 percent and shared environment was near zero. This 'Turkheimer interaction' is one of the most important findings in the field, demonstrating that the magnitude of genetic effects depends critically on environmental context.
How does the environment interact with genes?
The relationship between genes and environment is more intricate than the simple question 'which matters more?' suggests. Developmental psychologists Sandra Scarr and Kathleen McCartney proposed an influential framework in a 1983 paper in Child Development describing three types of gene-environment correlation (rGE), meaning ways in which genetic tendencies and environmental exposures become correlated rather than independent.
Passive rGE occurs because children share genes and environments with their biological parents: a genetically musical child is likely to have musical parents who also provide a music-rich home environment. The child receives both the genetic predisposition and the environmental stimulus from the same source, making it impossible to separate them cleanly.
Evocative rGE occurs when a person's genetically influenced characteristics evoke particular responses from others: a genetically sociable child receives more social engagement from peers and adults, which further develops social skills. The environment is responding to genetically influenced characteristics.
Active rGE (also called niche-picking) occurs when people seek out environments that match their genetic predispositions: a genetically intellectually curious adolescent seeks books, stimulating conversation, and academic environments, while a less curious peer does not. This active process becomes increasingly important as children gain autonomy, which helps explain why shared environmental influence on intelligence declines during adolescence.
Beyond gene-environment correlation, there are also gene-environment interactions (GxE), where the same genotype produces different outcomes in different environments, and the same environment produces different outcomes in different genotypes. The Turkheimer interaction described in twin studies of poverty and IQ is a gene-environment interaction. Molecular genetic studies have begun to identify specific variants involved in such interactions, though the field is methodologically challenging and early GxE findings often fail to replicate.
What is a polygenic score?
A polygenic score (also called a polygenic index or PGI) is a numerical summary of an individual's genetic predisposition to a trait, calculated by adding up the estimated effects of thousands or millions of genetic variants (single nucleotide polymorphisms, or SNPs) across the genome. The weights used in the calculation come from genome-wide association studies (GWAS), which test millions of genetic variants simultaneously in large samples (often hundreds of thousands of individuals) for statistical associations with a phenotype. For a trait like educational attainment, a GWAS might identify thousands of variants, each with a tiny individual effect: the largest single variants found so far explain less than 0.1% of the variance in educational attainment. But summed into a polygenic score, these variants collectively explain a modest but meaningful portion of individual differences. Polygenic scores for cognitive ability and educational attainment are now predictive enough to be scientifically useful: they explain roughly 10 to 15 percent of variance in educational attainment in the populations where they were developed, which is more than most single socioeconomic predictors. They also show that GWAS hits for educational attainment overlap substantially with GWAS hits for cognitive ability, depression, and other traits, revealing a genetic architecture of complex psychological traits that spans many biological pathways.
However, polygenic scores have important limitations. They are developed primarily in European-ancestry populations and lose predictive power when applied to people of other ancestries, because allele frequencies and the patterns of correlation between neighboring variants (linkage disequilibrium) differ across populations. They predict group averages in research but are not diagnostically useful for individuals. And they capture genetic predispositions, not fixed destinies: a high polygenic score for educational attainment increases the statistical likelihood of higher educational attainment but does not determine it.
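The calculation described above amounts to a weighted sum, which can be sketched in a few lines. The variant identifiers, weights, and genotype below are invented for illustration. Real scores use thousands to millions of variants and usually adjust the weights for correlation between neighboring variants, a step this sketch omits.

```python
def polygenic_score(allele_counts: dict[str, int], weights: dict[str, float]) -> float:
    """Weighted sum of effect-allele counts (0, 1, or 2 per variant).

    Variants missing from allele_counts are treated as 0 copies, which is
    a simplification. Real pipelines handle missing genotypes explicitly.
    """
    return sum(weights[snp] * allele_counts.get(snp, 0) for snp in weights)


# Hypothetical per-allele effect sizes, as a GWAS might report them.
gwas_weights = {
    "rs_hypothetical_1": 0.020,
    "rs_hypothetical_2": -0.010,
    "rs_hypothetical_3": 0.015,
}
# Hypothetical genotype: number of effect alleles carried at each variant.
person = {"rs_hypothetical_1": 2, "rs_hypothetical_2": 1, "rs_hypothetical_3": 0}

print(round(polygenic_score(person, gwas_weights), 3))
```

A raw score like this has no meaning on its own. In practice it is standardized against a reference sample, which is one reason scores transfer poorly between ancestry groups.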
What is the history of behavioral genetics' relationship to eugenics?
The history of behavioral genetics is inseparable from the history of eugenics, and this history is not merely background context: it actively shaped which research was funded, which populations were studied, and which findings were amplified in public discourse. The eugenics movement, founded by Francis Galton in the 1880s (Galton coined the word 'eugenics' and was a pioneer of statistical methods still used in genetics today), proposed that human society should deliberately manage reproduction to improve the genetic composition of the population. This idea was enthusiastically adopted across the political spectrum in the early twentieth century, including by many progressive reformers who saw it as scientific social planning.
The consequences were catastrophic. By the 1930s, more than 30 American states had enacted forced sterilization laws targeting people deemed mentally deficient, insane, or criminally inclined, ultimately resulting in the coerced sterilization of approximately 60,000 Americans. Nazi Germany's Rassenhygiene (racial hygiene) programs drew explicit inspiration from American eugenics, scaling coerced sterilization to over 400,000 individuals and eventually incorporating mass murder into the biological project. The postwar scientific consensus repudiated both the scientific claims (behavioral genetics had been badly corrupted by ideologically motivated reasoning) and the ethical framework of eugenics.
But the field could not cleanly separate itself. In 1969, educational psychologist Arthur Jensen published a paper in the Harvard Educational Review arguing that Black-white differences in IQ scores were partly genetic in origin. The scientific response was overwhelmingly critical; the methodology was inadequate and the conclusion unwarranted. In 1994, Richard Herrnstein and Charles Murray published The Bell Curve, reviving claims about race, IQ, and genetics in a popular-audience book that became a political flashpoint. The American Psychological Association convened a task force that concluded the Black-white IQ gap was real and substantial but that its causes, including any contribution of genetic factors, remained unknown and contested. That remains the scientific consensus today. Contemporary behavioral geneticists such as Kathryn Paige Harden, in her 2021 book The Genetic Lottery, argue that progressives make a mistake in refusing to engage with behavioral genetics, since ignoring genetic influences does not make them go away and leaves the field's findings to be weaponized by bad actors without effective scientific rebuttal.