When you were conceived, you received ~3.2 billion base pairs of DNA—half from each parent. Within that DNA are ~4-5 million variants that differ from the reference genome.
Where did they come from?
Two sources:
Inherited variants (~99.98%): Already in your parents’ genomes, passed through sperm and egg, traveling through populations for generations
De novo variants (~0.02%): Brand new mutations—occurred for the first time in you, not present in either parent
Understanding this distinction is crucial for genetics, medicine, and evolution. This chapter explores both.
You received one chromosome copy from each parent at every position. At heterozygous sites (where a parent has two different alleles), you randomly got one or the other.
Example—CFTR gene position:
{@html `Mother: A/G (heterozygous) You inherit: Either A or G (50% each)
Father: G/G (homozygous) You inherit: G (100%)
Your possible genotypes:
→ A/G (50% chance)
→ G/G (50% chance)
`}
This is Mendelian segregation directly observable in DNA sequences.
When you inherit a DNA segment, you get an exact copy—identity by descent (IBD). This reveals relationships:
Relationship | % IBD Shared | Why |
---|---|---|
Parent-child | 50% | You got one chromosome copy from them at every position |
Siblings | ~50% (varies) | Each randomly got one of two parental chromosomes |
Grandparent-grandchild | ~25% | Two generations of 50% transmission |
First cousins | ~12.5% | Four generations from common ancestor |
Sibling variation explained:
At each position, you and a sibling each randomly received one parental chromosome:
Average: 50% IBD, but actual sharing varies (45-55% typical range)
Chromosomes don’t pass intact. During meiosis, recombination (crossing over) exchanges segments between the two parental chromosomes before transmission.
Result: Your mother’s chromosome 1 is a mosaic:
{@html `Position 1-50 Mb: From maternal grandmother
Position 50-200 Mb: From maternal grandfather
Position 200-250 Mb: From maternal grandmother
`}
Impact:
Feature | Common Variants (>1%) | Rare Variants (<1%) |
---|---|---|
Age | Ancient (thousands-millions of years) | Recent (hundreds-thousands of years) |
Selection | Survived selection → usually benign | May be deleterious, not yet eliminated |
Clinical | Complex trait risk factors | Often cause Mendelian diseases |
Example | ApoE4 (~15% frequency, Alzheimer’s risk) | Family-specific BRCA1 nonsense mutations |
De novo variants are mutations that occurred for the first time in you—not present in either parent.
Where they arise:
How many: ~50-200 per person, depending on sequencing technology and what’s counted
1. Source of new genetic variation
All diversity started as de novo mutations. Evolution in action.
2. Cause genetic diseases
Many severe developmental disorders:
3. Reveal DNA replication biology
Studying where and when mutations occur teaches us about replication errors and repair mechanisms.
Key finding: Father’s age at conception is the dominant factor in de novo mutation rate.
Why males contribute more mutations:
Female (Oogenesis) | Male (Spermatogenesis) |
---|---|
All eggs formed before birth | Sperm production starts at puberty |
Cells arrested until ovulation | Continuous production throughout life |
~23 cell divisions total | Divisions every ~16 days |
Minimal age effect | Strong age-dependent effect |
The math:
Each division = opportunity for replication errors.
Direct mutation counting in 78 Icelandic families revealed fundamental insights into human mutation rates (Kong et al. 2012, Nature).
Core findings:
Mutation rate: 1.20 × 10⁻⁸ per base per generation
Paternal age effect: +2 mutations per year of father’s age
{@html `Father age 20: ~55 mutations
Father age 30: ~75 mutations
Father age 40: ~95 mutations
Father age 50: ~115 mutations
`}
Parent-of-origin ratio: 4:1 (father:mother)
Mutation spectrum:
Disease associations:
A more comprehensive study from the same Icelandic research group analyzed 1,548 parent-child trios (20-fold larger than Kong 2012), identifying 108,778 de novo mutations (Jónsson et al. 2017, Nature). This larger dataset confirmed the paternal age effect and revealed important new findings about maternal contributions.
Updated findings:
Paternal mutations: +1.51 per year of father’s age (95% CI 1.45–1.57)
Maternal mutations: +0.37 per year of mother’s age (95% CI 0.32–0.43)
Mutation spectrum changes with maternal age:
Regional variation:
Parent-of-origin ratio: 75-81% paternal (3-4:1 ratio, consistent with Kong 2012)
For a visual comparison of paternal vs. maternal age effects across all mutation types, see Figure 1e in Jónsson et al. (2017).
Key implications:
The discovery of maternal age effects and regional mutation hotspots fundamentally changed our understanding of human mutation. While paternal age remains the dominant factor (due to continuous cell divisions in spermatogenesis), maternal age also contributes—particularly for certain mutation types and in specific genomic regions. The regional C>G enrichment suggests distinctive mutational processes in aging oocytes, possibly related to double-strand break repair mechanisms.
Recent study measured the most accurate rate of de novo variants (including all variant types) using long read sequencing technologies (Porubsky et al. 2025, Nature).
New findings:
Higher total count: 98-206 mutations/child
75-81% paternal contribution (confirmed with more precision)
16% postzygotic mutations (arose after fertilization)
Mutation hotspots: 32 specific regions with recurrent mutations
No recombination-mutation link (independent processes)
Figure: De novo mutations in CEPH 1463 pedigree. The figure shows: (a) Number of de novo germline mutations, postzygotic mutations (PZMs), and indels in G2 parents and G3 children, (b) Allele balance comparison showing germline SNVs cluster near 0.50 while postzygotic SNVs show lower allele balance (<0.25), (c) Strong paternal age effect for germline de novo SNVs (+1.55 DNMs per year) but not for PZMs, (d) Estimated SNV DNM rates by genomic region, showing significant excess in repetitive regions including centromeres and segmental duplications. Source: Porubsky, D. et al. (2025). Human de novo mutation rates from a four-generation pedigree reference. Nature. https://doi.org/10.1038/s41586-025-08922-2. License: CC-BY 4.0.
Paternal age considerations:
Father’s Age | Additional Risk |
---|---|
<30 | Baseline mutation rate |
30-40 | +20 mutations (~35% increase) |
40-50 | +40 mutations (~70% increase) |
>50 | +60+ mutations (>100% increase) |
Most mutations are harmless, but risk of specific conditions increases in certain disorders.
When a child has a disorder not present in parents:
1. Look for de novo mutations:
{@html `Sequence trio (child + both parents)
↓
Identify variants in child absent from both parents
↓
Prioritize: LoF or damaging missense in disease genes
↓
Validate: Check for mosaicism in parents
`}
2. Consider sequencing technology:
3. Check for mosaicism:
Standard workflow:
{@html `Sequence:
- Proband (affected child)
- Mother
- Father
Compare:
- Variants in child
- Variants in parents
De novo candidates:
- Present in child
- Absent in BOTH parents
- Pass quality filters
`}
Challenge: Sequencing errors can look like de novo mutations.
Filters applied:
Filter | Purpose | Typical Threshold |
---|---|---|
Read depth | Ensure reliable calls | >20× coverage |
Allele balance | Check heterozygous ratio | 30-70% variant reads |
Mapping quality | Avoid mismapped reads | MAPQ >30 |
Parental absence | Confirm not inherited | 0 variant reads in parents |
Validation | Confirm by Sanger or PCR | For clinical variants |
False positive sources:
Short-term (per generation):
Long-term (over millennia):
Today’s polymorphisms = Yesterday’s de novo mutations
The common SNPs distinguishing populations were once de novo mutations in individuals thousands of years ago.
As average paternal age increases:
{@html `1900s Iceland: Average father age = 34.9 years
1980s Iceland: Average father age = 27.9 years
2010s Iceland: Average father age = 33.0 years
`}
Consequence: Population-wide mutation rate fluctuates with demographics.
Debate: Does delayed reproduction increase disease burden?
Feature | Inherited Variants | De Novo Variants |
---|---|---|
Frequency | ~4-5 million per genome | ~50-200 per genome |
Source | Parents’ genomes | New mutations during reproduction |
Age | Can be ancient (millions of years) | Brand new (this generation) |
Detection | Standard variant calling | Trio sequencing required |
Inheritance | Follow Mendelian segregation | Not inherited (absent in parents) |
Clinical | Common → complex traits Rare → Mendelian diseases |
Severe developmental disorders Dominant de novo conditions |
Selection | Already filtered by selection | Haven’t faced selection yet |
Two sources of variants: Inherited (from parents, following Mendel’s laws) and de novo (new mutations in you)
Paternal age effect dominates: Father’s age is the main determinant of de novo mutation rate (~1.5 mutations/year), though maternal age also contributes (~0.4 mutations/year)
Males contribute more: 75-81% of mutations due to continuous sperm production with repeated cell divisions
Maternal mutations have unique patterns: Mutation spectrum changes with maternal age, with regional hotspots showing up to 50-fold enrichment
Technology matters: Long-read sequencing reveals mutations in repetitive DNA invisible to short reads, nearly doubling the total mutation count
Clinical importance: De novo mutations cause many severe developmental disorders; trio sequencing identifies them
Evolutionary engine: All genetic diversity started as de novo mutations; today’s polymorphisms were once new mutations
We’ve focused on de novo variants in individuals, but these mutations are also the raw material for evolution. Over long timescales, individual mutations accumulate to create species differences and even entirely new genes.
Scale | Timeframe | Example |
---|---|---|
Individual | This generation | Your ~50-200 de novo variants |
Population | 100-10,000 generations | ABO blood types, lactose tolerance |
Species | 100,000+ generations | Human-chimp fixed differences (~35M SNVs) |
New genes | 1-10 million years | Gene duplications, de novo gene birth |
1. Accumulation of point mutations
The same replication errors causing your de novo variants, accumulated over millions of years, create species boundaries (Montañés et al. 2023, Molecular Biology and Evolution). Humans and chimpanzees differ by ~35 million single-nucleotide changes, each starting as a de novo mutation in some ancestor.
2. Gene duplication creating new functions
Genes can be duplicated by unequal crossing-over or retrotransposition. Usually the duplicate copy degrades, but occasionally it evolves a new function. Classic example: the globin gene family (myoglobin, α-globin, β-globin, fetal γ-globin) all arose from ancient duplications of a single ancestral gene. Each duplication allowed functional specialization—fetal hemoglobin has higher oxygen affinity for placental transfer, while adult hemoglobin optimizes oxygen delivery to tissues.
3. De novo gene birth from non-coding DNA
Entirely new genes can emerge from previously non-coding sequences. Recent comparative genomics studies show many species-specific genes arose this way, though most (~80%) are rapidly lost. Survivors start small and gradually acquire function under selection (Montañés et al., 2023, Molecular Biology and Evolution).
The FOXP2 gene illustrates how point mutations can drive major phenotypic changes. Discovered in a British family with severe speech disorder, FOXP2 encodes a transcription factor critical for speech motor control (Atkinson et al. 2018, Cell).
Human FOXP2 differs from chimpanzee by only two amino acids (at positions 303 and 325). Originally thought to be a recent selective sweep enabling modern human language, genomic studies revealed these changes were already present in Neanderthals, indicating an older origin (>300,000 years ago). While no evidence exists for recent selection, these ancient changes likely contributed to vocal motor learning in the human lineage (Atkinson et al., 2018, Cell).
Recent work identified additional speech-related genes (CHD3, SETD1A, WDR5) through whole-genome sequencing of children with speech disorders, revealing that speech acquisition involves a regulatory network of genes co-expressed in the developing brain.
Your de novo mutations:
Your ancestors’ de novo mutations:
The DNA replication errors adding ~1.5 mutations per year of your father’s age (and ~0.4 per year of your mother’s age) are the same molecular process that, over evolutionary time, created all genetic diversity—from individual differences to the origin of new species.
You now understand where variants come from—inherited or arising de novo—and how they connect to evolution across all timescales. The next chapters explore:
Chapter 10—Dominant Inheritance: Why does one mutant copy sometimes cause disease? You’ll see three molecular mechanisms explaining dominance.
Chapter 11—Haploinsufficiency: Why are some genes sensitive to losing one copy? Population-scale metrics predict gene constraint.
Chapter 12—Recessive Inheritance: What happens when both copies are mutated? How does family structure reveal recessive alleles?
Understanding transmission is the foundation. Now we examine the patterns those transmitted variants create—the dominant and recessive inheritance that Mendel observed, now explained at the molecular level.