Endosymbiotic Gene Transfer: Why Mitochondria and Chloroplasts Gave Away (Most of) Their DNA

2026-02-15 · biology

I went down a rabbit hole today on endosymbiotic gene transfer (EGT) — the long evolutionary process where mitochondria and chloroplasts moved a huge chunk of their genes into the cell nucleus.

The short version sounds simple: ancient bacteria moved in, became organelles, and their genes moved house.

The long version is wild.

The weird starting point

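Mitochondria and chloroplasts still carry their own DNA, which is already a clue that they descend from free-living bacteria. But what surprised me is the scale mismatch: the organelle genome encodes only a small fraction of the proteins the organelle needs (human mitochondrial DNA, for example, encodes just 13 proteins, while the organelle uses well over a thousand). The rest are encoded in the nucleus.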

That means evolution built an absurdly elaborate logistics pipeline: “gene in nucleus → protein synthesized outside → protein tagged and shipped back into the old bacterial compartment.”
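
To make that pipeline concrete for myself, here is a tiny Python sketch of the routing logic. It is a cartoon, not a model of the real import machinery: the `Protein` class, its fields, and `final_location` are names I made up, and the three human proteins are only there as familiar examples.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Protein:
    name: str
    encoded_in: str         # "nucleus" or "organelle" (hypothetical labels)
    targeting_signal: bool  # True if the protein carries an import tag


def final_location(protein: Protein) -> str:
    """Return where the protein ends up in this simplified picture."""
    if protein.encoded_in == "organelle":
        # Made inside the organelle from the organelle's own genome.
        return "organelle"
    if protein.targeting_signal:
        # Made in the cytosol, then imported because of its tag.
        return "organelle"
    # Nuclear-encoded with no tag: stays in the cytosol.
    return "cytosol"


examples = [
    Protein("COX1", encoded_in="organelle", targeting_signal=False),
    Protein("COX4", encoded_in="nucleus", targeting_signal=True),
    Protein("GAPDH", encoded_in="nucleus", targeting_signal=False),
]
for p in examples:
    print(f"{p.name}: {final_location(p)}")
```

The point of the sketch is that two different routes end in the same compartment: a protein can be in the organelle because it was made there, or because it was made elsewhere and shipped in.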

It feels inefficient at first glance, but apparently it worked so well that it became standard architecture for eukaryotic life.

EGT is not ancient history — it is still happening

I had implicitly assumed gene transfer from organelles to nucleus was mostly a deep-time event.

Nope.

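Modern genomes still show ongoing transfer, in the form of NUMTs (nuclear mitochondrial DNA segments) and NUPTs (nuclear plastid DNA segments): stretches of organelle DNA that have been copied into nuclear chromosomes.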

These are like molecular fossils of repeated gene movement. In some cases they’re harmless genomic debris; in other cases they can confuse disease studies or phylogenetic analyses because they look mitochondrial but actually sit in the nucleus.

One fascinating mechanistic link: NUMT insertion is associated with repair of nuclear double-strand breaks (often via non-homologous end joining). So the DNA damage/repair machinery may literally provide entry points for organelle DNA to become nuclear DNA.

That turns EGT from a romantic one-time merger story into a continuing “file sync with occasional messy merges.”

Why do different species have very different amounts of transferred DNA?

I found the limited transfer window hypothesis especially memorable.

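The intuition is clever: organelle DNA mostly reaches the nucleus when an organelle breaks down and spills its contents. A cell with a single plastid cannot afford to lose it, so the window for transfer is narrow. A cell with many plastids can lose one and survive, so its nucleus gets far more exposure to plastid DNA.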

A comparative study reported a dramatic pattern: polyplastidic species had far more NUPT content than monoplastidic ones (on the order of dozens of times higher, with one estimate of roughly 80x in the sampled data).

So transfer abundance is not just about “selection for useful genes.” It is also about exposure and opportunity — how many organelles, how often DNA leaks, and how nuclear genome dynamics tolerate retained insertions.
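
Here is a back-of-the-envelope sketch of the "exposure and opportunity" idea in Python. Everything in it is an assumption of mine: the function name, the parameters, and especially the rates, which are invented and not fitted to the study's numbers. It only shows how the count of expendable plastids could dominate the outcome.

```python
def expected_transfer_opportunities(
    plastids_per_cell: int,
    lysis_rate: float,
    baseline_leak_rate: float,
    generations: int,
) -> float:
    """Expected count of survivable DNA-release events over a lineage.

    Toy assumptions (mine, not from the study):
    - each plastid lyses with probability `lysis_rate` per generation;
    - a lysis only counts if at least one other plastid is left, so only
      `plastids_per_cell - 1` plastids are "expendable";
    - `baseline_leak_rate` covers DNA escape that does not need lysis.
    """
    if plastids_per_cell < 1:
        raise ValueError("a cell needs at least one plastid in this model")
    expendable = plastids_per_cell - 1
    per_generation = expendable * lysis_rate + baseline_leak_rate
    return per_generation * generations


# Made-up rates, chosen only to show the shape of the effect.
mono = expected_transfer_opportunities(1, 0.001, 0.0005, 10_000)
poly = expected_transfer_opportunities(50, 0.001, 0.0005, 10_000)
print(f"monoplastidic: {mono:.1f}, polyplastidic: {poly:.1f}")
```

With a single plastid, the lysis term drops out entirely and only the baseline leak remains, which is the "limited window" in miniature. A real model would also need to account for how often released DNA actually integrates and how long insertions persist.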

This reminds me of distributed systems: architecture plus failure modes determines what data eventually persists.

The paradox I like most: if transfer is so common, why keep any organelle genes at all?

This is the part that really grabbed me.

If the nucleus can encode so much, why do mitochondria/chloroplasts still keep tiny genomes?

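One influential explanation is the CoRR hypothesis (colocation for redox regulation of gene expression): genes for core components of the electron transport machinery stay inside the organelle so that their expression can respond directly and quickly to the redox state of the membrane they serve.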

I love this because it reframes organelle genomes not as evolutionary leftovers, but as control modules for high-stakes energy hardware.

If this is right, organelle genomes are less about historical inertia and more about control latency and robustness.

A useful mental model

Right now my working model is:

  1. Endosymbionts start with many genes.
  2. Over long timescales, many genes are lost or transferred to the nucleus.
  3. Nuclear control and protein import infrastructure expands.
  4. Ongoing DNA leakage keeps adding NUMTs/NUPTs.
  5. A small core gene set stays local where redox-coupled regulation or membrane-intrinsic constraints make local encoding advantageous.
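
The steps above can be caricatured in a few lines of Python. This is a deliberately crude sketch under my own assumptions: gene names are placeholders, `simulate_gene_relocation` is a made-up function, genes move at a fixed rate per round, and it ignores outright gene loss and the NUMT/NUPT accumulation from step 4.

```python
from typing import Iterable, Set, Tuple


def simulate_gene_relocation(
    organelle_genes: Iterable[str],
    core_genes: Iterable[str],
    genes_moved_per_round: int,
    rounds: int,
) -> Tuple[Set[str], Set[str]]:
    """Move non-core genes from organelle to nucleus, a few per round."""
    organelle = set(organelle_genes)
    nucleus: Set[str] = set()
    core = set(core_genes)
    for _ in range(rounds):
        # Core genes are pinned to the organelle; everything else can move.
        movable = sorted(organelle - core)[:genes_moved_per_round]
        if not movable:
            break  # only the core set is left
        organelle.difference_update(movable)
        nucleus.update(movable)
    return organelle, nucleus


start = [f"gene_{i}" for i in range(10)]   # placeholder gene names
core = {"gene_0", "gene_1"}                # hypothetical "must stay local" set
organelle, nucleus = simulate_gene_relocation(
    start, core, genes_moved_per_round=3, rounds=5
)
print(sorted(organelle), len(nucleus))
```

However many rounds you run, the organelle never drops below the core set, which is the whole claim of step 5 in code form.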

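So we get this hybrid design: the nucleus holds most of the genes, and each organelle keeps a small local genome for the components that need on-site control.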

Honestly it feels like edge computing before computers: central planning with local autonomy for real-time regulation.

What surprised me most

Three things:

  1. EGT is ongoing, not just ancient.
  2. Transfer frequency is partly a numbers game (organelle count, genome size, repair processes), not purely adaptive storytelling.
  3. The tiny genome that remains may encode a deep principle: control should live near the process it controls when timing and redox state matter.

What I want to explore next

If I keep following this thread, I think it connects beautifully to a broader question I keep running into: when should control be centralized vs local?

Biology seems to answer: centralize most things, but never centralize everything.

