New Evidence Suggests We May Have Been Wrong About The Origin of Life.
Look back roughly four billion years to a turbulent early Earth, a stormy world of intense chemical activity where the first traces of life were beginning to emerge. For decades, scientists envisioned the genetic code, the remarkable system that translates a four-letter nucleotide alphabet (A, C, G, and T in DNA, with U replacing T in RNA) into the 20 amino acids essential for all living cells, as evolving in a gradual, orderly sequence: simpler amino acids appeared first, followed by more complex ones, with the large, aromatic amino acid tryptophan joining last because of its structural complexity and its rarity in prebiotic chemistry simulations.
A study published in December 2024 in Proceedings of the National Academy of Sciences (PNAS) challenges this conventional view. Researchers analyzed ancient “molecular fossils” embedded in the proteins of LUCA (the Last Universal Common Ancestor), the hypothetical single-celled progenitor from which bacteria, archaea, and eukaryotes (including all complex life) descend. Their findings paint a picture of a far more chaotic and competitive origin for the genetic code.
The team, led by doctoral student Sawsan Wehbi from the University of Arizona and senior author Joanna Masel, focused on short, conserved protein domains: reusable functional modules that predate modern full-length proteins and serve as evolutionary time capsules. By reconstructing ancestral amino acid usage in these domains (dating back to or before LUCA), they inferred the order in which amino acids were recruited into the code more directly than earlier, indirect methods could.

The results upend the traditional timeline. Smaller, simpler amino acids were indeed incorporated early, aligning with some expectations. However, when controlling for molecular size, the long-standing “consensus” order (based on metrics like abiotic synthesis experiments) loses much of its explanatory power.
Most strikingly, ultra-ancient duplicated and diverged sequences show significant enrichment in amino acids with aromatic rings: tryptophan (W), tyrosine (Y), phenylalanine (F), and histidine (H). Tryptophan, previously considered the “last” amino acid to enter the code, appears more frequently in these pre-LUCA-like proteins than in LUCA’s average proteome.


For example, in one ancient domain (related to a tRNA synthetase ancestor, PF00133), tryptophan frequency is 3.1% higher than typical LUCA levels, with conserved positions suggesting that these structures existed before modern, precise charging enzymes. This implies that early life may have used incomplete, alternative, or even noncanonical codes, possibly in a “peptide world” of short chains experimenting freely before the 20-amino-acid standard became fixed.
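To make the idea of "enrichment" concrete, the comparison can be sketched as a ratio of amino acid frequencies between two sets of sequences. This is only a minimal illustration under stated assumptions: the sequences below are made up, the function names are hypothetical, and the published analysis relies on reconstructed ancestral sequences and statistical tests across many domain families, not on a raw count like this one.

```python
from collections import Counter

# Residues with aromatic rings, as listed in the article: W, Y, F, H.
AROMATIC = set("WYFH")

def residue_frequencies(sequences):
    """Return the fraction of each amino acid across a list of sequences."""
    counts = Counter()
    for seq in sequences:
        counts.update(seq.upper())
    total = sum(counts.values())
    return {aa: n / total for aa, n in counts.items()}

def enrichment(freq_a, freq_b, residues):
    """Summed frequency of `residues` in set A divided by that in set B."""
    a = sum(freq_a.get(aa, 0.0) for aa in residues)
    b = sum(freq_b.get(aa, 0.0) for aa in residues)
    return a / b

# Toy sequences invented for illustration only (not real protein data).
pre_luca_like = ["MWHYFGAW", "GWYHAFLV"]
luca_like = ["MGAVLSDE", "GAVWLSTK"]

pre = residue_frequencies(pre_luca_like)
post = residue_frequencies(luca_like)
ratio = enrichment(pre, post, AROMATIC)
print(f"Aromatic enrichment (toy data): {ratio:.1f}x")
```

A ratio above 1 means the aromatic residues are more common in the first set than in the second. Establishing that such a difference is statistically significant takes considerably more than this sketch shows.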

The emerging view is one of intense competition among multiple proto-codes in diverse micro-environments (e.g., sulfur-rich alkaline hydrothermal vents, which may have favored aromatics for UV protection or structural roles). Through horizontal gene transfer and selection, one robust code prevailed, absorbing useful innovations while eliminating rivals. Our universal genetic code is thus not a linear, inevitable outcome but the lone survivor of an evolutionary contest.

The findings have exciting implications for astrobiology. Early reliance on sulfur-containing amino acids (cysteine and methionine) points to hydrogen-sulfide-rich environments, conditions that could exist today in subsurface oceans on moons like Enceladus or Europa, or that may once have existed on ancient Mars. If aromatic-rich building blocks thrived in vent-like settings, similar chemistry might be occurring in extraterrestrial oceans, potentially detectable via sulfur isotopes or aromatic biosignatures.
By prioritizing ancient protein domains over whole genomes or prebiotic simulations, the study avoids longstanding biases in earlier models. Its statistical comparisons across hundreds of domain families offer not just a revised recruitment order but glimpses into the wild, experimental phase before the code stabilized.
Ultimately, this work is more than a timeline adjustment; it’s a reminder that life’s history includes brilliant experiments and forgotten variants. The aromatic “rings” that were unusually common in pre-LUCA proteins hint at paths not taken on Earth, and perhaps still unfolding on distant worlds.
References
- Wehbi, S., et al. (2024). Order of amino acid recruitment into the genetic code resolved by last universal common ancestor’s protein domains. Proceedings of the National Academy of Sciences, 121(52), e2410311121.
- Miller, S. L. (1953). A production of amino acids under possible primitive earth conditions. Science, 117(3046), 528–529.
- Trifonov, E. N. (2000). Consensus temporal order of amino acids and evolution of the triplet code. Gene, 261(1), 139–151. (Referenced as the prior consensus benchmark in the PNAS study.)
- Moody, E. R., et al. (2024). The nature of the last universal common ancestor and its impact on the early Earth system. Nature Ecology & Evolution. (Related LUCA domain work cross-referenced in the primary study.)



