Supplementary Datasets for Amino acid and codon usage explain amino acid misincorporation rates across the tree of life

Protein translation is an error-prone process resulting in a random population of altered protein sequences in every cell. Here, we analyzed thousands of publicly available mass spectrometry datasets to detect amino acid misincorporations and quantify error rates in 14 model organisms. We find that overall error rates and the patterns of codon-to-amino-acid error rates correlate across species. We estimate that on average 0.5-3% of protein molecules in a cell harbor a misincorporation, whereas this proportion can reach 10% for very long proteins. Highly expressed and very long proteins have lower error rates, indicating evolutionary selection on codon usage to reduce the cost of translation errors. While both codon-anticodon mispairing and tRNA mischarging contribute to misincorporations, we estimate that ~70% of misincorporation events are due to mispairing. The more frequent an amino acid is in the proteome, the more likely it is to be misincorporated (r = 0.53), likely because frequent amino acids are abundant in the cell, increasing the rate of mischarging, and have abundant tRNAs, leading to increased mispairing. Overall, we find that amino acid and codon usage explain error rates. The conserved patterns of amino acid misincorporations from bacteria to humans suggest universal mechanisms driving translational fidelity.

Identifier
DOI https://doi.org/10.17617/3.WCUCRH
Metadata Access https://edmond.mpg.de/api/datasets/export?exporter=dataverse_json&persistentId=doi:10.17617/3.WCUCRH
Provenance
Creator Toth-Petroczy, Agnes
Publisher Edmond
Publication Year 2026
Funding Reference Max Planck Gesellschaft
OpenAccess true
Contact tothpet(at)mpi-cbg.de
Representation
Language English
Resource Type Dataset
Version 1
Discipline Other
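
Example: retrieving the metadata record
The Metadata Access URL listed under Identifier can be queried programmatically. The following is a minimal sketch using only the Python standard library. The function names export_url and fetch_metadata are hypothetical helpers introduced here for illustration, and the layout of the returned JSON is not described in this record, so the sketch does not assume any particular keys.

```python
import json
import urllib.parse
import urllib.request

# Endpoint and persistent identifier as listed in this record.
BASE = "https://edmond.mpg.de/api/datasets/export"
DOI = "doi:10.17617/3.WCUCRH"


def export_url(persistent_id: str = DOI, exporter: str = "dataverse_json") -> str:
    """Build the metadata export URL (hypothetical helper)."""
    # Keep ':' and '/' unescaped so the URL matches the form shown in the record.
    query = urllib.parse.urlencode(
        {"exporter": exporter, "persistentId": persistent_id}, safe=":/"
    )
    return f"{BASE}?{query}"


def fetch_metadata(url: str, timeout: float = 30.0) -> dict:
    """Download and parse the JSON metadata (hypothetical helper).

    Requires network access; the structure of the result is not assumed here.
    """
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)
```

Calling fetch_metadata(export_url()) should return the parsed record as a Python dictionary, provided the server is reachable. Inspecting its top-level keys first is a reasonable way to find the file listing and citation fields.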