The names given to new coronavirus variants and bacteria can be difficult to use or understand. Using a pre-generated list of names would be better, says Mark Pallen
3 March 2021
SOME names for microbes, like Salmonella, trip off the tongue. Others, like Myxococcus llanfairpwllgwyngyllgogerychwyrndrobwllllantysiliogogogochensis, aren’t so easy to say. Labels for coronavirus variants fall somewhere between these two extremes, with code names like 20I/501Y.V1, B.1.429 or CAL.20C that allow coronavirus researchers to talk to peers, but leave the rest of us tongue-tied. As a result, most people find it easier to use geographical names, like the South African variant or the Kent variant, which may not be accurate and which unfairly place blame on the people in those locations. Can we do better?
For naming living things, we tend to use a system created in the 18th century by Swedish naturalist Carl Linnaeus, which gives each species a two-word Latin name. Examples include the name Homo sapiens for humans and Escherichia coli for a common gut bacterium. Use of a dead language nicely brings neutrality and gravitas to the process.
But, despite the millions of bacterial species out there, so far only around 20,000 have been given Latin names. This deficit is getting worse thanks to the boom in DNA sequencing, which has revealed thousands of new species in the human gut alone.
Faced with the flood of new species, most microbiologists don’t have the time to come up with well-formed Latin names. Instead, they use alphanumerical placeholders such as UBA6965 or sp000063525 that are just as bad as coronavirus code names in terms of usability.
So, how are we to cope with the need for new names, whether for bacteria or viruses? Perhaps we can learn from the way storms are named. Authorities agree on a set of arbitrary forenames each year, like Francis, which are dished out in alphabetical order for each new storm.
Last year, working with a nomenclature expert and a programmer, I hit on a similar idea for bacteria. Instead of naming each organism as it is discovered, we could create a bank of names in advance and make that available to microbiologists who discover microbes.
New names are conventionally handcrafted, but to generate enough of them we could automate the process, using a computer program to combine word roots from Latin and Ancient Greek into linguistically correct names.
Names for bacterial genera are typically built from two or three such roots strung together. So, in naming a bacterium found in, say, chicken faeces, one might string together roots for “chicken-faeces-microbe” to create Cottocaccomonas.
A breakthrough came when I realised that one could apply a combinatorial approach to this problem, so 10 terms for “chicken or bird”, 10 terms for “gut or faeces” and 10 terms for “microbe” used in all possible combinations could generate a thousand new names for microbes. Raiding Latin and Greek dictionaries to feed our program paid off handsomely with the creation of over a million new names for bacteria.
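To make the arithmetic concrete, here is a minimal sketch of the combinatorial idea in Python. It is not the program described above: the function name `generate_names` and the three short root lists are assumptions chosen for illustration, with the first entry in each list taken from the Cottocaccomonas example, and simple concatenation stands in for whatever linguistic rules a real generator would need to apply.

```python
from itertools import product


def generate_names(first_roots, middle_roots, final_roots):
    """Join one root from each list, in every possible combination."""
    names = []
    for first, middle, last in product(first_roots, middle_roots, final_roots):
        # Genus names are capitalised; plain concatenation is a simplification.
        names.append((first + middle + last).capitalize())
    return names


# Illustrative root lists (assumed for this sketch, not the real dictionaries).
bird_roots = ["cotto", "gallo", "ornitho"]        # "chicken or bird"
gut_roots = ["cacco", "copro", "entero"]          # "gut or faeces"
microbe_roots = ["monas", "bacter", "coccus"]     # "microbe"

names = generate_names(bird_roots, gut_roots, microbe_roots)
print(len(names))   # 3 x 3 x 3 = 27 candidate names
print(names[0])     # Cottocaccomonas
```

With three roots in each list the sketch yields 27 names; with 10 in each it would yield the thousand mentioned above, and the total keeps multiplying as more roots are added to any one list.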
Which brings us back to the challenge of naming coronavirus variants. Here, I think we can follow the storm-naming approach even more closely and create a bank of names that say nothing about the properties of the variant and are unambiguous and easy to use.
To add gravitas, neutrality and an air of familiarity, one could raid the classical world for personal names of mythological or historical characters and then mix and match components to create an abundance of options. I am not sure whether that is going to fly with all the relevant stakeholders. But one thing is certain: we need to find a way around this problem as soon as we can and put an end to the blame game.