Twenty-Nine Proteins to Disrupt the World

games that covid plays

The covid genome is a hundred thousand times smaller than the human genome and encodes a measly twenty-nine proteins, compared to something closer to a million in humans.
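As a rough sanity check on that ratio, here is a back-of-envelope calculation. The two figures are assumptions brought in for illustration, not numbers from the text above: about 29,903 nucleotides for the SARS-CoV-2 reference genome and roughly 3.1 billion base pairs for one copy of the human genome.

```python
# Approximate genome sizes (assumed round figures, for illustration only).
COVID_GENOME_NT = 29_903          # SARS-CoV-2 reference genome, nucleotides
HUMAN_GENOME_BP = 3_100_000_000   # one haploid copy of the human genome, base pairs

# How many times larger the human genome is than the viral one.
size_ratio = HUMAN_GENOME_BP / COVID_GENOME_NT

print(f"The human genome is roughly {size_ratio:,.0f} times larger")
```

Under those assumptions the ratio lands close to the "hundred thousand times" quoted above.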

Despite this, covid has created a pretty significant inconvenience... to its great misfortune. If there's one thing a species doesn't want, it's for humanity to get annoyed with it: we have a tendency to wipe out species we like, never mind the ones we don't.

Furthermore, we're going to learn a vast amount in the process of wiping out covid, because how viruses work is an instruction manual for what is required to operate the levers of the machinery of life.

Of the twenty-nine proteins covid makes, just four are "structural": two make up the outer membrane of the virus, another forms the armature its single strand of RNA is wound around, and the final one makes up the infective spike it uses to enter our cells. Most of the others are involved in getting the virus to replicate properly in the cell, and a handful of "accessory proteins" seem to be responsible for avoiding the host's immune response. For example, in its pre-infective state the virus is coated with chains of simple sugars that hide it from the immune system, and it is likely the accessory proteins have something to do with this.

Most of these non-structural proteins are created in an interesting way: rather than being produced individually, they are generated as two huge "poly-proteins" that are then cleaved into smaller pieces by a protease (pronounced PRO-tee-aze, not pro-tease, which, uh, sounds like a different kind of biology entirely…). This protease enzyme is also encoded by the viral genome. An "-ase" ending in biology marks an enzyme, usually named for the thing it acts on, and many of them take things apart, so a prote-ase is literally a "protein-breaker".
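The cleavage step can be sketched as a toy string operation. Everything here is a simplification for illustration: the sequence is made up, the function name `cleave_polyprotein` is hypothetical, and the rule used (cut after a leucine-glutamine pair, "LQ", when the next residue is S, A, or G) is only a rough stand-in for what the real viral protease recognizes.

```python
import re

# Toy cleavage rule (an assumption, not the real enzyme's full specificity):
# cut immediately after "LQ" when the next residue is S, A, or G.
CLEAVAGE_SITE = re.compile(r"(?<=LQ)(?=[SAG])")


def cleave_polyprotein(polyprotein: str) -> list[str]:
    """Split a one-letter amino-acid string at every toy cleavage site."""
    return CLEAVAGE_SITE.split(polyprotein)


# A made-up "poly-protein" containing three cleavage sites.
toy_polyprotein = "MKTLQSAVFFLQGPWWLQAEND"
fragments = cleave_polyprotein(toy_polyprotein)

for fragment in fragments:
    print(fragment)
```

One long chain goes in and several shorter ones come out, with no residues lost: joining the fragments back together reproduces the original string.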

It used to be thought that the genome was a simple one-to-one template for proteins, and we'd have as many genes as there were proteins. Instead it turns out that processes like this, involving "post-translational modification" of a gene product, are more common than not. As well as cleavage, post-translational modification can include changing the way a protein is folded. In other cases it involves sticking smaller proteins together.

Building things up out of simpler units is one of the most basic tricks that evolution has. Maybe there were at one time organisms that took a "holistic" approach to biochemistry where complex proteins weren't built out of simpler peptides that were built out of even simpler amino acids, but instead every biologically active molecule was its own thing, with no obvious internal boundaries between repeated units of the same kind.

Maybe we'll even encounter life-systems based on such a scheme when we explore other worlds around other stars. Such a biology would make organisms highly resistant to parasites and pathogens and even predators: covid and other viruses can infect us because we all share the same basic building blocks and the same genetic machinery. In a holistic ecosystem organisms could eat each other for the energy content of their carbohydrates, but would have to make all their own proteins from scratch. Parasitism would be impossible, which would make evolution radically different.

Unfortunately, we don't live in that world, so a lousy twenty-nine proteins is enough machinery to wreak havoc on our cells.

Once assembled, any protein or enzyme operates by a mix of geometry and chemistry: for one protein to bind to another, they have to have complementary chemically active sites with mirrored geometries. Think of the active sites as being on the tips of your fingers, with left and right complementing each other chemically. If you hold both hands with the fingers making the same pattern, you can bring them together so that each tip touches its mate on the other hand. Success! You have created a bound protein! But if you hold them in different patterns, they don't mate, at least not fully, and whatever chemistry was supposed to happen, mostly doesn't.
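The finger analogy can be turned into a toy model. This is only a cartoon, under loud assumptions: each "fingertip" is a position in a string, its chemistry is reduced to a "+" or "-" sign, and two tips mate only when they sit at the same position and carry opposite signs. The function name `binding_fraction` is hypothetical.

```python
def binding_fraction(sites_a: str, sites_b: str) -> float:
    """Fraction of aligned sites that complement each other.

    Each string holds one '+' or '-' per active site. Position stands in
    for geometry and the sign stands in for chemistry (a toy assumption).
    """
    if len(sites_a) != len(sites_b):
        raise ValueError("both proteins need the same number of sites")
    mated = sum(1 for a, b in zip(sites_a, sites_b) if a != b)
    return mated / len(sites_a)


left_hand = "+-+-"

full_match = binding_fraction(left_hand, "-+-+")     # every tip meets its mate
partial_match = binding_fraction(left_hand, "-+++")  # one tip is mismatched
no_match = binding_fraction(left_hand, "+-+-")       # same signs everywhere

print(full_match, partial_match, no_match)
```

A fraction of 1.0 corresponds to the hands meeting tip to tip; anything lower is the "they don't mate, at least not fully" case.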

In the case of covid (technically SARS-CoV-2), the spike protein has a component (called S1) that complements the active sites on the ACE2 protein that is part of the outer membrane of many cells, particularly in the lungs, intestine, heart, and kidneys. ACE2 is an important component of the system that regulates blood pressure (if you have high blood pressure you may be familiar with "ACE inhibitors", which help reduce it by acting on the related enzyme ACE rather than on ACE2 itself).

One of the basic problems for viruses is that active sites on proteins tend to be, well, active. Since a good part of the virus life cycle involves floating around in the world, there is a significant risk their active sites might end up binding with random junk in the environment, which would inactivate them. To avoid this, covid protects the active site on the spike protein by keeping it folded out of sight until it is needed. The folded spike protein is held in position by two stretches of its own amino-acid chain that have to be cut, and the cutting is done by human proteases. This ensures the active site doesn't become active until it's in the presence of a host.

Once it binds with ACE2, another part of the spike protein comes into play and fuses the viral membrane with the cell membrane, creating an opening that allows the viral RNA to pass into the interior of the cell and start its unpleasant work.

The details of how the viral RNA reprograms the cell to make copies of the virus are still being worked out, but the amount of research money and intellectual focus being thrown at this problem is such that progress is being made very rapidly.

The viral genome and the twenty-nine proteins it's responsible for creating are becoming a laboratory to understand important aspects of cellular function, pointing the way toward what works and what doesn't.

Between mRNA vaccine research and work on the virus itself, we're on the threshold of a new and powerful understanding of the most basic mechanisms of life.


Sources:

Early report on spike protein mapping: https://www.livescience.com/coronavirus-spike-protein-structure.html

Useful high-level breakdown on various covid proteins and their actions: https://cen.acs.org/biological-chemistry/infectious-disease/know-novel-coronaviruss-29-proteins/98/web/2020/04

Lots of detail on covid spike protein structure and behaviour: https://www.nature.com/articles/s41594-020-0468-7

Excellent technical review of covid infection process with attention to potential drug targets: https://www.nature.com/articles/s41401-020-0485-4