Evolution of DNA - Aromatic Assistants

Well, now that we have invented Cassius, it's time for a reality check. Unfortunately, Cassius has a very serious biochemical flaw, and the problem is so severe that Cassius probably will need more than just seven chains and seven proteins, in order to really build its own raw materials.

So far we have imagined that all of those enzymes were assembled from chains of just two kinds of amino acid, since that is all that Fred can manage to build.

Unfortunately, the notion of creating large numbers of functional proteins from just two amino acids is an extremely shaky bit of theory. There is a reason why modern life uses 20 different amino acids as the 'building blocks' within proteins. They represent a wide range of sizes, charges and chemical properties. That diversity gives polypeptides many chemical and structural options, and increases the chance that they'll be able to act in some useful way.

With a reasonable amount of luck, polypeptide chains with two amino acids could have built a Fred, Roscoe and Nathaniel, since they are all simple molecules that really don't need to do all that much. Fred and Roscoe could have managed their chain-reading and polymerizing functions just by taking advantage of the attractions and repulsions between hydrophobic and polar regions in the polypeptides, and in the molecules they affected. And Nathaniel was just an ordinary structural molecule that could be built from anything.

However, we can't expect entire life forms to arise from just leucine and glutamate, or whichever two amino acids happened to make up the first Fred. The four new enzymes that we need to create a self-sufficient Cassius were probably just not possible to build that way. In fact, it's quite likely that our two-amino-acid proteins couldn't manage any kind of serious enzyme action, since they weren't capable of performing three chemical actions that are very important to synthetic activity-- namely, moving electrons, donating protons, and providing energy.

Fortunately, there is an elegant solution, and any Caleb that stumbled upon it would have gained an enormous selective advantage over its cousins.

So far we have talked about aromatic chains such as Sofia and Sorrel as being strictly a passive 'blueprint' for the production of polypeptides or short proteins. Fred would 'read' each backbone molecule, and use it to code for a specific amino acid in a polypeptide. Basically, each chain served the same function as a modern gene, only with a reading frame of one molecule, rather than three.
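As a toy illustration of a one-molecule reading frame, the Python sketch below maps each chain molecule directly to one amino acid. The symbols 'X' and 'Y' for the two chain molecules, and their pairing with leucine and glutamate, are assumptions made up for this example; nothing here says which molecules were actually involved.

```python
# Hypothetical one-letter symbols for the two chain molecules (assumed),
# each coding for one of Fred's two amino acids (also an assumed pairing).
CODE = {"X": "Leu", "Y": "Glu"}

def transcribe(chain):
    """Read a chain one molecule at a time: one molecule, one amino acid."""
    return [CODE[molecule] for molecule in chain]

# A five-molecule chain yields a five-residue polypeptide.
print(transcribe("XXYXY"))
```

With a reading frame of one, the polypeptide is always exactly as long as the chain that codes for it, and a two-molecule chain can never specify more than two kinds of amino acid.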

However, aromatic chains contained clever molecules that could have also served some entirely different roles in the biology of our very early organisms. Let's look more closely at those other chain tricks, now.

Enzyme Chains

We've already mentioned how polypeptides create functional enzymes by positioning several amino acid molecules close together so they form an 'active group' which interacts with other chemicals.

With just two amino acids, Caleb only had so many options for 'designing' enzyme sites. However, it did have two additional chemicals nearby-- its aromatic chain molecules. The odds are good that they had some chemical properties that were not available in either of Caleb's amino acids. In fact, most aromatic rings are guaranteed to have some interesting chemical properties which the simpler amino acids could never have.

That whirling cloud of pi electrons shifts easily, which means that aromatic compounds like purines and pyrimidines are often very good at adding or removing electrons from other molecules. 'Electron management' is something that Caleb's two amino acids were missing, and it's usually a very important component in enzyme actions.

Many aromatic compounds are also good at shifting between two similar chemical states, which means they can temporarily borrow an atom or two from some other molecule, and then donate it at a new location. This 'proton management' is also an important part of many enzymatic reactions.

Aromatic chain molecules are also very good at temporarily storing energy, since their chemical bonds are already in a flexible, 'limbo' state, and it doesn't take much to shift them into more than one stable position. They can store a small jolt of energy for a while, and then release it to do something useful.

Using chain molecules as a component in enzymes definitely expanded Caleb's options, chemically speaking. A combination of chains and proteins was much more likely to produce successful metabolic enzymes.

For example, if Cassius happened to contain one of the purines or pyrimidines found in modern RNA, it could have used adenosine phosphates to store energy temporarily, cytosine as a proton donor, or guanine to help with electron transfers within the active group of an enzyme.

The chain molecules also could contribute their more rigid structure as a 'design tool' for building effective physical structures. Their rather inflexible shape would have been a good complement to the more floppy structure of the amino acids.

All in all, there would have been enormous advantages for Caleb to use aromatic chains directly as a component in its enzymes.

Protein-Chain Combos

How would a combined protein-chain enzyme work? Well, all Caleb really needed to do is link a chain to a polypeptide. For example, this illustration shows a central Fred-like polypeptide that is connected to a short 'helper' chain, which forms two 'wings' extending out on either side.

At the junction of the chain and the protein, there are places where the chemically active portions of the chain molecules and amino acids are close together, forming an 'active group', similar to the ones formed by a purely amino acid enzyme. In that region, there are four types of molecules that can work together-- at least doubling the possibilities for chemical actions.

In addition, the 'wings' might help the enzyme to guide precursor molecules into the enzyme, and then guide the finished product out. They might also interact with other portions of the polypeptide-- helping them to stay in a stable position, or shift conformations, or do something else that is interesting.

Protein-Chain Connectivity

A protein-chain complex would work fine as long as a short stretch of polypeptide can bond to a short stretch of chain. Fortunately, that is not a big challenge, since the amino acids in Fred and Roscoe already interacted well with chain molecules.

In fact, amino acids in general are happy to hang out with purines and pyrimidines. Cytosine (a nucleobase) bonds extremely well to threonine and serine (two simple amino acids), guanine bonds well with arginine and lysine, and thymine has an affinity for lysine. So it wouldn't have been hard at all to combine bits of chain into an enzyme.

It's quite possible that the first catalytic enzyme developed from a mutant form of Roscoe or Fred that just happened to grip a chain fragment more tightly than usual, and then used it as part of a synthetically active group of molecules.

Helper Chain Evolution

If a chain was useful to Caleb as a helper, selective pressure would have eventually added it as a standard part of the Caleb complex, just the same as a chain that created a useful polypeptide.

Roscoe would still copy the 'helper' chain, and Nathaniel would grab it, and connect it with the rest of the molecules. Any Caleb with that new helper chain would survive better, and increase in number, exactly the same as if the chain were a gene that created a useful new protein enzyme.

The first helper chains may have been entirely new aromatic chains, or they may have been portions of existing genetic chains like Sofia, Sorrel or Serena, that could serve two functions-- once as a carrier for protein structures, and once as a direct enzyme component.

Multiple Combinations

Combining a polypeptide with some helper chains would have offered an additional design advantage to a Caleb or Cassius. By attaching different chains to the same polypeptide, it is possible to create two entirely different enzymes, thanks to the different shapes and chemical properties of the two chain molecules.

For example, our sample combination protein might bind to a slightly different chain, and end up with entirely different enzymatic properties, because of the new aromatic chain molecules in its active group.

Creating multiple chain-based enzymes is 'cheap' in an evolutionary sense. Only a few chain molecules need to mutate to form a brand new enzyme, which is much faster than waiting for mutations to shape a new protein to accomplish the same thing.

In other words, evolution can occur directly in the helper chain molecules, without need to be processed through a protein.

Metabolic Efficiency

Using chains as part of enzymes was also metabolically more efficient, since there was no need to transcribe the chain molecules into a protein for them to work. It might only take a few chain molecules to add a property that would require dozens of amino acids-- so a chain-based Caleb could build more of itself from the same quantity of raw materials.

Saving a few molecules would have been extremely important for the earliest versions of Caleb, since they didn't yet have the ability to create any of their own ingredients. A 10% drop in molecules required meant roughly an 11% increase in the number of Calebs that could be produced from the stock of materials available in a micro-puddle.
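The arithmetic behind that trade-off can be checked with a short Python sketch. The stock size and the per-Caleb molecule counts are made-up numbers chosen only to keep the division clean. Cutting the cost per Caleb by 10% raises the yield by a factor of 1/0.9, so the gain comes out a little over 10%.

```python
def calebs_from_stock(stock, cost_per_caleb):
    """How many complete Calebs a fixed stock of molecules can supply."""
    return stock // cost_per_caleb

stock = 90_000                               # hypothetical molecules in a micro-puddle
baseline = calebs_from_stock(stock, 100)     # hypothetical cost: 100 molecules each
leaner = calebs_from_stock(stock, 90)        # the same Caleb built with 10% fewer molecules

gain = leaner / baseline - 1                 # fractional increase in Calebs produced
print(baseline, leaner, f"{gain:.1%}")
```

The saving compounds as it grows: halving the cost per Caleb would double the number that the same puddle could support.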

Chain-based enzymes would also be very advantageous if an evolving Caleb developed enzymes that created chain molecules before it had enzymes that created amino acids. Chain molecules would then be very abundant in the local puddle, while amino acids would be scarce, and any Caleb that could use chains instead of the more 'expensive' proteins would be more successful.

Tool Holder Proteins

Since aromatic chains were better suited for most types of synthetic chemistry, it's possible that early Calebs developed only a small number of simple 'tool holder' proteins to use as enzymes. Such a protein would have multiple binding sites for short chains, with the ability to hold them together so they could form a catalytically active group.

A Cassius could then use the 'tool holder' protein for multiple functions, simply by loading it with different chain sequences that had different chemical properties.

Pure Chain Enzymes

Could Caleb have used enzymes that consisted entirely of chain molecules? After all, that is an important part of the 'RNA world' theory.

Well, probably not. There are two serious problems with that notion.

First of all, purine and pyrimidine chains really like to be in a straight line or a helix, since they are much less 'bendy' than amino acid chains. Just picture a stack of coins that are stuck together, and how difficult it would be to bend it into a loop.

In order to hold several active groups together so they can act catalytically, there needs to be some way to force the chain to bend around, so several parts of the chain can be close enough together to form an 'active group'.

Modern RNA enzymes accomplish that by using complementary base pairing. Portions of the chain fasten together, which forces the remainder of the chain into tight bends. If the shape is just right, several chain molecules will end up close together, so they form an 'active group' that acts as an enzyme.

That's fine for RNA, since it's built from four molecules that match up very well in complementary pairs.

However, right now Caleb has genetic chains that are built from just two random molecules, and the fancy A-U and G-C pairing in RNA is not yet an option. That means that there's no easy way to force the chain molecules to produce anything other than straight chains.

Secondly, even if there was a way to force the aromatic chains into position, chain-based enzymes would need to be much longer than protein enzymes. Because of their rigidity, it takes more molecules before an active group can 'bend around' and meet up with another part of the chain. A functional protein/chain combination enzyme might be able to work with as few as 20 amino acid molecules and a few chain molecules, while an enzyme built entirely from a backbone chain would probably need 60 or 80 molecules to create the same action.

At this stage, Caleb was still very limited, with just a few basic enzymes. It was not very talented at creating long aromatic chains, yet. Using small proteins with helper chains was about the 'cheapest' way to accomplish what it needed.

Positioning Chains

Caleb could have also used its genetic chains for an additional function-- to position several proteins or protein fragments into larger functional groups.

That means that a short, aromatic chain could serve two more roles as a 'helper' to a protein:

1. Within a single protein, the chain could position the polypeptide into a particular tertiary structure and keep it there, with a limited range of movement due to the relative rigidity of the backbone chain.

2. If the chain were long enough, it might attach to more than one protein, and then position multiple enzymes into a relatively fixed orientation.

In this example, a short backbone chain (at the bottom) is holding three different polypeptide enzymes together. By positioning them exactly right, the chain might hold them in the correct position for some kind of activity that they couldn't manage on their own. It's the difference between a catalyst, and a supercatalyst.

Temporary Backbones

Backbone chains might have been a permanent part of enzyme complexes, but they might also have acted as temporary positioning aids during the synthesis of polypeptides and proteins.

For example, a backbone chain might have 'guided' the position of a polypeptide right after it was assembled by Fred, and helped it to fold into a specific tertiary structure that was enzymatically active.

A chain might also have linked a newly assembled protein to an existing enzyme, and guided them each into a specific orientation or connection with respect to each other.

Such temporary backbones could have attached temporarily during protein synthesis, and then been reused for the next synthesis.

There probably would have been a selective advantage to have them attached directly to Fred so they would be right there to manage Fred's output.

Positioning Evolution

The 'puddle evolution' that would have allowed Caleb and Cassius to develop more effective protein-coding genes would have worked just as well for chains that increased the efficiency of the products of existing genes, by positioning them properly.

Once a chain managed to produce a more effective enzyme by any means, the Caleb or Cassius that contained it would have a selective advantage, and the positioning chain would tend to expand in distribution just the same as any protein-coding chain.

In other words, short backbone chains can offer a level of organization with their own evolutionary advantages in the early soup. Fixing the position of enzymes is another useful piece of information that is just as beneficial to pass to the grandchildren as the structure of an enzyme.

In fact, since position-coding chains are shorter and simpler, they probably evolved more quickly than the protein-forming parts ever could. Some of the best new enzymes might have consisted of existing polypeptides packaged into a different configuration, and backbones would have provided that kind of evolutionary change very easily.

Chains and Fred

Using chains as a part of cell chemistry seems so attractive, that you might wonder whether Fred and Roscoe might have taken advantage of them, even back in their early days.

Well, maybe. The first Fred probably didn't use a helper chain, simply because it had no way to replicate chains, and couldn't create functional offspring if they also required a copy. However, once Roscoe came on the scene, helper chains would have been usable and reproducible. They may have then played a role in the production of better versions of Fred and Roscoe, in later generations.

Helper Gene Markers

Unfortunately, there is one major problem with the use of 'helper chains'. Whenever Fred transcribed those sequences into a set of amino acids, it wasted some resources, and ran the risk of creating a toxic protein-- since the sequence was not really meant to code for a protein.

Of course, it's possible that some chains might have been dual purpose-- useful on their own, and also producing useful proteins when transcribed by a Fred. It would be extra-efficient to have genes that could work that way, but relying only on dual-purpose chains would severely limit the number of useful helper chains that could evolve. It would be much better to simply have Fred avoid transcribing the various helper chains into proteins.

Some early organisms may have found some clever form of Nathaniel that would keep the helper chains close to a Roscoe, and away from a Fred. But that seems rather hard to arrange-- once a new chain was created, how would Nathaniel know how to treat it?

A better approach would have been to mark the direct chain molecules somehow, so Fred would never transcribe them. Then there was no risk of accidental proteins, period.

Gene Headers

We talked earlier about the need to improve Fred, by giving it a reliable way to start transcribing chains at their very beginning. It's possible that some lucky Cassius stumbled onto a solution to both problems at the same time, by adding a short 'marker' or 'header' to the beginning of each genetic chain.

Right now, we'll be a little vague about the composition of this marker, or how it would work. It may have been a specific sequence of chain molecules, or it may have been some other, entirely different compound that was attached at the 'start' end of the chain.

A Cassius that had different headers for regular genes (to be transcribed into proteins by Fred) and helper genes (to be copied by Roscoe and used directly as helpers) would have enjoyed all of the advantages of the helper chains, and none of the disadvantages. Roscoe would still duplicate both types of genes, but Fred would only transcribe from actual protein coding chains.
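The division of labor can be sketched in a few lines of Python. Everything concrete here is hypothetical: the header symbols 'P' and 'H', the chain symbols 'X' and 'Y', and the amino acid assignments are placeholders, since the composition of the marker is deliberately left vague.

```python
PROTEIN_HEADER = "P"   # hypothetical marker on protein-coding chains
HELPER_HEADER = "H"    # hypothetical marker on helper chains
CODE = {"X": "Leu", "Y": "Glu"}   # assumed one-molecule reading frame

def roscoe_copy(chain):
    """Roscoe duplicates every chain, header and all, whatever its type."""
    return chain

def fred_transcribe(chain):
    """Fred reads only chains with a protein header and skips everything else."""
    header, body = chain[:1], chain[1:]
    if header != PROTEIN_HEADER:
        return None    # helper chain: no accidental protein is produced
    return [CODE[molecule] for molecule in body]

for chain in ("PXYX", "HXYX"):
    print(chain, roscoe_copy(chain), fred_transcribe(chain))
```

Both chains are inherited through Roscoe, but only the one with the protein header ever costs amino acids, which is the whole point of the marker.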

The first 'gene marker' was probably very simple, but as we'll see later, it would gradually evolve into more and more sophisticated systems, as the needs increased.

In future chapters, we'll also see some further techniques that cells started to use, to distinguish protein-coding chains from the helpers. In fact, you might almost say that the management of 'other' DNA was the dominant issue for early organisms, at least in their first billion years, or so.