Evolution of DNA - Improved Gene Regulation

We eukaryotes have large cells that include all sorts of organelles (cell structures). We live long and varied lives, and frequently assemble into multi-cellular organisms. Our metabolism is regulated up the wazoo, with gazillions of control proteins and clever messenger molecules, ducts and vacuoles, chains of enzymes and pretty much you name it. We are definitely the Swiss Army knives of living organisms.

Just like rich people anywhere, we eukaryotes bring in hired servants to do our dirty work. Early on we absorbed some smallish bacteria to handle our basic metabolism (now called mitochondria), and the more plant-like among us absorbed some cyanobacteria to photosynthesize (now called chloroplasts). We'd rather spend our evolutionary time on more sophisticated things like Golgi bodies, eye spots and big, roomy brains.

As you might expect, the higher level of structural organization means that we eukaryotes also push Foxy to its limits, and consume large quantities of genetic data to specify the complex ways we are put together (only about 1.5% of the human genome codes for proteins).

Because of our increased levels of detail, we eukaryotes have developed a few additional systems for regulating genes. Let's take a look at them now, from a Foxy point of view.

More Organized DNA

As eukaryotes became more complex, their DNA also grew in length. For example, a typical yeast (one of the smallest eukaryotes) carries about three times as much DNA as a typical bacterium (about 12 million base pairs vs about 4 million).

The longer strand would have been more fragile, and any regulator protein that needed to find a specific gene would have had to spend about three times as long looking for the right ID sequence.

Clearly, the cells needed a better way to organize all that extra information. You might say that gene ID provides a 'Dewey Decimal System' for the genome, but it was now time to build it into a library and put all that info onto shelves.

DNA Wrappers

Modern eukaryotes wrap their DNA around histone proteins, and it seems likely that system came into use a very long time ago. The main evidence for that is that histones are among the most highly conserved proteins in our genome, meaning that they have an extremely similar amino acid sequence in all modern eukaryotes.

Each 'wrapped' unit of DNA is called a nucleosome-- it has 147 base pairs wrapped around a core of histone proteins, then 20 to 60 base pairs in a relatively open stretch that connects to the next nucleosome. The 'linker' region is protected by a separate linker histone, but presumably it is more open to access by regulator proteins. The open spots would be a great place to put any ID sequences (although there doesn't seem to be any evidence in the scientific literature as to whether that actually happens).
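
As a rough back-of-the-envelope check on those numbers, the sketch below estimates how many nucleosomes it would take to package a yeast-sized genome. It uses the 147 base pair wrap and the 20 to 60 base pair linker quoted above, plus the 12 million base pair yeast figure from earlier in this chapter. The function name is made up for illustration, and the calculation assumes evenly spaced nucleosomes, which real chromatin certainly does not have.

```python
WRAPPED_BP = 147  # base pairs wound around each histone core (figure from the text)

def estimate_nucleosomes(genome_bp, linker_bp):
    """Rough nucleosome count, assuming uniform spacing (a simplification)."""
    repeat_length = WRAPPED_BP + linker_bp
    return genome_bp // repeat_length

yeast_genome_bp = 12_000_000  # approximate yeast genome size used in this chapter

# Linker range of 20 to 60 base pairs, as quoted above
fewest = estimate_nucleosomes(yeast_genome_bp, linker_bp=60)
most = estimate_nucleosomes(yeast_genome_bp, linker_bp=20)
print(f"Roughly {fewest:,} to {most:,} nucleosomes")
```

Under those assumptions the answer lands somewhere between about 58,000 and 72,000 nucleosomes, which gives a feel for how many linker regions a gene-finder protein would have to choose from.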

Linkages between the nucleosomes compact the DNA chain still further. The DNA remains in the compacted state until it is replicated ahead of cell division, or until some activator protein decides to untangle a gene so it can be expressed via RNA polymerase.

To help identify individual genes, it seems likely that there would have been an ID sequence external to the main strand, or else markers on ID sequences within the chromosome to help an incoming gene-finder protein to locate the ID sequence.

A study of current scientific articles did not turn up any references to short RNA chains associated with histones or transcription factors in the nucleus (as would probably be the case if there were an external ID sequence). However, the nucleosomes reside in an environment that is rife with loose RNA, and a few more short RNA chains would hardly be noticed.

Chromosomes

Eukaryotes usually split their DNA among several different chromosomes. Each chromosome is located in its own physical region within the nucleus.

It appears that related genes on different chromosomes frequently end up close to each other within the nucleus-- so there is probably some sort of management of the physical position of each chromosome and each gene within the nucleus (an ideal system to have under scripted control).

The DNA in some portions of the chromosome is densely packed into 'heterochromatin', which generally contains genes that are currently inactive. The cell marks inactive genes by methylating the cytosine nucleotides in certain parts of the gene sequence, by methylating a lysine residue in the histone complex, and by binding heterochromatin protein 1 (HP1), which blocks access by the transcription factors that help initiate the creation of mRNA.

More active parts of the chromosome are called 'euchromatin'. It is more loosely packed, with decreased methylation. Euchromatin is frequently located near the nuclear pores, so the mRNA created from it can be delivered more easily to the rest of the cell.

Centromeres

With multiple strands of DNA, eukaryote cells returned to an old problem-- they now had to worry about getting one copy of each chromosome into each daughter cell, every time a cell divided.

The solution was the elegantly complicated process of mitosis, where the chromosomes replicate, compress into compact strands, and then separate into each daughter cell with the help of contractile spindle fibers.

Each chromosome contains a centromere, a specialized portion of the DNA sequence that is designed to connect to the spindle fibers.

Since mitosis is a process where location and timing are very important, it seems an ideal candidate for scripted control via some Foxy proteins and scripts.

The Nucleus

The most obvious difference between prokaryotes and eukaryotes is the nucleus-- a compartment enclosed by its own membrane, which separates the DNA from the rest of the cell.

In modern eukaryote cells, the nuclear membrane is a tight barrier with 'gated' pores which regulate the passage of RNA, proteins and other materials flowing into and out of the nucleus.

That might seem like a physical barrier that would slow down the creation of proteins from mRNA. However the pores are large and the distance is not that far-- since DNA that is actively being transcribed to mRNA is usually located just inside of the pores, and ribosomes that translate the mRNA into proteins are located just outside the pores.

Within the nucleus are several sub-structures, including the nucleolus (where ribosomes are synthesized from RNA genes), separate regions for each chromosome, and 'lanes' where RNA and other nuclear contents can quickly diffuse or flow between different parts of the nucleus.

Of course, the ideal way to control all of the details of the nuclear layout would be by Foxy scripts.

Helper RNA

We mentioned earlier that much of the early need for 'helper RNA' would have disappeared, once cells could build proteins from a full range of amino acids. The aromatic nucleotides in RNA were no longer necessary for enzyme action, once proteins could include aromatic amino acids to do the same thing. And the positioning chains that helped to assemble several small enzymes together would have been less vital, once cells could assemble large enzymes from thousands of amino acids, and combine all the catalytic needs in one enzyme.

The 'keep it simple' bacterial cells gradually lost introns, as the helper chains they carried were no longer needed.

However, eukaryotes actually added more introns instead of losing them (they now average about 7 introns per gene). Most likely that was because they needed new types of helper chains to help manage the complexities of cell structures and cell metabolism.

The new style of helper RNA might have acted as a Foxy script, or as a 'gene ID' marker for a transposon that fetched a Foxy script. It might also have acted as a guide to tertiary folding of proteins, or coordinated with other genes with the help of a gene ID sequence. It's also likely that other functions arose for intron chains.

Improved Introns

We've already talked about some of the different methods that early organisms used to remove helper RNA from the main protein-coding strand of mRNA. Using self-splicing introns and transposons to remove helper RNA from genes was an elegant system, plenty good enough for the relatively small and simple organisms of 3 billion years ago (and still good enough for modern prokaryotes).

However, as eukaryotes developed more sophisticated forms of helper RNA, they also needed a better way to manage them. The 'self popping' intron system was just too simple, and they needed a more talented way to deliver each length of helper RNA to its proper location.

The evolutionary answer was a 'manager complex' to pre-process messenger RNA and remove introns while still in the nucleus. It's called a 'spliceosome'. Spliceosomes are an interesting mixture of proteins and enzymatic RNA chains (some of which come from introns themselves!)

When eukaryotes are ready to synthesize a protein, they start by copying a strand of messenger RNA (mRNA) from the master DNA gene. The spliceosome then reads along the RNA strand while it is still in the nucleus, cuts out each intron, and wraps it into a lariat shape that is similar to the one formed by Group II self-splicing introns.

The end result is that eukaryote introns do not need to be 'self-splicing'. The spliceosome recognizes them, and sends them on their way with whatever processing they may require.
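
To make the cut-and-join step concrete, here is a toy sketch of what splicing does to a strand of pre-mRNA. It is only an illustration: the sequence is invented, the function name is hypothetical, and the intron positions are simply handed in as a list, whereas a real spliceosome has to recognize the splice sites for itself.

```python
def splice(pre_mrna, introns):
    """Cut the given (start, end) intron spans out of a pre-mRNA string.

    Returns the joined exons and the list of excised introns.
    Spans are 0-based, end-exclusive, and assumed not to overlap.
    """
    exons, removed = [], []
    position = 0
    for start, end in sorted(introns):
        exons.append(pre_mrna[position:start])  # keep the exon before this intron
        removed.append(pre_mrna[start:end])     # set the intron aside
        position = end
    exons.append(pre_mrna[position:])           # trailing exon
    return "".join(exons), removed

# Invented sequence: upper case marks exons, lower case marks introns.
# Like most real introns, these begin with "gu" and end with "ag".
pre_mrna = "AUGGCC" + "guaaguuuucag" + "UUUGGA" + "guacgcccag" + "CCCUAA"
introns = [(6, 18), (24, 34)]

mature, removed = splice(pre_mrna, introns)
print(mature)   # the exons joined end to end
print(removed)  # the introns, free to be sent wherever they are needed
```

The function hands back the removed introns as well as the finished message, since in the picture painted here the introns are not thrown away but sent off to do other jobs.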

Some scripts might be delivered to Foxy-based gene regulator proteins within the nucleus. Others might be attached to the complex containing the remaining exon portions of the mRNA chain, and transferred out of the nucleus to a ribosome so the mRNA can be translated into an actual protein (with associated helper chains).

The whole process of spliceosome formation and its interaction with the gene is still poorly understood. It's possible that each type of intron includes a marker sequence that tells the spliceosome what to do with it-- send it off to fetch a transposon, link it to the protein as a folding aid, or whatever.

Alternate Splicing

Once spliceosomes had control over introns, they provided at least one other advantage, particularly for more complex eukaryotes.

Because the protein-coding portions (exons) of most genes were already interrupted with several introns, it would have been possible to treat the different exons as separate 'building blocks', and assemble different combinations of them into different arrangements that might produce two or more functional enzymes from the same gene.

That notion would be particularly useful if each exon consisted of a 'functional group' that had some sort of useful, generic property on its own. That would make enzyme assembly not unlike building the same stock parts into different cars.
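
The 'stock parts' idea is easy to put numbers on. The sketch below lists every transcript a gene could produce under one simple, assumed rule: the first and last exons are always kept, each exon in between is either kept or skipped, and the original order is preserved. The exon labels and the function name are hypothetical.

```python
from itertools import combinations

def exon_skipping_variants(exons):
    """List every transcript that keeps the first and last exon and
    keeps or skips each internal exon, preserving the original order."""
    first, *internal, last = exons
    variants = []
    for keep_count in range(len(internal) + 1):
        for kept in combinations(internal, keep_count):
            variants.append([first, *kept, last])
    return variants

exons = ["E1", "E2", "E3", "E4"]  # hypothetical exon labels
variants = exon_skipping_variants(exons)
for variant in variants:
    print("-".join(variant))
print(len(variants), "possible transcripts")
```

Each internal exon doubles the count, so a gene with 7 introns (and therefore 8 exons) would allow 64 combinations under this rule alone, although nothing says a cell would actually use more than a few of them.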

Combining more than one protein in a single gene has some advantages-- it reduces the amount of genetic material needed, and reduces the number of places a lethal mutation might occur.

It also has a serious disadvantage-- an evolutionary change in the gene will affect several proteins at once. That might mean that a positive change for one protein would be a negative change for another one coded by the same gene. Such a linkage could seriously slow down the pace of evolution in that species.

Other Intron Functions

Eukaryote genes contain an average of about 7 introns. Each gene also includes various regulator areas before and after the gene itself.

We have already mentioned several uses for RNA segments coded from intron DNA-- as guidance for tertiary folding, as a way to link multiple enzymes, as a way to link to other genes, as scripted data for use by Foxy proteins, and as a way to code multiple proteins on one gene.

It's certainly possible that introns serve even more functions that have evolved since eukaryotes became more complex.

Custom Genes

It's even possible that some Foxy derivative played a role in the evolution of DNA genes themselves.

There are two places this might happen: one very practical, and one very far-fetched.

Exon Management

The first place where scripted 'gene management' would be extremely plausible is in the arrangement of exons into multiple enzymes (discussed earlier under 'Alternate Splicing').

Let's imagine a protein (we'll call it Doxy) which reads a script with its elbow, and then positions RNA fragments with its knee. As a spliceosome chops out the introns from a gene, Doxy could read a script, and position each exon into a different part of the final mRNA sequence (or perhaps skip it entirely).

That way, one gene could create several different proteins by simply rearranging the different parts of the polypeptide chain.
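
Since Doxy is imaginary, the following is pure illustration, but it shows how little a 'script' would need to say. Here the script is assumed to be nothing more than a list of exon numbers in the order they should be joined, and leaving a number out skips that exon. The sequences and the function name are invented.

```python
def apply_script(exons, script):
    """Assemble an mRNA from exons in the order the script lists them.

    `script` is a list of exon indexes; leaving an index out skips that exon.
    """
    return "".join(exons[i] for i in script)

exons = ["AUGGCU", "GAACUG", "UUUGGA", "CCAUAA"]  # invented sequences, 6 bases each

original = apply_script(exons, [0, 1, 2, 3])  # exons in their original order
swapped = apply_script(exons, [0, 2, 1, 3])   # the two middle exons trade places
skipped = apply_script(exons, [0, 2, 3])      # the second exon is left out
print(original, swapped, skipped, sep="\n")
```

Three different messages come out of the same four exons, and the only thing that changed between them was a short list of numbers.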

As with all of the scripted evolution, it would probably be quicker to re-arrange genes into interesting new combinations, rather than wait for amino acid mutations to accomplish the same thing.

It's likely that most exon-arranging would put the same amino acid sequences into different orders (to accomplish that, each exon would need to contain a number of base pairs that was evenly divisible by 3). However, to be very fancy, an exon change might introduce a frame shift (moving the reading frame by one or two bases) and completely change the amino acids that were included.
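
The divisible-by-3 rule comes down to a remainder. As a minimal sketch, with made-up exon sizes and a hypothetical function name, this works out how far the reading frame slips for everything downstream when one or more exons are left out.

```python
def frame_after_skipping(exon_lengths, skipped):
    """Return the reading-frame shift (0, 1 or 2 bases) caused downstream
    when the exons at the given indexes are left out."""
    missing_bases = sum(exon_lengths[i] for i in skipped)
    return missing_bases % 3

exon_lengths = [120, 87, 100, 150]  # hypothetical exon sizes in bases

safe_skip = frame_after_skipping(exon_lengths, skipped=[1])   # 87 is divisible by 3
risky_skip = frame_after_skipping(exon_lengths, skipped=[2])  # 100 is not
print(safe_skip, risky_skip)
```

A shift of 0 means the remaining exons are still read as the same codons. A shift of 1 or 2 means every codon after the gap is read differently, which is the 'very fancy' case.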

Experimental Enzymes

Now it's time to fly totally into the Twilight Zone. Let's imagine a version of Doxy that reads a script, and assembles intron or transposon fragments into entirely new gene sequences-- and then uses a reverse transcriptase to convert them back to DNA and add them to the genome.

It's a way to create entirely new genes, in a way that may be significantly more successful than waiting for random mutations to occur. The new genes probably would have an increased chance of doing something 'interesting', since they already included working components from other genes, just in a scrambled order.

Sure, any organism doing that would have a significant chance of creating some kind of deadly protein. But then again, it's a risk that any organism takes when it allows any type of change to occur in its genes, whether by accidental mutations or deliberate ones.

As with any gambling strategy, there is always a chance that the risky genetic move would create some tremendous new property that would allow that organism to explode into a huge population. That might be a worthy risk to take under adverse conditions, or just when a cell was feeling lucky.

A Doxy-based system of evolution might speed up the evolution of new genes enormously, as compared to plain old natural mutations. Of course some random selection mechanism for choosing gene fragments might also be successful, but adding a script to the mix might just allow for the evolution of a particularly successful selection mechanism that would have a higher than usual degree of success.

Evolution could have happened perfectly well without Doxy's help, since crossing over and transposons are also available for mixing gene components into interesting new combinations. For that reason we won't consider it any further when talking about later evolutionary history. Just the same, the possibility of 'self programming' genetic scripts is very interesting, and it would certainly add some curious twists to the path of evolution.