Wednesday, September 25, 2013

RNA Takes First Place

Biology concepts – nucleic acids, DNA, RNA, central dogma of molecular biology, ribozyme, RNA world hypothesis


The Library of Congress in Washington DC was designed as a 
showplace as well as a repository. The main reading room looks as
much like a museum or a cathedral as it does a library. If I could
figure out how to get away with it, I would live in the LOC.

Did you know that there are more than 155.3 million informational items (books and such) in the Library of Congress? Established in 1800 with 3000 volumes, the library was originally housed in the Capitol Building. Unfortunately, all the books were lost when the British burned Washington in 1814. No worries: the LOC then purchased Thomas Jefferson’s personal library of nearly 6500 books and set up shop in a new building, although not the 1892-designed library that exists today (left).

In a way, you can think of the molecular workings of the cell like the Library of Congress. You need information storage – these are the books. Each book (a chromosome or part of a chromosome) contains the instructions (genes) needed to make the products (proteins) the cell may need.

Each time you want to make a certain molecule, you must consult the book (chromosome) that has the correct instruction page (DNA gene). But you may be making many copies of your product in a short period, so one book might not be enough.

You could keep many copies of each book, maybe thousands, but this would take up too much room. The LOC already covers 2.1 million sq. feet (and that’s just one main building). What if you needed 1500 copies of One Good Turn (an interesting book about the history of the screw and screwdriver) because at some time or another, 1500 people wanted to learn how to build a square screwdriver?

To avoid this need for extra space, you make copies of pages (mRNA) from the books (chromosomes) that can be taken out of the library (nucleus) and used for making the products. Each time you want a product, a translator (tRNA and ribosome) must be used. This converts the copied instructions (mRNA) into a usable product (protein).
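For readers who like to see the analogy run, here is a minimal Python sketch of copying a page and then translating it. It is a toy model under stated assumptions: the 15-base "gene" is made up, the codon table is deliberately partial (only the codons this example needs), and the function names are illustrative, not from any library.

```python
# Toy model of the library analogy: copy a page (transcription),
# then convert the copy into a product (translation).
# Assumption: this codon table is partial; it covers only this example.
CODON_TABLE = {
    "AUG": "M",                                       # start codon, methionine
    "UUU": "F", "UUC": "F",                           # phenylalanine
    "GGU": "G", "GGC": "G", "GGA": "G", "GGG": "G",   # glycine
    "AAA": "K", "AAG": "K",                           # lysine
    "UAA": "*", "UAG": "*", "UGA": "*",               # stop codons
}


def transcribe(coding_strand):
    """Return the mRNA matching a DNA coding strand (T becomes U)."""
    return coding_strand.upper().replace("T", "U")


def translate(mrna):
    """Read codons from the first AUG until a stop codon is reached."""
    start = mrna.find("AUG")
    if start == -1:
        return ""  # no start codon, so nothing is made
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE.get(mrna[i:i + 3], "X")  # X = not in toy table
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


gene = "ATGTTTGGCAAATAA"      # hypothetical gene on the "book" (chromosome)
message = transcribe(gene)    # the copied page (mRNA)
print(message, translate(message))  # prints "AUGUUUGGCAAAUAA MFGK"
```

Note that the original `gene` string is never altered; only the copy is read, which is the point of the analogy.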

When one or several translations have been made, the copied instructions start to tear and get worn, and finally break down. Good thing we still have the original copy of the book stored in the nucleus… I mean library. We can go back and make more copies later if we need them. Humans are amateurs; we only have about 25,000 sets of instructions stored in 46 books, nowhere near the 155.3 million items of the LOC.


The central dogma of molecular biology says that DNA is replicated to
DNA, so daughter cells get a full set of instructions. DNA is also
transcribed to mRNA, which is a copied message of the instructions to
build one protein. Finally, the mRNA acts as a code that is translated
into an amino acid polymer – a protein. HIV and other retroviruses
laugh at the central dogma, going the opposite direction, RNA to
DNA. Retrotransposons laugh at HIV, as they can do all that and more.

Cells take this library/nucleic acid analogy further. Sure, they have DNA, mRNA, and tRNA so that they can carry out the central dogma of molecular biology: DNA goes to mRNA goes to protein (via tRNA and rRNA). But they have so much more. Just as there are many kinds of information storage at the LOC (books, images, recordings, manuscripts, pamphlets), there are different kinds of nucleic acids as well.

Ever hear of small nuclear RNAs, or micro RNAs, or plasmid DNAs for that matter? We have talked about plasmids as extrachromosomal pieces of DNA that can carry genes, especially antibiotic resistance genes in prokaryotes.

But the list of RNAs is far more impressive. There are regulatory RNAs that control gene expression (whether or not a protein is made from a gene), RNAs that control modification of other RNAs or work in DNA replication. There are even RNAs that are parasitic, like some viral genomes (RNA viruses) and retrotransposons.

Of these, retrotransposons may be the most interesting. A transposon is a piece of DNA that can jump around from place to place in the chromosomes of a cell. Barbara McClintock won a Nobel Prize for discovering transposable elements, which she showed were responsible for the different kernel colors in maize.


Ancient viral RNA got inserted into plant and animal genomes. The
retrotransposon can be transcribed to mRNA, and then could be
reverse transcribed back into DNA or translated into protein. The
DNA can then insert itself anywhere in the genome. Since several
mRNA transcripts can be made from one transcribed retrotransposon,
and since several pieces of DNA can be reverse transcribed from just
one mRNA, we have the potential for millions of retrotransposons in
the genome – and that’s exactly what we have found. The bottom
cartoon shows HIV. Since reverse transcription makes more mistakes
than DNA replication, many more mutants can be produced. This is
one reason HIV is so hard to treat – it’s always changing.

In the library analogy, retrotransposons fill the shelves with hundreds of copies of themselves. If plant nuclei were like libraries, up to 80% of their book pages would be retrotransposons!

In and of themselves, retrotransposons represent an exception in nucleic acids. They are mRNA sequences that can turn back into DNA. Transcription is the process of using DNA to produce an mRNA, so going the opposite direction is called reverse transcription. This is also what retroviruses like HIV do.

In the case of retrotransposons, the copies held in the chromosome will be transcribed to mRNA, and some of those transcripts might be translated into protein. Other transcripts will be reverse transcribed back to DNA by an enzyme called reverse transcriptase and will insert themselves somewhere in the genome (see picture).

In this way, retrotransposons can make more copies of themselves and end up all over the chromosomes of the organism. Mutation occurs at a higher rate in reverse transcription than in DNA replication because reverse transcriptase makes more mistakes than replication enzymes. This is why HIV is so hard to treat; it mutates so often that drug design can’t keep up with the changes in the viral proteins.
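To see how a sloppier copying enzyme produces more mutants, here is a small Python sketch of reverse transcription with a per-base error rate. Everything numeric in it is an assumption for illustration: the template sequence is made up, and the two error rates are hypothetical values, not measured rates for any real enzyme. The function names are illustrative as well.

```python
import random

# Base-pairing rules for building DNA from an RNA template.
RNA_TO_DNA = {"A": "T", "U": "A", "G": "C", "C": "G"}


def reverse_transcribe(rna, error_rate, rng):
    """Build the complementary DNA, miscopying each base with probability error_rate.

    The result is written aligned to the template (position i pairs with
    position i) so the two strands are easy to compare by eye.
    """
    cdna = []
    for base in rna:
        correct = RNA_TO_DNA[base]
        if rng.random() < error_rate:
            # A copying mistake: insert any base except the correct one.
            cdna.append(rng.choice([b for b in "ACGT" if b != correct]))
        else:
            cdna.append(correct)
    return "".join(cdna)


def count_mismatches(a, b):
    """Count positions at which two equal-length strands differ."""
    return sum(x != y for x, y in zip(a, b))


rng = random.Random(42)                  # fixed seed so runs are repeatable
template = "AUGGCUUAC" * 100             # hypothetical 900-base RNA
perfect = reverse_transcribe(template, 0.0, rng)

# Hypothetical error rates, chosen only to show the contrast.
careful = reverse_transcribe(template, 0.0001, rng)
sloppy = reverse_transcribe(template, 0.01, rng)
print(count_mismatches(perfect, careful), count_mismatches(perfect, sloppy))
```

Each pass through a sloppy enzyme adds new mistakes on top of the old ones, so a sequence that is copied again and again drifts quickly.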

So how can the same mRNA sometimes be translated, and other times end up in a new place on the DNA? A 2013 study has investigated how one type of retrotransposon manages these different outcomes. The BARE retrotransposon of plants has just one coding sequence for a protein, but the study results show that it actually makes three distinct mRNAs from this one piece of DNA.


Sam Kean is the author of The Violinist’s Thumb, a very readable
book on molecular biology. He goes through how fruit flies were
recruited to disprove DNA heredity and ended up as the strongest
evidence for it; how DNA is linked very strongly to linguistics and
math; and how Stalin tried to breed a race of half human - half
chimps. This is in addition to showing how most DNA on Earth is
descended from viruses.

One transcript (mRNA) is modified so it can be translated but cannot be reverse transcribed. The second transcript is packaged in small bundles to be reverse transcribed later back to DNA. The third transcript type is smaller and actually houses the bundles of mRNAs to be reverse transcribed. So this retrotransposon balances itself between making protein and inserting itself into new places in the genome.

If plants have so much nucleic acid in the form of retrotransposons, could these be the remnants of ancient viral infections? You betcha, and it doesn’t stop with plants. In his fascinating book, The Violinist’s Thumb, Sam Kean lays out a compelling argument that most human DNA is actually just viral nucleic acid remnants, much of it being mutated versions of old RNAs.

Old RNA is probably the best way to describe all nucleic acids, because the generally accepted view of the evolution of life on Earth is that everything started with RNA. This is called the RNA world hypothesis, and it holds that the job DNA does now was first done by RNA.

The hypothesis also says that what protein enzymes now do (cutting things up, putting things together, and modifying existing structures) was originally done by RNAs as well, called catalytic RNAs.

We have evidence for this hypothesis; specifically, we know of many RNAs that have enzymatic activity. Called ribozymes (a cross between ribo for RNA, and zyme for enzyme), these RNAs carry out enzymatic roles in our cells and in the cells of every eukaryote and prokaryote ever analyzed for their presence.


Ribozymes, a form of catalytic RNA, are present in most cells. They come
in two flavors based on what someone thought their secondary structure
looked like – the hammerhead or the hairpin. Scientists aren’t the most
imaginative when it comes to naming things. They both sit down on an
RNA where they recognize their specific sequence, and make a cut in the
strand. In the cartoon, N stands for any nucleotide, and X stands for
unknown. On the right side is a diagram showing how one ribozyme can
act again and again to cleave RNAs.

So now we are aware of two exceptions when it comes to the central dogma of molecular biology and RNA – 1) RNA can be converted back into DNA and 2) RNA can act like a protein enzyme.

One essential ribozyme function is the synthesis of protein. The ribosome (a ribonucleoprotein, because it is made up of RNAs and many proteins) translates the codons of mRNA into a sequence of amino acids. It uses its own RNA to link the individual amino acids together via peptide bonds. I’d say that’s essential.

Other ribozymes work on themselves. Many mRNAs, when first copied from DNA, have sequences within them that are not used in the final product. These are called intervening sequences (or introns), and they are cut out (spliced) as part of transcript processing. Group I and II introns are self-splicing. They fold over on themselves and cause their own excision from the RNA of which they are part!
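The bookkeeping of splicing (cut out the introns, join what is left) can be sketched in a few lines of Python. This is only the outcome, not the chemistry: a real self-splicing intron finds its own boundaries by folding, whereas this sketch is simply handed the coordinates. The sequence, the coordinates, and the function name are all made up for illustration.

```python
def splice(pre_rna, introns):
    """Remove each (start, end) intron, end exclusive, and join the exons.

    Assumption: the introns do not overlap and the coordinates are given
    to us; a self-splicing intron would locate them by its own folding.
    """
    exons = []
    position = 0
    for start, end in sorted(introns):
        exons.append(pre_rna[position:start])  # keep the exon before this intron
        position = end                         # skip over the intron itself
    exons.append(pre_rna[position:])           # keep the final exon
    return "".join(exons)


# Hypothetical transcript: exons in upper case, one intron in lower case.
pre_rna = "AUGGCU" + "guaaguuuucag" + "UACUAA"
mature = splice(pre_rna, [(6, 18)])
print(mature)  # prints "AUGGCUUACUAA"
```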

Group I introns can be found in the mRNAs, rRNAs, and tRNAs of most prokaryotes and lower eukaryotes, but the only places we have found them so far in higher eukaryotes are the introns of plants and the introns of mitochondrial and chloroplast genomes. Yet more evidence for the plastid endosymbiosis hypothesis.

If the RNA world hypothesis is to be strengthened, we must find a catalytic RNA that can replicate long strings of RNA “genes.” If RNA was both the storage material and the enzymatic material, there must have been an RNA-dependent RNA polymerase that was itself a piece of RNA. An RNA replicase has not been found in nature, probably because life moved on to using DNA as the long-term repository of genetic information. But we should be able to make an RNA replicase as a proof of concept.


The RNA world hypothesis is an idea of how early life on Earth transmitted
information and carried out functions. RNA did everything: it stored information,
replicated itself, and carried out enzymatic activity. A – E represent a
possible sequence, although no times can be assigned yet. According to this
theory – the last thing that developed was enzymatic proteins – but new
evidence suggests that proteins were important for the development of
tRNAs so they must have been around earlier. Step B is an area of interest,
as scientists are trying to make an RNA that could replicate any RNA, even itself.

A few ribozymes can polymerize a few nucleotides into short RNAs. The problem is that we need to show that there is an RNA that could replicate long strings of RNA that could then go on to have biological function. Until 2011, the best we’d produced was a ribozyme (called R18) that could polymerize just 14 ribonucleotides.

Then a study was published showing that a modification of R18 could synthesize much longer strings and could replicate many different RNA templates. In this publication, the authors could synthesize ribonucleic acids of 95 bases, almost as long as the R18 replicase itself. Another study has shown that some catalytic RNAs can self-replicate at an exponential rate, making thousands of copies of themselves while still having catalytic function.
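A quick bit of arithmetic shows why "exponential" matters here. The Python sketch below assumes, purely for illustration, that the number of copies doubles every round; the real fold-increase per round is not given in this post, so treat `fold_per_round` as a hypothetical parameter.

```python
def copies_after(rounds, start=1, fold_per_round=2):
    """Number of copies after a given number of replication rounds."""
    return start * fold_per_round ** rounds


def rounds_to_reach(target, start=1, fold_per_round=2):
    """Smallest number of rounds needed to reach at least `target` copies."""
    if fold_per_round <= 1:
        raise ValueError("fold_per_round must be greater than 1")
    copies, rounds = start, 0
    while copies < target:
        copies *= fold_per_round
        rounds += 1
    return rounds


# Assumption: doubling each round, starting from a single molecule.
print(rounds_to_reach(1000))   # prints 10
print(copies_after(10))        # prints 1024
```

Under that doubling assumption, one molecule passes a thousand copies in just ten rounds, which is why a self-replicating RNA can take over a test tube so quickly.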

It seems that the RNA world hypothesis is getting stronger, but there remain some hurdles.

A July 2013 study shows that primitive protein enzymes (called urzymes, where ur = primitive) activate tRNAs much faster than do ribozymes. These primitive proteins date to before the last common ancestor, so they have been around nearly as long as life itself. tRNA urzymes suggest a tRNA-enzyme co-evolution, providing evidence that catalytic proteins and the conventional central dogma were important in early life – a result that does not support the RNA world hypothesis. I’m glad – the hunt goes on.

In the next weeks, let’s take a look at nucleic acid structures and their building blocks. Think DNA is double stranded? – not always. Think A, C, G, T, and U are the only nucleotides life uses? – not even close.



Chang W, Jääskeläinen M, Li SP, & Schulman AH (2013). BARE Retrotransposons Are Translated and Replicated via Distinct RNA Pools. PLoS ONE, 8 (8) PMID: 23940808

Li L, Francklyn CS, & Carter CW (2013). Aminoacylating Urzymes Challenge the RNA World Hypothesis. The Journal of biological chemistry PMID: 23867455

Ferretti AC, & Joyce GF (2013). Kinetic properties of an RNA enzyme that undergoes self-sustained exponential amplification. Biochemistry, 52 (7), 1227-35 PMID: 23384307

